Today’s batch-job schedulers for HPC systems execute jobs with a static resource allocation over their entire runtime. Although this static resource assignment is omnipresent on HPC systems, it comes with two major drawbacks:
We suggest addressing these issues with applications that adapt to changing resources and a resource manager (RM) that optimizes the resource distribution towards improved application throughput. Although our framework is capable of being executed on distributed-memory systems, we only consider Invasive Computing for shared-memory systems here as a proof of concept. This was partly developed in collaboration with the research group of Prof. Dr. M. Gerndt, see e.g. [GHM+12].
In this work, each application is assumed to be parallelized with OpenMP or TBB. The required extensions for Invasive Computing on these parallelization models are discussed in Sec. 8.1. The invasive client layer offers the API to request changing resources and to update the resource requirements if requested by the resource manager, see Sec. 8.2. Scheduling of the resources is based on information provided by the invasive applications, with the scheduling decisions made by the RM, which is described in Sec. 8.3.