Today’s batch-job schedulers for HPC systems execute jobs with a static resource allocation over their entire runtime. Although this static resource assignment is omnipresent on HPC systems, it comes with two major drawbacks:
We suggest addressing these issues with applications that adapt to changing resources and a resource manager (RM) that optimizes the resource distribution towards improved application throughput. Although our framework is capable of being executed on distributed-memory systems, we only consider Invasive Computing for shared-memory systems here as a proof of concept. This was partly developed in collaboration with the research group of Prof. Dr. M. Gerndt, see e.g. [GHM+12].
In this work, each application is assumed to be parallelized with OpenMP or TBB. The required extensions for Invasive Computing on these parallelization models are discussed in Sec. 8.1. The invasive client layer offers the API to request changing resources and to update the resource requirements if requested by the resource manager, see Sec. 8.2. Scheduling of the resources is based on information provided by the invasive applications, with the scheduling decisions made by the RM, which is described in Sec. 8.3.