These html pages are based on the
PhD thesis "Cluster-Based Parallelization of Simulations on Dynamically Adaptive Grids and Dynamic Resource Management" by Martin Schreiber.
8.5 Invasive programming patterns
Using Invasive Computing on shared-memory systems requires implementing invasive interfaces at
particular positions in the code. This section presents the invasive programming pattern for
iteration-based DAMR simulations and shows the extensions required for simulations that rely on
cached thread associativity.
8.5.1 Iteration-based simulation
We assume that our application uses an iterative setup (initial refinement) and an iterative
time-stepping scheme (simulation) with the pseudo code given in the left column of Table
8.2.
Standard simulation steps:

    gridRefinement = True
    while gridRefinement:
        gridRefinement = setupGrid()
    for i in timesteps:
        timestepAndAdaptivityTraversals()

Invasive simulation steps:

    setup()
    gridRefinement = True
    while gridRefinement:
        invade_nonblocking([constr.])
        gridRefinement = setupGrid()
    for i in timesteps:
        (re)invade_(non)blocking([constr.])
        timestepAndAdaptivityTraversals()
    shutdown()
Table 8.2: Comparison of pseudo code for our simulation based on a time-stepping and setup
loop (left) with an invasified version (right). This invasified version only works for code without
an owner-compute scheme.
An extension with our Invasive Computing API then requires the following interfaces:
- setup():
The setup routine registers the program at the invasive RM. If no resource is
available, this call blocks until at least one resource can be assigned to the application.
- invade_(non)blocking(constraints):
Different variants from the invasive command space can be used to update the resources
while forwarding new constraints.
- reinvade_(non)blocking():
Different variants of reinvade are used to update the resources without changing
the already forwarded constraints.
- shutdown():
After the simulation is finished, the RM is informed about the shutdown of the application,
making the resources available to other applications.
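To illustrate how these four interfaces interact over the lifetime of a simulation, the following Python sketch models the RM as a simple in-process object. The class MockResourceManager, its methods, and the allocation policy are invented for illustration only; the real invasive RM is a separate system component, and setup() blocks rather than asserting.

```python
# Toy sketch of the invasive interfaces (setup, invade_nonblocking,
# shutdown); names and the allocation policy are illustrative
# assumptions, not the actual Invasive Computing API.

class MockResourceManager:
    def __init__(self, total_cores):
        self.free_cores = total_cores

    def register(self):
        # setup(): registration succeeds once at least one core is free;
        # here we assert availability instead of blocking.
        assert self.free_cores >= 1, "no core available"
        self.free_cores -= 1
        return 1  # number of cores assigned to the application

    def invade_nonblocking(self, owned, max_cores):
        # Try to acquire additional cores up to max_cores; never blocks,
        # so the application may receive fewer cores than requested.
        granted = min(max_cores - owned, self.free_cores)
        self.free_cores -= granted
        return owned + granted

    def shutdown(self, owned):
        # Release all cores back to the RM for other applications.
        self.free_cores += owned


rm = MockResourceManager(total_cores=4)
cores = rm.register()                          # setup()
for step in range(3):                          # time-stepping loop
    cores = rm.invade_nonblocking(cores, max_cores=4)
    # ... timestepAndAdaptivityTraversals() would run on `cores` threads ...
rm.shutdown(cores)                             # shutdown()
```

Placing the non-blocking invade at the start of each iteration mirrors the pattern in Table 8.2: the resource count may change between time steps, but a step itself always runs with a fixed set of cores.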
8.5.2 Iteration-based with owner-compute
Applying the Invasive Computing programming pattern to our owner-compute scheme (see
Section 5.6.1) can lead to invalid cluster-tree traversals. The reason lies in the caching
of the thread id range on each node owning one of the leaf nodes in the cluster tree.
With a dynamically changing number of active threads in a program, such a cached thread id can
refer to a thread which does not exist anymore. For a shrinking number of
resources, this leads to missing traversals of nodes in the cluster tree; for a growing number of
resources, one or more cores remain unused. Hence, we must additionally account for these
cached thread ids and show the required extension in the pseudo code presented in Table
8.3.
    setup()
    for i in timesteps:
        // returns the new number of threads
        new_number_of_threads = (re)invade_(non)blocking([constraints], postponeThreadUpdate)
        if new_number_of_threads < old_threads:
            // shrink
            updateCachedThreadIDs(new_number_of_threads)
            updateNumberOfRunningThreads(new_number_of_threads)
        elif new_number_of_threads > old_threads:
            // grow
            updateNumberOfRunningThreads(new_number_of_threads)
            updateCachedThreadIDs(new_number_of_threads)
        old_threads = new_number_of_threads
        timestepAndAdaptivityTrav()
    shutdown()
Table 8.3: Pseudo code for a changing number of threads with an owner-compute scheme.
Here, the function (re)invade_(non)blocking also returns the new number of threads, and the
additional constraint postponeThreadUpdate requests that the number of threads is not directly
updated by the invade call. Instead, an additional function updateNumberOfRunningThreads is
provided, which has to be executed to update the threads based on the last reinvade
call.
If the number of threads was decreased, we first update the cached thread ids in the
cluster tree (updateCachedThreadIDs) and then update the number of actively running threads,
including their pinning. For a growing number of threads, we can directly update the number of
running threads and afterwards update the cached thread ids.
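The point of this ordering is an invariant: no cluster may ever reference a thread id outside the currently running range. The Python sketch below models the cluster tree as a flat list of cached owner-thread ids; the class, method names, and the modulo remapping are hypothetical placeholders standing in for the thesis' updateCachedThreadIDs and updateNumberOfRunningThreads.

```python
# Toy model of the shrink/grow ordering. The modulo remapping is an
# invented placeholder policy; only the ordering of the two updates
# relative to the invariant is the point being illustrated.

class OwnerComputeState:
    def __init__(self, num_clusters, num_threads):
        self.running_threads = num_threads
        # each cluster caches the id of its owning thread
        self.cached_ids = [c % num_threads for c in range(num_clusters)]

    def _invariant(self):
        # No cluster may reference a thread that is not running.
        assert all(t < self.running_threads for t in self.cached_ids)

    def update_cached_thread_ids(self, new_threads):
        # Remap every cluster's owner id into [0, new_threads).
        self.cached_ids = [c % new_threads for c in range(len(self.cached_ids))]

    def update_number_of_running_threads(self, new_threads):
        self.running_threads = new_threads
        self._invariant()

    def resize(self, new_threads):
        if new_threads < self.running_threads:
            # shrink: remap cached ids first, then stop threads,
            # so no cluster ever points at a stopped thread
            self.update_cached_thread_ids(new_threads)
            self.update_number_of_running_threads(new_threads)
        elif new_threads > self.running_threads:
            # grow: start threads first, then remap cached ids,
            # so remapped ids always refer to running threads
            self.update_number_of_running_threads(new_threads)
            self.update_cached_thread_ids(new_threads)


state = OwnerComputeState(num_clusters=8, num_threads=4)
state.resize(2)   # shrink: no traversal of a cluster is lost
state.resize(6)   # grow: the new threads become usable
```

Swapping the two calls in the shrink branch would violate the invariant between the calls: clusters would still cache ids of already-stopped threads, which is exactly the missing-traversal problem described above.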