These html pages are based on the PhD thesis "Cluster-Based Parallelization of Simulations on Dynamically Adaptive Grids and Dynamic Resource Management" by Martin Schreiber.

8.5 Invasive programming patterns

Using Invasive Computing on shared-memory systems requires implementing invasive interfaces at particular positions in the code. This section presents the invasive programming pattern for iteration-based DAMR simulations and shows the extensions required for simulations that rely on cached thread associativity.

8.5.1 Iteration-based simulation

We assume that our application uses an iterative setup (initial grid refinement) and an iterative time-stepping scheme (simulation); pseudo code for this standard version is given in Table 8.2.




Standard simulation steps:

    gridRefinement = True
    while gridRefinement:
      gridRefinement = setupGrid();

    for i in timesteps:
      timestepAndAdaptivityTraversals();

Invasive simulation steps:

    setup();

    gridRefinement = True
    while gridRefinement:
      invade_nonblocking([constr.]);
      gridRefinement = setupGrid();

    for i in timesteps:
      (re)invade_(non)blocking([constr.]);
      timestepAndAdaptivityTraversals();

    shutdown();

Table 8.2: Comparison of pseudo code for our simulation based on a setup loop and a time-stepping loop (standard version) with an invasified version. The invasified version only works for code without an owner-compute scheme.
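The invasified loop structure can be sketched in runnable form. The invasive runtime itself (setup, invade_nonblocking, shutdown) is mocked here as a simple stub, since the real calls negotiate cores with a global resource manager; all class and parameter names in this sketch are illustrative, not part of the actual API.

```python
# Minimal sketch of the invasified simulation loop (Table 8.2, right side).
# MockInvasiveRuntime stands in for the invasive resource manager; a real
# runtime would grant cores based on system-wide load and the constraints.

class MockInvasiveRuntime:
    """Illustrative stand-in for the invasive resource manager."""
    def __init__(self, max_cores):
        self.max_cores = max_cores
        self.granted = 0

    def setup(self):
        self.granted = 1          # start with a single core

    def invade_nonblocking(self, constraints):
        # Non-blocking: returns immediately with the current grant,
        # here simply granting up to the requested maximum.
        self.granted = min(constraints.get("max_cores", 1), self.max_cores)
        return self.granted

    def shutdown(self):
        self.granted = 0          # release all resources


def run_simulation(runtime, num_timesteps):
    runtime.setup()

    # Setup loop: iterative initial grid refinement, reinvading each pass.
    grid_refinement = True
    refinement_steps = 0
    while grid_refinement:
        runtime.invade_nonblocking({"max_cores": 4})
        refinement_steps += 1
        # setupGrid() stand-in: pretend refinement converges after 3 passes.
        grid_refinement = refinement_steps < 3

    # Time-stepping loop: renegotiate resources before every traversal.
    for _ in range(num_timesteps):
        cores = runtime.invade_nonblocking({"max_cores": 4})
        # timestepAndAdaptivityTraversals() would run on `cores` threads here.

    runtime.shutdown()
    return refinement_steps
```

Placing the invade call at the top of each loop body keeps the resource grant current before every grid traversal, at the cost of one runtime call per iteration.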

An extension with our Invasive Computing API then requires the interfaces shown in the invasive version in Table 8.2: setup(), invade_nonblocking([constr.]), (re)invade_(non)blocking([constr.]), and shutdown().

8.5.2 Iteration-based with owner-compute

Applying the Invasive Computing programming pattern to our owner-compute scheme (see Section 5.6.1) can lead to invalid cluster-tree traversals. The reason lies in the caching of the thread-id range on each node owning one of the leaf nodes in the cluster tree. With a dynamically changing number of active threads, a cached thread id can refer to a thread which no longer exists. With a shrinking number of resources, this leads to missing traversals of nodes in the cluster tree; with a growing number of resources, one or more cores remain unused. Hence, we additionally have to account for these cached thread ids; the required extension is shown in the pseudo code in Table 8.3.
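The staleness problem can be illustrated with a toy model, assuming the cache is a map from leaf clusters to owner thread ids and using a round-robin remapping as a stand-in for the actual owner-assignment scheme; both are illustrative simplifications.

```python
# Toy illustration of the stale-cache problem: each leaf of the cluster
# tree caches the id of the thread that owns it. If the number of active
# threads shrinks, some cached ids refer to threads that no longer exist,
# and the corresponding leaves would never be traversed.

def find_stale_leaves(cached_thread_ids, num_active_threads):
    """Return the leaves whose cached owner thread no longer exists."""
    return [leaf for leaf, tid in cached_thread_ids.items()
            if tid >= num_active_threads]

def update_cached_thread_ids(cached_thread_ids, num_active_threads):
    """Remap leaves round-robin onto the valid id range [0, n)."""
    return {leaf: i % num_active_threads
            for i, leaf in enumerate(sorted(cached_thread_ids))}

# Four leaves owned by threads 0..3; the resource grant shrinks to 2 threads.
cache = {"A": 0, "B": 1, "C": 2, "D": 3}
stale = find_stale_leaves(cache, 2)        # leaves C and D are orphaned
cache = update_cached_thread_ids(cache, 2) # remap before threads terminate
```

Without the remapping step, the traversals rooted at the orphaned leaves would silently be skipped, which is exactly the invalid cluster-tree traversal described above.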



    setup();

    for i in timesteps:
      // returns the new number of threads
      new_number_of_threads =
        (re)invade_(non)blocking([constraints], postponeThreadUpdate);

      if new_number_of_threads < old_threads:   // shrink
        updateCachedThreadIDs(new_number_of_threads);
        updateNumberOfRunningThreads(new_number_of_threads);

      elif new_number_of_threads > old_threads: // grow
        updateNumberOfRunningThreads(new_number_of_threads);
        updateCachedThreadIDs(new_number_of_threads);

      old_threads = new_number_of_threads;

      timestepAndAdaptivityTraversals();

    shutdown();

Table 8.3: Pseudo code for changing the number of threads with an owner-compute scheme.

Here, the function (re)invade_(non)blocking also returns the new number of threads, and the additional constraint postponeThreadUpdate requests that the number of threads is not updated directly within the invade call. Instead, the additional function updateNumberOfRunningThreads is provided, which has to be executed to update the threads based on the last (re)invade call.

If the number of threads was decreased, we first update the cached thread ids in the cluster tree (updateCachedThreadIDs) and then update the number of actively running threads, including their pinning. For a growing number of threads, we first update the number of running threads and then the cached thread ids. In both cases, this ordering ensures that a cached thread id never refers to a thread that is not currently running.
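The shrink/grow ordering can be captured as an invariant: every cached thread id must refer to a running thread at all times. A sketch of this logic, with an illustrative class and a round-robin owner assignment standing in for the real scheme:

```python
# Sketch of the shrink/grow ordering from Table 8.3. Invariant: every
# cached thread id refers to a currently running thread, so no cluster
# is skipped during a traversal. All names here are illustrative.

class ThreadedClusterTree:
    def __init__(self, num_leaves, num_threads):
        self.running_threads = num_threads
        # round-robin owner assignment, standing in for the real scheme
        self.cached_ids = [i % num_threads for i in range(num_leaves)]

    def update_cached_thread_ids(self, new_threads):
        self.cached_ids = [i % new_threads
                           for i in range(len(self.cached_ids))]

    def update_number_of_running_threads(self, new_threads):
        self.running_threads = new_threads   # real code also re-pins threads

    def invariant_holds(self):
        return all(tid < self.running_threads for tid in self.cached_ids)

    def resize(self, new_threads):
        if new_threads < self.running_threads:    # shrink: remap first
            self.update_cached_thread_ids(new_threads)
            assert self.invariant_holds()  # ids valid for the smaller pool
            self.update_number_of_running_threads(new_threads)
        elif new_threads > self.running_threads:  # grow: start threads first
            self.update_number_of_running_threads(new_threads)
            assert self.invariant_holds()  # old ids still valid
            self.update_cached_thread_ids(new_threads)
        assert self.invariant_holds()
```

Reversing either order would break the invariant in the intermediate state: shrinking the thread pool before remapping leaves orphaned ids, and remapping onto not-yet-started threads has the same effect in the growing case.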