These HTML pages are based on the PhD thesis "Cluster-Based Parallelization of Simulations on Dynamically Adaptive Grids and Dynamic Resource Management" by Martin Schreiber.
There is also more information and a PDF version available.

Chapter 5
Parallelization

Parallelizing inherently serial code, such as the Sierpiński SFC grid traversal with its stack- and stream-based data management, is a challenging task. This chapter extends the grid traversal, which has been serial so far, to a cluster-based parallelization approach.

 5.1 SFC-based parallelization methods for DAMR
  5.1.1 SFC-based domain partitioning
  5.1.2 Shared- and replicated-data scheme
  5.1.3 Partition scheduling
 5.2 Inter-partition communication and dynamic meta information
  5.2.1 Grid traversals with replicated data layout
  5.2.2 Properties of SFC-based inter-partition communication
  5.2.3 Meta information for communication
  5.2.4 Vertices uniqueness problem
  5.2.5 Exchanging communication data and additional stacks
  5.2.6 Dynamic updating of run-length-encoded adjacency information
 5.3 Parallelization with clusters
  5.3.1 Cluster definition
  5.3.2 Cluster-based framework design
  5.3.3 Cluster set
  5.3.4 Cluster unique ids
 5.4 Base domain triangulation and initialization of meta information
  5.4.1 Initial communication meta information
 5.5 Dynamic cluster generation
  5.5.1 Splitting
  5.5.2 Joining
  5.5.3 Split and join updates of meta communication information
  5.5.4 Reconstruction of vertex communication meta information
 5.6 Shared-memory parallelization
  5.6.1 Scheduling strategies
  5.6.2 Cluster generation strategies
  5.6.3 Threading libraries
 5.7 Results: Shared-memory parallelization
 5.8 Cluster-based optimization
  5.8.1 Single reduce operation for replicated data
  5.8.2 Skipping of traversals on clusters with a conforming state
  5.8.3 Improved memory consumption with RLE meta information
 5.9 Results: Long-term simulations and optimizations on shared-memory
 5.10 Distributed-memory parallelization
  5.10.1 Intra- and inter-cluster communication
  5.10.2 Dynamic cluster generation
  5.10.3 Cluster-based load balancing
  5.10.4 Distributed base triangulation
  5.10.5 Similarities with parallelization of block-adaptive grids
 5.11 Hybrid parallelization
 5.12 Results: Distributed-memory parallelization
  5.12.1 Small-scale distributed-memory scalability studies
  5.12.2 Large-scale distributed-memory strong-scalability studies
 5.13 Summary and Outlook