With the knowledge about algorithms from the area of scientific computing, we developed numerical core algorithms in invadeX10. Such algorithms are a multigrid solver with changing workload due to restrictions and prolongations, a dynamic adaptive quadrature based on recursion and lightweight tasks and a Peano-SFC-based matrix-matrix multiplication [Bad08].
Since the multigrid solver with its multi-resolution access is one of the most interesting algorithms for invasion, we selected this solver to highlight the required changes in the program structure and like to refer to [TOS00] for a detailed introduction to such multigrid solvers.
As a representative application scenario, we selected a laser engraving symbols on a two-dimensional metal plate, see Fig. 7.2. This process can be simulated with the discretization of the heat equation which is discussed next, followed by a brief introduction to the multigrid algorithm which is then used to solve the system of equations.
Let the change in heat distribution over time for a two-dimensional problem be given by
Using a Jacobi solver, a single iteration mainly smooths the high-frequency errors, only. In case of low-frequency errors, multigrid solvers are commonly used. Here, we use the error-correction scheme of the multigrid solvers which restricts the residual := - A* successively to coarser levels. Then, on each coarser level, we apply a Jacobi smoother iteration to compute the residual and restrict the residual to the coarser level. Each restriction operation then results in a reduction of the workload on each level by a factor of 4. After the execution of the smoother on the coarsest level2 , the computed error-correction is successively prolongated to the higher-resolved levels and an additional smoother iteration is executed before prolongating it to the next level.
With our heat equation and with a size of the simulation domain of 128 × 128, 7 levels with different resolutions are used. Since each level has a changing workload, this also results in a dynamical resource requirement.
For the parallelization, we use the distributed arrays from X10 to store the approximated solution of the iteration, the right side and the residual for each level.
Next, we compare a pseudo code of a non-invasive and an invasified multigrid algorithm in Fig. 7.1. To show the applicability of the dynamical resource management, we shrink the number of compute resources during the restriction.
Based on the invadeX10 framework, each v-cycle iteration is extended with the claim as a parameter which describes the currently invaded processing elements. The resources in these claims can be dynamically changed with a reinvade, based on the number of slices of the solution array on the currently restricted multigrid level. In case that a reinvade led to a change of resources (nc != nc2), also the distributed arrays are redistributed to improve the locality to the computing resources.