Abstract

Gyrokinetic simulations are extremely computationally demanding because of the high dimensionality of the physical phase space and the interplay between plasma particles and electromagnetic fields. It is thus essential to make full use of the available computational resources in order to simulate more complex physical problems. With the aim of optimizing the gyrokinetic Particle-In-Cell (PIC) code ORB5 towards exascale computing, a particle sorting method is implemented to increase data locality. Furthermore, different algorithms are used to improve vectorization, and the MPI parallelization is complemented with OpenMP. More specifically, we focus on the particle-to-grid operations involved in the PIC charge deposition step. This step is challenging to parallelize in a shared-memory paradigm because of the scatter operations involved: several particles may contribute to the same grid point, so concurrent updates can conflict. We present the different algorithms and parallelization schemes implemented in the ORB5 charge deposition step and show how they affect the speedup relative to the base MPI-only case.
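To make the scatter problem concrete, the following is a minimal sketch of charge deposition on a 1D periodic grid with linear weighting. It is not taken from ORB5: the function names (`deposit`, `sort_by_cell`, `deposit_private_grids`), the 1D geometry, and the weighting scheme are illustrative assumptions. The sketch shows two generic ideas that match the themes above: reordering particles by cell so that neighbouring particles touch neighbouring grid entries, and giving each worker a private grid copy that is reduced afterwards, which is one common way to avoid conflicting updates. The workers are emulated sequentially here; no actual threading is performed.

```python
from typing import List, Tuple


def deposit(positions: List[float], charges: List[float],
            n_cells: int, length: float) -> List[float]:
    """Scatter each particle's charge to its two nearest grid points
    (linear weighting) on a 1D periodic grid. Hypothetical helper."""
    dx = length / n_cells
    rho = [0.0] * n_cells
    for x, q in zip(positions, charges):
        s = (x % length) / dx          # position in units of cells
        frac = s - int(s)              # offset inside the cell
        i = int(s) % n_cells           # left grid point (periodic)
        rho[i] += q * (1.0 - frac)     # scatter: many particles may hit
        rho[(i + 1) % n_cells] += q * frac  # the same grid entry
    return rho


def sort_by_cell(positions: List[float], charges: List[float],
                 n_cells: int, length: float) -> Tuple[List[float], List[float]]:
    """Reorder particles by cell index so that consecutive particles
    write to nearby grid entries (better data locality)."""
    dx = length / n_cells
    order = sorted(range(len(positions)),
                   key=lambda p: int((positions[p] % length) / dx) % n_cells)
    return [positions[p] for p in order], [charges[p] for p in order]


def deposit_private_grids(positions: List[float], charges: List[float],
                          n_cells: int, length: float,
                          n_workers: int) -> List[float]:
    """Emulate (sequentially) a private-grid scheme: each worker deposits
    its chunk of particles into its own grid copy, then the copies are
    summed. Assumes n_workers >= 1."""
    n = len(positions)
    chunk = (n + n_workers - 1) // n_workers
    partial = []
    for w in range(n_workers):
        lo, hi = w * chunk, min((w + 1) * chunk, n)
        partial.append(deposit(positions[lo:hi], charges[lo:hi],
                               n_cells, length))
    # Reduction step: add the private grids entry by entry.
    return [sum(column) for column in zip(*partial)]


if __name__ == "__main__":
    pos = [3.75, 0.1, 2.25, 0.9, 3.99, 1.5]
    chg = [1.0] * len(pos)
    pos, chg = sort_by_cell(pos, chg, n_cells=4, length=4.0)
    print(deposit(pos, chg, 4, 4.0))
    print(deposit_private_grids(pos, chg, 4, 4.0, n_workers=3))
```

The two printed grids should agree up to floating-point rounding, and the total deposited charge equals the sum of the particle charges. The private-grid approach trades extra memory and a reduction pass for conflict-free updates; whether that trade pays off depends on grid size and thread count.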
