Towards the optimization of a gyrokinetic Particle-In-Cell (PIC) code on large-scale hybrid architectures
With the aim of enabling state-of-the-art gyrokinetic PIC codes to benefit from the performance of recent multithreaded devices, we developed an application from a platform called the "PIC-engine" [1, 2, 3] embedding simplified basic features of the PIC method. The application solves the gyrokinetic equations in a sheared plasma slab using B-spline finite elements up to fourth order to represent the self-consistent electrostatic field. Preliminary studies of the so-called Particle-In-Fourier (PIF) approach, which uses Fourier modes as basis functions in the periodic dimensions of the system instead of the real-space grid, show that this method can be faster than PIC for simulations with a small number of Fourier modes. Similarly to the PIC-engine, multiple levels of parallelism have been implemented using MPI+OpenMP [2] and MPI+OpenACC [1], the latter exploiting the computational power of GPUs without requiring complete code rewriting. It is shown that sorting particles [3] can lead to performance improvement by increasing data locality and vectorizing grid memory access. Weak scalability tests have been successfully run on the GPU-equipped Cray XC30 Piz Daint (at CSCS) up to 4,096 nodes. The reduced time-to-solution will enable more realistic and thus more computationally intensive simulations of turbulent transport in magnetic fusion devices.
2016
775
1
This work is partly funded by the Swiss National Science Foundation and the Platform for Advanced Scientific Computing. It has been carried out within the framework of the EUROfusion Consortium and has received funding from the Euratom research and training programme 2014-2018 under grant agreement No 633053. The views and opinions expressed herein do not necessarily reflect those of the European Commission.
REVIEWED
EPFL
Event name | Event place | Event date |
Varenna, Italy | 29 August - 02 September 2016 | |