000190464 001__ 190464
000190464 005__ 20190316235742.0
000190464 0247_ $$2doi$$a10.1016/j.jcp.2009.06.041
000190464 022__ $$a0021-9991
000190464 02470 $$2ISI$$a000273389500001
000190464 037__ $$aARTICLE
000190464 245__ $$aNodal discontinuous Galerkin methods on graphics processors
000190464 269__ $$a2009
000190464 260__ $$bElsevier$$c2009
000190464 336__ $$aJournal Articles
000190464 520__ $$aDiscontinuous Galerkin (DG) methods for the numerical solution of partial differential equations have enjoyed considerable success because they are both flexible and robust: They allow arbitrary unstructured geometries and easy control of accuracy without compromising simulation stability. Lately, another property of DG has been growing in importance: The majority of a DG operator is applied in an element-local way, with weak penalty-based element-to-element coupling. The resulting locality in memory access is one of the factors that enables DG to run on off-the-shelf, massively parallel graphics processors (GPUs). In addition, DG's high-order nature lets it require fewer data points per represented wavelength and hence fewer memory accesses, in exchange for higher arithmetic intensity. Both of these factors work significantly in favor of a GPU implementation of DG. Using a single US$400 Nvidia GTX 280 GPU, we accelerate a solver for Maxwell's equations on a general 3D unstructured grid by a factor of around 50 relative to a serial computation on a current-generation CPU. In many cases, our algorithms exhibit full use of the device's available memory bandwidth. Example computations achieve and surpass 200 gigaflops/s of net application-level floating point work. In this article, we describe and derive the techniques used to reach this level of performance. In addition, we present comprehensive data on the accuracy and runtime behavior of the method. (C) 2009 Elsevier Inc. All rights reserved.
000190464 6531_ $$aDiscontinuous Galerkin
000190464 6531_ $$aHigh order
000190464 6531_ $$aGPU
000190464 6531_ $$aParallel computation
000190464 6531_ $$aMany-core
000190464 6531_ $$aMaxwell's equations
000190464 700__ $$aKloeckner, A.
000190464 700__ $$aWarburton, T.
000190464 700__ $$aBridge, J.
000190464 700__ $$g232231$$aHesthaven, Jan S.$$0247428
000190464 773__ $$j228$$tJournal of Computational Physics$$k21$$q7863-7882
000190464 8564_ $$uhttps://infoscience.epfl.ch/record/190464/files/JCP2009.pdf$$zPostprint$$s900503$$yPostprint
000190464 909C0 $$xU12703$$0252492$$pMCSS
000190464 909CO $$qGLOBAL_SET$$pSB$$ooai:infoscience.tind.io:190464$$particle
000190464 917Z8 $$x232231
000190464 937__ $$aEPFL-ARTICLE-190464
000190464 970__ $$aKloeckner2009/MCSS
000190464 973__ $$rREVIEWED$$sPUBLISHED$$aOTHER
000190464 980__ $$aARTICLE