Parallel algorithms and efficient implementation techniques for finite element approximations

Popescu, Radu

doi:10.5075/epfl-thesis-5980

2013

Abstract

In this thesis we study the efficient implementation of the finite element method for the numerical solution of partial differential equations (PDEs) on modern parallel computer architectures, such as Cray and IBM supercomputers. The domain-decomposition (DD) method represents the basis of parallel finite element software and is generally implemented such that the number of subdomains is equal to the number of MPI processes. We are interested in breaking this paradigm by introducing a second level of parallelism. Each subdomain is assigned to more than one processor, and either MPI processes or multiple threads are used to implement the parallelism on the second level. The thesis is devoted to the study of this second level of parallelism and includes the stages described below.

The algebraic additive Schwarz (AAS) domain-decomposition preconditioner is an integral part of the solution process. We seek to understand its performance on the parallel computers which we target and we introduce an improved construction approach for the parallel preconditioner. We examine a novel strategy for solving the AAS subdomain problems using multiple MPI processes. At the subdomain level, this is represented by the ShyLU preconditioner. We bring improvements to its algorithm in the form of a novel inexact solver based on an incomplete QR (IQR) factorization. The performance of the new preconditioner framework is studied for Laplacian and advection-diffusion-reaction (ADR) problems and for Navier-Stokes problems, as a component within a larger framework of specialized preconditioners.

The partitioning of the computational mesh comes with considerable memory limitations when done at runtime on parallel computers, due to the low amount of available memory per processor. We describe and implement a solution to this problem, based on offloading the partitioning process to a preliminary offline stage of the simulation process. We also present the efficient implementation, based on parallel MPI collective instructions, of the routines which load the mesh parts during the simulation.

We discuss an alternative parallel implementation of the finite element system assembly based on multi-threading. This new approach is used to supplement the existing one based on MPI parallelism, in situations where MPI alone cannot make use of all the available parallel hardware resources.

The work presented in the thesis has been done in the framework of two software projects: the Trilinos project and the LifeV parallel finite element modeling library. All the new developments have been contributed back to the respective projects, to be used freely in subsequent public releases of the software.
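
The abstract names the algebraic additive Schwarz (AAS) preconditioner without showing its structure. The following is a minimal serial sketch of a one-level AAS preconditioner applied inside a preconditioned conjugate gradient loop for a 1D Laplacian. It is not the thesis implementation, which is built on Trilinos and LifeV and uses ShyLU for the subdomain solves. All function names, the dense direct subdomain solver, and the index-range partitioning are assumptions made for illustration only.

```python
# Illustrative one-level algebraic additive Schwarz (AAS) preconditioner.
# Assumptions: dense matrices as lists of lists, exact subdomain solves by
# Gaussian elimination, contiguous overlapping index ranges as subdomains.

def laplacian_1d(n):
    """Dense 1D Laplacian stencil matrix: 2 on the diagonal, -1 next to it."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def solve_dense(A, b):
    """Gaussian elimination with partial pivoting on an augmented copy."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def overlapping_subdomains(n, parts, overlap):
    """Split indices 0..n-1 into `parts` contiguous ranges, each extended by `overlap`."""
    size = n // parts
    subs = []
    for p in range(parts):
        lo = max(0, p * size - overlap)
        hi = min(n, (p + 1) * size + overlap) if p < parts - 1 else n
        subs.append(list(range(lo, hi)))
    return subs

def aas_apply(A, subdomains, r):
    """Compute z = sum_i R_i^T A_i^{-1} R_i r, where A_i = R_i A R_i^T."""
    z = [0.0] * len(r)
    for idx in subdomains:
        # The subdomain solves are independent of one another; in a parallel
        # code each one would run on its own process or group of processes.
        A_i = [[A[i][j] for j in idx] for i in idx]
        z_i = solve_dense(A_i, [r[i] for i in idx])
        for i, v in zip(idx, z_i):
            z[i] += v
    return z

def pcg(A, b, precond, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradient; returns (solution, iterations used)."""
    x = [0.0] * len(b)
    r = b[:]
    z = precond(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

# Small demonstration: 32 unknowns, 4 subdomains, overlap of 2 indices.
A = laplacian_1d(32)
subs = overlapping_subdomains(32, 4, 2)
x, iters = pcg(A, [1.0] * 32, lambda r: aas_apply(A, subs, r))
print("PCG iterations with AAS preconditioner:", iters)
```

In this sketch each subdomain problem is solved exactly and serially. The second level of parallelism described in the abstract corresponds to replacing `solve_dense` with a solver that itself runs on several processes or threads.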

Details

Title Parallel algorithms and efficient implementation techniques for finite element approximations

Author(s) Popescu, Radu

Advisor(s)

Quarteroni, Alfio
Deparis, Simone

Date 2013

Publisher Lausanne, EPFL

Keywords

finite element method; parallel preconditioners; MPI; multi-threading

Language English

DOI https://doi.org/10.5075/epfl-thesis-5980

Other identifier(s) urn: urn:nbn:ch:bel-epfl-thesis5980-6

Laboratories CMCS

Record appears in

Scientific production and competences > SB - School of Basic Sciences > SB Archives > CMCS - Chair of Modelling and Scientific Computing
Scientific production and competences > SB - School of Basic Sciences > Mathematics
Scientific production and competences > EPFL Theses
Work produced at EPFL
Published
Theses

Record creation date 2013-11-11
