Infoscience

research article

A Novel Parallel QR Algorithm For Hybrid Distributed Memory HPC Systems

Granat, Robert • Kagstrom, Bo • Kressner, Daniel
2010
SIAM Journal On Scientific Computing

A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing systems is presented. For this purpose, we introduce the concept of multiwindow bulge chain chasing and parallelize aggressive early deflation. The multiwindow approach ensures that most computations when chasing chains of bulges are performed in level 3 BLAS operations, while the aim of aggressive early deflation is to speed up the convergence of the QR algorithm. Mixed MPI-OpenMP coding techniques are utilized for porting the codes to distributed memory platforms with multithreaded nodes, such as multicore processors. Numerous numerical experiments confirm the superior performance of our parallel QR algorithm in comparison with the existing ScaLAPACK code, leading to an implementation that is one to two orders of magnitude faster for sufficiently large problems, including a number of examples from applications.
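For readers unfamiliar with the underlying iteration, the following is a minimal sketch of the textbook unshifted QR iteration on an upper Hessenberg matrix, with a standard small-subdiagonal convergence test. It is not the multiwindow bulge chasing or the parallel aggressive early deflation described in the abstract, and it is not the authors' code. The function names `qr_step` and `qr_eigenvalues`, the tolerance, and the test matrix are illustrative assumptions. The sketch assumes real eigenvalues of distinct modulus, which is the setting where the unshifted iteration converges to a triangular form.

```python
import math

def qr_step(H):
    """One unshifted QR step H -> R*Q for an upper Hessenberg matrix H.

    Hypothetical helper for illustration; uses Givens rotations.
    """
    n = len(H)
    R = [row[:] for row in H]
    rotations = []
    # Left-multiply by Givens rotations to zero each subdiagonal entry.
    for k in range(n - 1):
        a, b = R[k][k], R[k + 1][k]
        r = math.hypot(a, b)
        c, s = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
        rotations.append((c, s))
        for j in range(k, n):
            x, y = R[k][j], R[k + 1][j]
            R[k][j] = c * x + s * y
            R[k + 1][j] = -s * x + c * y
        R[k + 1][k] = 0.0  # exact zero, avoids rounding residue
    # Right-multiply by the transposed rotations to form R*Q.
    for k, (c, s) in enumerate(rotations):
        for i in range(k + 2):
            x, y = R[i][k], R[i][k + 1]
            R[i][k] = c * x + s * y
            R[i][k + 1] = -s * x + c * y
    return R

def qr_eigenvalues(H, tol=1e-12, max_iter=1000):
    """Iterate QR steps until every subdiagonal entry is negligible.

    Assumes real eigenvalues of distinct modulus (an assumption of this
    sketch, not a restriction of the algorithm in the paper).
    """
    H = [row[:] for row in H]
    n = len(H)
    for _ in range(max_iter):
        converged = all(
            abs(H[i + 1][i]) <= tol * (abs(H[i][i]) + abs(H[i + 1][i + 1]))
            for i in range(n - 1)
        )
        if converged:
            break
        H = qr_step(H)
    return [H[i][i] for i in range(n)]

# Companion matrix of (x - 1)(x - 2)(x - 4): nonsymmetric, upper Hessenberg.
A = [[7.0, -14.0, 8.0],
     [1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
eigenvalues = sorted(qr_eigenvalues(A))
```

For this test matrix the returned diagonal entries should be close to 1, 2 and 4, the roots of the polynomial the companion matrix was built from. Practical codes add shifts, deflation, and blocked updates to this basic iteration, which is where techniques such as those in the abstract come in.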

Type
research article
DOI
10.1137/090756934
Author(s)
Granat, Robert
Kagstrom, Bo
Kressner, Daniel  
Date Issued
2010
Published in
SIAM Journal On Scientific Computing
Volume
32
Issue
4
Start page
2345
End page
2378
Subjects
eigenvalue problem • nonsymmetric QR algorithm • multishift • bulge chasing • parallel computations • level 3 performance • aggressive early deflation • parallel algorithms • hybrid distributed memory systems
Aggressive Early Deflation • Algebraic Riccati-Equations • Matrix Sign Function • Level 3 BLAS • QZ Algorithm • Schur Forms • Performance • Reduction • Architectures • Variants
Editorial or Peer reviewed
REVIEWED
Written at
OTHER
EPFL units
ANCHP  
Available on Infoscience
May 5, 2011
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/67090