Infoscience

doctoral thesis

Fault-tolerant dynamic parallel schedules

Gerlach, Sebastian  
2006

Dynamic Parallel Schedules (DPS) is a high-level framework for developing parallel applications on distributed-memory computers such as clusters of PCs. DPS applications are defined as directed acyclic flow graphs composed of user-defined operations. These operations derive from basic concepts provided by the framework: split, merge, leaf and stream operations. A simple parallel application can be expressed as a split-leaf-merge sequence of operations, but flow graphs of arbitrary complexity can also be created. DPS provides run-time support for dynamically mapping flow graph operations onto the nodes of a cluster. The flow-graph-based application description allows the framework to offer many additional features, most of them transparently to the application developer. To maximize performance, DPS applications benefit from automatic overlapping of computations and communications and from implicit pipelining. The framework provides simple primitives for flow control and load balancing. Applications can integrate flow graph parts provided by other applications as parallel components. Since the mapping of DPS applications to processing nodes can be changed dynamically at run time, DPS provides a basis for developing malleable applications. The DPS framework provides a complete fault tolerance mechanism based on the dynamic mapping capabilities, ensuring continued execution of parallel applications even in the presence of multiple node failures. DPS is provided as an open-source, cross-platform C++ library, allowing DPS applications and services to run on heterogeneous clusters.
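
The split-leaf-merge sequence mentioned in the abstract can be illustrated with a minimal sketch in standard C++. This is not the DPS API: every name below (split_range, leaf_square_sum, merge_sums, sum_of_squares) is hypothetical, and the threads started by std::async on a single machine merely stand in for the cluster nodes onto which DPS would map the operations. The sketch splits an input into chunks, processes each chunk in its own leaf task, and merges the partial results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical names throughout; this sketch does not use the DPS library.

// Split: cut the input into at most `parts` contiguous chunks.
std::vector<std::vector<int>> split_range(const std::vector<int>& input,
                                          std::size_t parts) {
    assert(parts > 0);
    std::vector<std::vector<int>> chunks;
    const std::size_t chunk_size = (input.size() + parts - 1) / parts;  // ceiling division
    for (std::size_t begin = 0; begin < input.size(); begin += chunk_size) {
        const std::size_t end = std::min(begin + chunk_size, input.size());
        chunks.emplace_back(input.begin() + static_cast<std::ptrdiff_t>(begin),
                            input.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return chunks;
}

// Leaf: the per-chunk computation, here a sum of squares.
long long leaf_square_sum(const std::vector<int>& chunk) {
    long long sum = 0;
    for (int value : chunk) {
        sum += static_cast<long long>(value) * value;
    }
    return sum;
}

// Merge: wait for every leaf result and combine them into one value.
long long merge_sums(std::vector<std::future<long long>>& partials) {
    long long total = 0;
    for (auto& partial : partials) {
        total += partial.get();
    }
    return total;
}

// Split-leaf-merge pipeline: one asynchronous task per chunk.
long long sum_of_squares(const std::vector<int>& input, std::size_t parts) {
    const std::vector<std::vector<int>> chunks = split_range(input, parts);
    std::vector<std::future<long long>> partials;
    partials.reserve(chunks.size());
    for (const auto& chunk : chunks) {
        // `chunks` outlives every task because merge_sums waits on all futures.
        partials.push_back(std::async(std::launch::async,
                                      [&chunk] { return leaf_square_sum(chunk); }));
    }
    return merge_sums(partials);
}
```

For example, sum_of_squares({1, 2, 3, 4, 5}, 2) splits the input into the chunks {1, 2, 3} and {4, 5}, evaluates them concurrently, and returns 55. On some toolchains the program must be built with thread support enabled (for instance -pthread with GCC or Clang).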

Type
doctoral thesis
DOI
10.5075/epfl-thesis-3471
Author(s)
Gerlach, Sebastian  
Advisors
Hersch, Roger-David  
Jury
Bastien Chopard, Ralf Gruber, Herbert Kuchen, Claude Petitpierre
Date Issued
2006
Publisher
EPFL
Publisher place
Lausanne
Public defense date
2006-03-10
Thesis number
3471
Total pages
157

Subjects
  • parallel programming
  • fault-tolerance
  • dynamic resource allocation
  • parallel application frameworks
  • flow graphs
  • parallel schedules

EPFL units
LSP  
Faculty
IC  
Section
IC-SIN  
School
ISIM  
Available on Infoscience
January 23, 2006
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/221697