Decentralized in-order execution of a sequential task-based code for shared-memory architectures

Castes, Charly; Agullo, Emmanuel; Aumage, Olivier; Saillard, Emmanuelle

doi:10.1109/IPDPSW55747.2022.00095

conference paper

January 1, 2022

2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022)

36th IEEE International Parallel and Distributed Processing Symposium (IEEE IPDPS)

The hardware complexity of modern machines makes the design of adequate programming models crucial for jointly ensuring performance, portability, and productivity in high-performance computing (HPC). Sequential task-based programming models paired with advanced runtime systems allow the programmer to write a sequential algorithm independently of the hardware architecture in a productive and portable manner, and let a third-party software layer (the runtime system) deal with the burden of scheduling a correct, parallel execution of that algorithm to ensure performance. Many HPC algorithms have been successfully implemented following this paradigm, a testimony to its effectiveness.

Developing algorithms that specifically require fine-grained tasks along this model is still considered prohibitive, however, due to per-task management overhead [1], forcing the programmer to resort to a less abstract, and hence more complex, "task+X" model. We thus investigate the possibility of offering a tailored execution model, trading dynamic mapping for efficiency by using a decentralized, conservative in-order execution of the task flow, while preserving the benefits of relying on the sequential task-based programming model. We propose a formal specification of the execution model as well as a prototype implementation, which we assess on a shared-memory multicore architecture with several synthetic workloads. The results show that, provided the programmer supplies a proper task mapping, the pressure on the runtime system is significantly reduced and the execution of fine-grained task flows is much more efficient.
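The idea of a decentralized, in-order execution with a programmer-supplied mapping can be illustrated with a minimal sketch. This is not the authors' prototype: the abstract does not describe the synchronization mechanism, so the per-data counters below, as well as the names `Data`, `run_in_order`, and the `(owner, func, accesses)` task format, are assumptions made for illustration only. Each worker walks the same sequential task flow, runs only the tasks mapped to it, and uses private bookkeeping to know which accesses must have completed before its own task may start. No central scheduler is involved.

```python
import threading

class Data:
    """Hypothetical shared handle: a value plus counters of completed accesses."""
    def __init__(self, value):
        self.value = value
        self.writes_done = 0   # writes completed so far
        self.reads_done = 0    # reads completed since the last write
        self.cond = threading.Condition()

def run_in_order(tasks, num_workers):
    """tasks: list of (owner, func, [(data, "R" or "W"), ...]) in sequential order.
    Assumes each task lists a given data handle at most once."""
    def worker(me):
        writes_seen, reads_seen = {}, {}   # private view of the task flow
        for owner, func, accesses in tasks:
            if owner == me:
                # Wait until every earlier conflicting access has completed.
                for d, mode in accesses:
                    w = writes_seen.get(id(d), 0)
                    r = reads_seen.get(id(d), 0)
                    with d.cond:
                        if mode == "R":
                            d.cond.wait_for(lambda: d.writes_done == w)
                        else:
                            d.cond.wait_for(
                                lambda: d.writes_done == w and d.reads_done == r)
                func(*(d for d, _ in accesses))
                # Publish completion so that waiting workers can proceed.
                for d, mode in accesses:
                    with d.cond:
                        if mode == "R":
                            d.reads_done += 1
                        else:
                            d.writes_done += 1
                            d.reads_done = 0
                        d.cond.notify_all()
            # Every worker records every task, whether it ran it or not.
            for d, mode in accesses:
                if mode == "R":
                    reads_seen[id(d)] = reads_seen.get(id(d), 0) + 1
                else:
                    writes_seen[id(d)] = writes_seen.get(id(d), 0) + 1
                    reads_seen[id(d)] = 0

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

# Toy task flow, written sequentially and mapped by hand onto two workers.
def incr(x): x.value += 1
def scale(x, y): y.value = x.value * 10
def shift(x, z): z.value = x.value + 5
def combine(y, z, x): x.value = y.value + z.value

x, y, z = Data(1), Data(0), Data(0)
flow = [
    (0, incr,    [(x, "W")]),
    (1, scale,   [(x, "R"), (y, "W")]),
    (0, shift,   [(x, "R"), (z, "W")]),
    (1, combine, [(y, "R"), (z, "R"), (x, "W")]),
]
run_in_order(flow, num_workers=2)
print(x.value, y.value, z.value)  # prints "27 20 7"
```

In this sketch `scale` and `shift` may run concurrently because both only read `x`, while `combine` waits for both of them. Because of Python's global interpreter lock the example shows the ordering logic only, not any performance benefit.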

Type

conference paper

DOI

10.1109/IPDPSW55747.2022.00095

Web of Science ID

WOS:000855041000069

Authors

Castes, Charly; Agullo, Emmanuel; Aumage, Olivier; Saillard, Emmanuelle

Publication date

2022-01-01

Publisher

IEEE COMPUTER SOC

Published in

2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2022)

ISBN of the book

978-1-6654-9747-3

Publisher place

Los Alamitos

Series title/Series vol.

IEEE International Symposium on Parallel and Distributed Processing Workshops

Start page

552

End page

561

Subjects

Computer Science, Har...

Computer Science, The...

Computer Science

Peer reviewed

REVIEWED

EPFL units

DCSL

Event name	Event place	Event date
36th IEEE International Parallel and Distributed Processing Symposium (IEEE IPDPS)	ELECTR NETWORK	May 30-Jun 03, 2022

Available on Infoscience

October 10, 2022

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/191312