Straight to the Queue: Fast Load-Store Queue Allocation in Dataflow Circuits

Elakhras, Ayatallah; Sawhney, Riya; Guerrieri, Andrea; Josipovic, Lana; Ienne, Paolo

doi:10.1145/3543622.3573050

conference paper

January 1, 2023

Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA 2023

31st ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA)

Dynamically scheduled high-level synthesis can exploit high levels of parallelism in poorly predictable, control-dominated applications. Yet, dataflow circuits are often generated by literal conversion of basic blocks into circuits interconnected in such a way as to mimic the program's sequential execution. Although correct and quite effective in many cases, this adherence to control flow still significantly limits exploitable parallelism. Recent research introduced techniques to deliver data tokens directly from producers to consumers and achieved tangible benefits both in circuit complexity and execution time. Unfortunately, while this successfully addressed ordinary data dependencies, the problem of potential dependencies through memory remains open: when no technique can statically disambiguate accesses, circuits must be built with load-store queues (LSQs) which, to reorder accesses safely, need memory accesses to be allocated in the queues in program order. Such in-order allocation still demands control circuitry emulating sequential execution, with its negative impact on parallelization. In this paper, we transform potential memory dependencies into virtual data dependencies and use the new direct token delivery strategy to allocate accesses sequentially into the LSQ. In other words, we exploit more parallelism by constructing control circuitry to emulate exclusively those parts of the control flow strictly necessary for in-order allocation. Our results show that we can achieve up to a 74% reduction in execution time compared to prior work, in some cases at no area cost.

Type

conference paper

DOI

10.1145/3543622.3573050

Web of Science ID

WOS:001147764100005

Author(s)

Elakhras, Ayatallah

Sawhney, Riya

Guerrieri, Andrea

Josipovic, Lana

Ienne, Paolo

Corporate authors

ACM

Date Issued

2023-01-01

Publisher

Association for Computing Machinery

Publisher place

New York

Published in

Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA 2023

ISBN of the book

978-1-4503-9417-8

Start page

39

End page

45

Subjects

Technology

High-Level Synthesis

Dataflow

Load-Store Queue

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

LAP

Event name

31st ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA)

Event place

Monterey, CA

Event date

February 12-14, 2023

Funder	Grant Number
Huawei

Available on Infoscience

February 23, 2024

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/205343