Infoscience

conference paper

Straight to the Queue: Fast Load-Store Queue Allocation in Dataflow Circuits

Elakhras, Ayatallah • Sawhney, Riya • Guerrieri, Andrea • Josipovic, Lana • Ienne, Paolo
January 1, 2023
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA 2023
31st ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA)

Dynamically scheduled high-level synthesis can exploit high levels of parallelism in poorly predictable, control-dominated applications. Yet, dataflow circuits are often generated by literal conversion of basic blocks into circuits interconnected in such a way as to mimic the program's sequential execution. Although correct and quite effective in many cases, this adherence to control flow still significantly limits exploitable parallelism. Recent research introduced techniques to deliver data tokens directly from producers to consumers and achieved tangible benefits in both circuit complexity and execution time. Unfortunately, while this successfully addressed ordinary data dependencies, the problem of potential dependencies through memory remains open: when no technique can statically disambiguate accesses, circuits must be built with load-store queues (LSQs) which, to reorder accesses safely, need memory accesses to be allocated in the queues in program order. Such in-order allocation still demands control circuitry that emulates sequential execution, with its negative impact on parallelization. In this paper, we transform potential memory dependencies into virtual data dependencies and use the new direct token delivery strategy to allocate accesses sequentially into the LSQ. In other words, we exploit more parallelism by constructing control circuitry that emulates only those parts of the control flow strictly necessary for in-order allocation. Our results show that we can achieve up to a 74% reduction in execution time compared to prior work and, in some cases, at no area cost.
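
To illustrate why an LSQ needs accesses allocated in program order, the following is a minimal software sketch of a generic load-store queue. It is not the circuit or allocation mechanism proposed in the paper; the names `ToyLSQ`, `Entry`, `allocate`, `set_address`, and `try_load` are hypothetical and chosen only for this example. The sketch assumes that the position of an entry in the queue encodes program order, so a load can tell which stores are older and must be checked for a possible address conflict.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    kind: str                   # "load" or "store"
    addr: Optional[int] = None  # unknown until the address arrives
    data: Optional[int] = None  # store data; unused for loads


class ToyLSQ:
    """Toy model (an assumption of this sketch, not the paper's design)."""

    def __init__(self, memory: List[int]) -> None:
        self.memory = memory
        self.entries: List[Entry] = []  # list index == program order

    def allocate(self, kind: str) -> int:
        # Allocation must follow program order: the returned index is the
        # only record of which accesses are older than which.
        self.entries.append(Entry(kind))
        return len(self.entries) - 1

    def set_address(self, idx: int, addr: int, data: Optional[int] = None) -> None:
        # Addresses (and store data) may arrive in any order.
        entry = self.entries[idx]
        entry.addr, entry.data = addr, data

    def try_load(self, idx: int) -> Optional[int]:
        """Return the loaded value, or None if the load must still wait."""
        load = self.entries[idx]
        if load.addr is None:
            return None
        # Scan older entries, youngest first.
        for older in reversed(self.entries[:idx]):
            if older.kind != "store":
                continue
            if older.addr is None:
                return None          # older store may alias: wait
            if older.addr == load.addr:
                return older.data    # forward from the conflicting store
        return self.memory[load.addr]  # no older store conflicts


memory = [0, 0, 0, 7]
lsq = ToyLSQ(memory)
store_idx = lsq.allocate("store")  # program order: store first ...
load_idx = lsq.allocate("load")    # ... then load
lsq.set_address(load_idx, 3)       # load address arrives early
print(lsq.try_load(load_idx))      # None: older store address still unknown
lsq.set_address(store_idx, 3, data=5)
print(lsq.try_load(load_idx))      # 5: forwarded from the older store
```

If the two entries were allocated in the opposite order, the load would read the stale value 7 from memory, which is the hazard that in-order allocation prevents.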

Type: conference paper
DOI: 10.1145/3543622.3573050
Web of Science ID: WOS:001147764100005
Author(s): Elakhras, Ayatallah • Sawhney, Riya • Guerrieri, Andrea • Josipovic, Lana • Ienne, Paolo
Corporate authors: ACM
Date Issued: 2023-01-01
Publisher: Association for Computing Machinery
Publisher place: New York
Published in: Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA 2023
ISBN of the book: 978-1-4503-9417-8
Start page: 39
End page: 45
Subjects: Technology • High-Level Synthesis • Dataflow • Load-Store Queue
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LAP
Event name: 31st ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA)
Event place: Monterey, CA
Event date: February 12-14, 2023
Funder: Huawei
Grant Number: not listed
Available on Infoscience: February 23, 2024
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/205343

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved