Automatic Generation of Efficient Accelerators for Reconfigurable Hardware

Koeplinger, David; Prabhakar, Raghu; Zhang, Yaqi; Delimitrou, Christina; Kozyrakis, Christos; Olukotun, Kunle

doi:10.1109/Isca.2016.20

2016

Abstract

Acceleration in the form of customized datapaths offers large performance and energy improvements over general-purpose processors. Reconfigurable fabrics such as FPGAs are gaining popularity for use in implementing application-specific accelerators, thereby increasing the importance of having good high-level FPGA design tools. However, current tools for targeting FPGAs offer inadequate support for high-level programming, resource estimation, and rapid and automatic design space exploration. We describe a design framework that addresses these challenges. We introduce a new representation of hardware using parameterized templates that captures locality and parallelism information at multiple levels of nesting. This representation is designed to be automatically generated from high-level languages based on parallel patterns. We describe a hybrid area estimation technique that uses template-level models and design-level artificial neural networks to account for effects from hardware place-and-route tools, including routing overheads, register and block RAM duplication, and LUT packing. Our runtime estimation accounts for off-chip memory accesses. We use our estimation capabilities to rapidly explore a large space of designs across tile sizes, parallelization factors, and optional coarse-grained pipelining, all at multiple loop levels. We show that estimates average 4.8% error for logic resources and 6.1% error for runtimes, and are 279 to 6533 times faster than a commercial high-level synthesis tool. We compare the best-performing designs to optimized CPU code running on a server-grade 6-core processor and show speedups of up to 16.7x.

Details

Title Automatic Generation of Efficient Accelerators for Reconfigurable Hardware

Author(s) Koeplinger, David ; Prabhakar, Raghu ; Zhang, Yaqi ; Delimitrou, Christina ; Kozyrakis, Christos ; Olukotun, Kunle

Published in 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA)

Pagination 13

Series Conference Proceedings Annual International Symposium on Computer Architecture

Pages 115-127

Conference 43rd ACM/IEEE Annual International Symposium on Computer Architecture (ISCA), Seoul, South Korea, June 18-22, 2016

Date 2016

Publisher New York, IEEE

ISSN 1063-6897

ISBN 978-1-4673-8947-1

DOI https://doi.org/10.1109/Isca.2016.20

Laboratories SAIL

Record creation date 2017-01-24