Stop Crying Over Your Cache Miss Rate: Handling Efficiently Thousands of Outstanding Misses in FPGAs

Asiatici, Mikhail; Ienne, Paolo

doi:10.1145/3289602.3293901

2019

Abstract

FPGAs rely on massive datapath parallelism to accelerate applications even with a low clock frequency. However, applications such as sparse linear algebra and graph analytics have their throughput limited by irregular accesses to external memory, for which typical caches provide little benefit because of very frequent misses. Non-blocking caches are widely used on CPUs to reduce the negative impact of misses and thus increase the performance of applications with a low cache hit rate; however, they rely on associative lookup for handling multiple outstanding misses, which limits their scalability, especially on FPGAs. This results in frequent stalls whenever the application has a very low hit rate. In this paper, we show that by handling thousands of outstanding misses without stalling we can achieve a massive increase in memory-level parallelism, which can significantly speed up irregular, memory-bound, latency-insensitive applications. By storing miss information in cuckoo hash tables in block RAM instead of associative memory, we show how a non-blocking cache can be modified to support up to three orders of magnitude more misses. The resulting miss-optimized architecture provides new Pareto-optimal and even Pareto-dominant design points in the area-delay space for twelve large sparse matrix-vector multiplication benchmarks, providing up to 25% speedup with 24x area reduction or up to 2x speedup with similar area compared to traditional hit-optimized architectures.
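
The idea of tracking outstanding misses in a cuckoo hash table rather than in associative memory can be illustrated with a small software model. The sketch below is not the authors' hardware design: the class name `CuckooMissTable`, the hash multipliers, the table sizes, and the `"primary"`/`"secondary"`/`"stall"` return values are all assumptions made for illustration. It only shows the general technique, in which each cache-line address maps to one candidate slot per table, so a lookup reads a fixed number of slots instead of searching every entry.

```python
class CuckooMissTable:
    """Illustrative software model (hypothetical, not the paper's RTL) of
    outstanding-miss tracking with a multi-table cuckoo hash."""

    # Assumed odd multipliers, one per table; any independent hashes would do.
    _MULTIPLIERS = (0x9E3779B1, 0x85EBCA6B, 0xC2B2AE35, 0x27D4EB2F)

    def __init__(self, num_tables=2, slots_per_table=8, max_kicks=16):
        self.num_tables = num_tables
        self.slots = slots_per_table
        self.max_kicks = max_kicks
        # Each slot is None or a (line_address, pending_request_ids) pair.
        self.tables = [[None] * slots_per_table for _ in range(num_tables)]

    def _index(self, table, line_addr):
        mult = self._MULTIPLIERS[table % len(self._MULTIPLIERS)]
        return (((line_addr * mult) & 0xFFFFFFFF) >> 16) % self.slots

    def lookup(self, line_addr):
        """Return the pending-request list for this line, or None."""
        for t in range(self.num_tables):
            slot = self.tables[t][self._index(t, line_addr)]
            if slot is not None and slot[0] == line_addr:
                return slot[1]
        return None

    def _insert(self, entry):
        snapshot = [row[:] for row in self.tables]  # lets us undo on failure
        table = 0
        for _ in range(self.max_kicks):
            # Use any empty candidate slot if one exists.
            for t in range(self.num_tables):
                i = self._index(t, entry[0])
                if self.tables[t][i] is None:
                    self.tables[t][i] = entry
                    return True
            # All candidates are full: displace one occupant and retry with it.
            i = self._index(table, entry[0])
            entry, self.tables[table][i] = self.tables[table][i], entry
            table = (table + 1) % self.num_tables
        self.tables = snapshot  # give up and restore the previous state
        return False

    def record_miss(self, line_addr, request_id):
        pending = self.lookup(line_addr)
        if pending is not None:
            pending.append(request_id)  # merge with the in-flight miss
            return "secondary"
        if self._insert((line_addr, [request_id])):
            return "primary"  # a new memory request would be issued here
        return "stall"  # no room: the requester has to wait

    def complete(self, line_addr):
        """Memory response arrived: free the entry, return waiting requests."""
        for t in range(self.num_tables):
            i = self._index(t, line_addr)
            slot = self.tables[t][i]
            if slot is not None and slot[0] == line_addr:
                self.tables[t][i] = None
                return slot[1]
        return []
```

In this model, calling `record_miss` twice with the same line address yields one primary and one secondary miss, and `complete` then returns both request identifiers. Restoring a snapshot when insertion fails is a convenience of the software model only; how a hardware implementation handles failed insertions is not described in the abstract.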

Details

Title Stop Crying Over Your Cache Miss Rate: Handling Efficiently Thousands of Outstanding Misses in FPGAs

Author(s) Asiatici, Mikhail ; Ienne, Paolo

Published in Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA '19)

Pages 310-319

Conference ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), Feb 24-26, 2019, Seaside, CA

Date 2019-01-01

Publisher New York, Association for Computing Machinery

ISBN 978-1-4503-6137-8

DOI https://doi.org/10.1145/3289602.3293901

Laboratories LAP

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > LAP - Processor Architecture Laboratory
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2020-04-12
