Large-Scale Graph Processing on FPGAs with Caches for Thousands of Simultaneous Misses
Efficient large-scale graph processing is crucial to many disciplines. Yet, while graph algorithms naturally expose massive parallelism opportunities, their performance is limited by the memory system because of irregular memory accesses. State-of-the-art FPGA graph processors, such as ForeGraph and FabGraph, address the memory issues by using scratchpads and regularly streaming edges from DRAM, but they then end up wasting bandwidth on unneeded data. However, where classic caches and scratchpads fail to deliver, FPGAs make powerful unorthodox solutions possible. In this paper, we resort to extreme nonblocking caches that handle tens of thousands of outstanding read misses. They significantly increase the ability of memory systems to coalesce multiple accelerator accesses into fewer DRAM memory requests; essentially, when latency is not the primary concern, they bring the advantages expected from a very large cache at a fraction of the cost. We prove our point with an adaptable graph accelerator running on Amazon AWS F1; our implementation takes into account all practical aspects of such a design, including the challenges involved when working with modern multidie FPGAs. Running classic algorithms (PageRank, SCC, and SSSP) on large graphs, we achieve a 3x geometric-mean speedup compared to state-of-the-art FPGA accelerators, 1.1-5.8x higher bandwidth efficiency and 3.0-15.3x better power efficiency than multicore CPUs, and we support much larger graphs than the state-of-the-art on GPUs.
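The abstract's key mechanism is miss coalescing: a nonblocking cache tracks outstanding misses (e.g., in miss status holding registers, MSHRs) so that many accelerator accesses to the same cache line trigger only one DRAM request. The following is a minimal behavioral sketch of that idea, not the paper's hardware design; the class and parameter names are hypothetical.

```python
# Hypothetical sketch (not the paper's RTL): a nonblocking cache that
# coalesces outstanding misses to the same line into one DRAM request
# by tracking them in MSHR-like entries.

class NonblockingCache:
    def __init__(self, line_size=64):
        self.line_size = line_size
        self.lines = set()        # tags of lines currently resident
        self.mshrs = {}           # tag -> accesses waiting on that line
        self.dram_requests = 0    # DRAM traffic actually issued

    def access(self, addr):
        tag = addr // self.line_size
        if tag in self.lines:
            return "hit"
        if tag in self.mshrs:
            # Secondary miss: same line already in flight, so just
            # record the waiter -- no new DRAM request is issued.
            self.mshrs[tag].append(addr)
            return "coalesced"
        # Primary miss: allocate an MSHR entry and go to DRAM.
        self.mshrs[tag] = [addr]
        self.dram_requests += 1
        return "miss"

    def fill(self, tag):
        # DRAM response arrives: line becomes resident and every
        # coalesced waiter can now be served.
        self.lines.add(tag)
        return self.mshrs.pop(tag, [])

cache = NonblockingCache()
# 10,000 irregular accesses spread over 100 distinct lines:
# only 100 DRAM requests are issued, the rest are coalesced.
for i in range(10_000):
    cache.access((i % 100) * 64 + (i % 8))
print(cache.dram_requests)  # 100
```

With thousands of such entries, latency-tolerant accelerators can keep issuing accesses while misses are in flight, which is the bandwidth-efficiency effect the abstract describes.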
WOS:000702275600045
2021-01-01
978-1-6654-3333-4
Los Alamitos
Conference Proceedings Annual International Symposium on Computer Architecture
609-622
REVIEWED
Event name | Event place | Event date
Annual International Symposium on Computer Architecture | ELECTR NETWORK | Jun 14-19, 2021