Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Journal articles
  4. How Many CPU Cores is an FPGA Worth? Lessons Learned from Accelerating String Sorting on a CPU-FPGA System
 
research article

How Many CPU Cores is an FPGA Worth? Lessons Learned from Accelerating String Sorting on a CPU-FPGA System

Asiatici, Mikhail  
•
Maiorano, Damian
•
Ienne, Paolo  
September 23, 2021
Journal Of Signal Processing Systems For Signal Image And Video Technology

String sorting is a fundamental kernel of string matching and database index construction; yet, it has not been studied as extensively as fixed-length keys sorting. Because processing variable-length keys in hardware is challenging, it is no surprise that no hardware-accelerated string sorters have been proposed yet. In this paper, we present Parallel Hybrid Super Scalar String Sample Sort (pHS(5)) on Intel HARPv2, a heterogeneous CPU-FPGA system with a server-grade CPU. Our pHS(5) extends pS(5), the state-of-the-art string sorting algorithm for multi-core shared memory CPUs, by adding multiple processing elements (PEs) on the FPGA. Each PE accelerates one instance of the most effectively parallelizable among the dominant kernels of pS(5) by up to 33% compared to a single Intel Xeon Broadwell core despite a clock frequency that is 17 times slower. Furthermore, we extended the job scheduling mechanism of pS(5) to schedule the accelerable kernel not only among available CPU cores but also on our PEs, while retaining the complex high-level control flow and the sorting of the smaller data sets on the CPU. Overall, we accelerate the entire algorithm by up to 10% with respect to the 28-thread software baseline running on the Xeon processor and by up to 36% at lower thread counts. Finally, we generalize our results assuming pS(5) as representative of software that is heavily optimized for modern multi-core CPUs and investigate the performance and energy advantage that an FPGA in a datacenter setting can offer to regular RTL users compared to additional CPU cores.

  • Files
  • Details
  • Metrics
Loading...
Thumbnail Image
Name

s11265-021-01686-8.pdf

Type

Publisher's Version

Version

http://purl.org/coar/version/c_970fb48d4fbd8a85

Access type

openaccess

License Condition

CC BY

Size

1.38 MB

Format

Adobe PDF

Checksum (MD5)

6657546fbf29826e3a14262f82e26951

Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés