Gem5-X: A Gem5-Based System Level Simulation Framework to Optimize Many-Core Platforms

The rapid expansion of online services demands novel energy- and performance-efficient architectures that meet power and latency constraints. Fast architectural exploration has become a key enabler of architectural innovation. In this paper, we present gem5-X, a gem5-based system-level simulation framework, and a methodology to optimize many-core systems for performance and power. As real-life case studies of many-core server workloads, we use real-time video transcoding and image classification using convolutional neural networks (CNNs). Gem5-X allows us to identify bottlenecks and evaluate the potential benefits of architectural extensions such as in-cache computing and 3D-stacked High Bandwidth Memory (HBM). For real-time video transcoding, we achieve a 15% speed-up using in-order cores with in-cache computing when compared to a baseline in-order system, and 76% energy savings when compared to an out-of-order system. When using HBM, we further accelerate real-time transcoding and CNNs by up to 7% and 8%, respectively.

Published in:
Proceedings of the 27th High Performance Computing Symposium (HPC 2019)
Presented at:
27th High Performance Computing Symposium (HPC 2019), SpringSim'19, Tucson, Arizona, USA, April 29 - May 2, 2019
Apr 29 2019

 Record created 2019-02-27, last modified 2019-08-12
