Scale-Out Processors

Lotfi-Kamran, Pejman; Grot, Boris; Ferdman, Michael; Volos, Stavros; Kocberber, Onur; Picorel, Javier; Adileh, Almutaz; Jevdjic, Djordje; Idgunji, Sachin; Ozer, Emre; Falsafi, Babak

doi:10.1145/2366231.2337217

2012

Abstract

Scale-out datacenters mandate high per-server throughput to get the maximum benefit from the large TCO investment. Emerging applications (e.g., data serving and web search) that run in these datacenters operate on vast datasets that are not accommodated by on-die caches of existing server chips. Large caches reduce the die area available for cores and lower performance through long access latency when instructions are fetched. Performance on scale-out workloads is maximized through a modestly-sized last-level cache that captures the instruction footprint at the lowest possible access latency. In this work, we introduce a methodology for designing scalable and efficient scale-out server processors. Based on a metric of performance density, we facilitate the design of optimal multi-core configurations, called pods. Each pod is a complete server that tightly couples a number of cores to a small last-level cache using a fast interconnect. Replicating the pod to fill the die area yields processors that have optimal performance density, leading to maximum per-chip throughput. Moreover, as each pod is a stand-alone server, scale-out processors avoid the expense of global (i.e., inter-pod) interconnect and coherence. These features synergistically maximize throughput, lower design complexity, and improve technology scalability. In 20nm technology, scale-out chips improve throughput by 5x-6.5x over conventional and by 1.6x-1.9x over emerging tiled organizations.

Details

Title Scale-Out Processors

Author(s) Lotfi-Kamran, Pejman; Grot, Boris; Ferdman, Michael; Volos, Stavros; Kocberber, Onur; Picorel, Javier; Adileh, Almutaz; Jevdjic, Djordje; Idgunji, Sachin; Ozer, Emre; Falsafi, Babak

Published in Proceedings of the 39th Annual International Symposium on Computer Architecture

Pagination 12

Series Conference Proceedings Annual International Symposium on Computer Architecture

Conference 39th Annual International Symposium on Computer Architecture, Portland, Oregon, USA, June 9-13, 2012

Date 2012

Publisher New York, IEEE

ISBN 978-1-4503-1642-2

Keywords

Scale-Out Processors; Scale-Out Workloads; Processor Efficiency; Performance Density; Datacenters

DOI https://doi.org/10.1145/2366231.2337217

Laboratories PARSA

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > PARSA - Parallel Systems Architecture Laboratory
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2012-04-20
