Memory Systems and Interconnects for Scale-Out Servers

The information revolution of the last decade has been fueled by the digitization of almost all human activities through a wide range of Internet services. The backbone of this information age is the scale-out datacenter, which must collect, store, and process massive amounts of data. These datacenters distribute vast datasets across a large number of servers, typically as memory-resident shards, so as to maintain strict quality-of-service guarantees. While data is driving skyrocketing demand for scale-out servers, processor and memory manufacturers have reached fundamental efficiency limits and can no longer increase server energy efficiency at a sufficient pace. As a result, energy has emerged as the main obstacle to the scalability of information technology (IT), with huge economic implications. Delivering sustainable IT calls for a paradigm shift in computer system design. As memory has taken a central role in IT infrastructure, memory-centric architectures are required to fully utilize the costly investment in memory. In response, processor architects are resorting to manycore architectures to leverage the abundant request-level parallelism found in data-centric applications. Manycore processors fully utilize available memory resources, thereby increasing IT efficiency by almost an order of magnitude. Because manycore server chips execute a large number of concurrent requests, they exhibit a high incidence of accesses to the last-level cache for fetching instructions (due to large instruction footprints) and to off-chip memory for accessing dataset objects (due to a lack of temporal reuse in on-chip caches). As a result, on-chip interconnects and the memory system are emerging as major performance and energy-efficiency bottlenecks in servers. This thesis seeks to architect on-chip interconnects and memory systems that are tuned to the requirements of memory-centric scale-out servers.
By studying a wide range of data-centric applications, we uncover phenomena common to these applications and examine their implications for on-chip network and off-chip memory traffic. Finally, we propose specialized on-chip interconnects and memory systems that leverage these common traffic characteristics, thereby improving server throughput and energy efficiency.

    Keywords: cloud ; scale-out ; datacenters ; interconnects ; memory systems ; DRAM

    Thesis, École polytechnique fédérale de Lausanne (EPFL), no. 6682 (2015)
    Doctoral Program in Computer and Communication Sciences
    School of Computer and Communication Sciences
    Institute of Computer and Multimedia Systems
    Parallel Systems Architecture Laboratory
    Jury: Prof. Willy Zwaenepoel (president) ; Prof. Babak Falsafi (thesis director) ; Prof. Paolo Ienne, Prof. Rajeev Balasubramonian, Prof. Yuan Xie (examiners)

    Public defense: 2015-09-18


    Record created on 2015-09-08, modified on 2016-08-09

