Authors: Gupta, Siddharth; Oh, Yunho; Yan, Lei; Sutherland, Mark Johnathon; Bhattacharjee, Abhishek; Falsafi, Babak; Hsu, Peter
Date: 2022-12-02
Year: 2023
DOI: 10.1109/HPCA56546.2023.10070955
URL: https://infoscience.epfl.ch/handle/20.500.14299/192870
Abstract: Modern datacenters host datasets in DRAM to offer large-scale online services with tight tail-latency requirements. Unfortunately, as DRAM is expensive and increasingly difficult to scale, datacenter operators are forced to consider denser storage technologies. Modern flash-based storage exhibits µs-scale access latency, which is well within the tail-latency constraints of many online services. However, the traditional demand-paging abstraction used to manage memory and storage incurs high overheads and prohibits flash usage in online services. We introduce AstriFlash, a hardware-software co-design that tightly integrates flash and DRAM with ns-scale overheads. Our evaluation of server workloads with cycle-accurate full-system simulation shows that AstriFlash achieves 95% of a DRAM-only system's throughput while maintaining the required 99th-percentile tail latency and reducing the memory cost by 20x.
Keywords: memory hierarchy; switch-on-miss architecture; NAND flash
Title: AstriFlash: A Flash-Based System for Online Services
Type: text::conference output::conference paper not in proceedings