Wenisch, Thomas F.Somogyi, StephenHardavellas, NikosKim, JangwooAilamaki, AnastassiaFalsafi, Babak2007-10-162007-10-162007-10-16200510.1109/ISCA.2005.50https://infoscience.epfl.ch/handle/20.500.14299/12994Coherent read misses in shared-memory multiprocessors account for a substantial fraction of execution time in many important scientific and commercial workloads. We propose Temporal Streaming, to eliminate coherent read misses by streaming data to a processor in advance of the corresponding memory accesses. Temporal streaming dynamically identifies address sequences to be streamed by exploiting two common phenomena in shared-memory access patterns: (1) temporal address correlation — groups of shared addresses tend to be accessed together and in the same order, and (2) temporal stream locality — recently- accessed address streams are likely to recur. We present a practical design for temporal streaming. We evaluate our design using a combination of trace-driven and cycle- accurate full-system simulation of a cache-coherent distributed shared-memory system. We show that temporal streaming can eliminate 98% of coherent read misses in scientific applications, and between 43% and 60% in database and web server workloads. Our design yields speedups of 1.07 to 3.29 in scientific applications, and 1.06 to 1.21 in commercial workloads.Temporal Streaming of Shared Memorytext::conference output::conference proceedings::conference paper