SLICC: Self-Assembly of Instruction Cache Collectives for OLTP Workloads

Online transaction processing (OLTP) is at the core of many data center applications. OLTP workloads are known to have large instruction footprints that overwhelm existing L1 instruction caches, resulting in poor overall performance. Prefetching can reduce the impact of the resulting instruction cache miss stalls; however, state-of-the-art solutions require large dedicated hardware tables, on the order of 40KB in size. SLICC is a programmer-transparent, low-cost technique that minimizes instruction cache misses when executing OLTP workloads. SLICC migrates threads, spreading their instruction footprint over several L1 caches. It exploits repetition within and across transactions: a transaction's first iteration effectively prefetches the instructions for its subsequent iterations and for similar subsequent transactions. SLICC reduces instruction misses by 56% on average for TPC-C and TPC-E, thereby improving performance by 68%. Compared to a state-of-the-art prefetcher, and despite that prefetcher's larger storage overhead (42× that of SLICC), SLICC delivers 21% higher performance for TPC-E and comes within 2% for TPC-C.
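
The intuition behind spreading an instruction footprint over several L1 caches can be illustrated with a toy simulation. The sketch below is not the mechanism from the paper: the block-granular trace, the LRU model, the cache sizes, and the migration rule (move to whichever cache already holds the block, otherwise to a cache with free space) are all simplifying assumptions, and names such as `run_with_migration` are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative instruction cache with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def __contains__(self, block):
        return block in self.blocks

    def is_full(self):
        return len(self.blocks) >= self.capacity

    def access(self, block):
        """Return True on a hit; on a miss, fill the block and return False."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        if self.is_full():
            self.blocks.popitem(last=False)  # evict the least recently used block
        self.blocks[block] = None
        return False


def run_single_core(trace, capacity):
    """Baseline: the thread stays on one core with one L1 instruction cache."""
    cache = LRUCache(capacity)
    return sum(not cache.access(block) for block in trace)


def run_with_migration(trace, capacity, num_cores):
    """Assumed policy: follow the code to a cache that holds it, else spill
    to a cache with free space once the current one is full."""
    caches = [LRUCache(capacity) for _ in range(num_cores)]
    current = 0
    misses = 0
    for block in trace:
        holder = next((i for i, c in enumerate(caches) if block in c), None)
        if holder is not None:
            current = holder  # migrate to the core that already caches the block
        elif caches[current].is_full():
            spare = next((i for i, c in enumerate(caches) if not c.is_full()), None)
            if spare is not None:
                current = spare  # migrate instead of evicting useful code
        if not caches[current].access(block):
            misses += 1
    return misses


# Hypothetical sizes: the footprint is four times one cache's capacity.
CAPACITY = 8
NUM_CORES = 4
FOOTPRINT = CAPACITY * NUM_CORES
TRANSACTIONS = 5
trace = list(range(FOOTPRINT)) * TRANSACTIONS  # same code path, repeated

baseline_misses = run_single_core(trace, CAPACITY)
migration_misses = run_with_migration(trace, CAPACITY, NUM_CORES)
print(baseline_misses, migration_misses)
```

With these assumed sizes, the single-cache run misses on all 160 accesses because the cyclic footprint thrashes the LRU cache, while the migrating run incurs only the 32 cold misses of the first transaction. The model ignores migration cost and multiple concurrent threads, both of which matter in a real system.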


Published in: Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture
Presented at: The 45th Annual IEEE/ACM International Symposium on Microarchitecture, Vancouver, BC, Canada, December 1-5, 2012
Year: 2012
Note: SYSTEMS PUBLICATION_SHORE_MT
 Record created 2012-09-12, last modified 2018-09-13
