Why STM can be more than a Research Toy
Software Transactional Memory (STM) promises to simplify concurrent programming without requiring specific hardware support. Yet STM's credibility rests on the extent to which it enables programs to leverage multicores and outperform sequential code. A recent CACM paper questioned this ability and suggested confining STM to the status of a research toy. We revisit these conclusions through the most extensive comparison to date of STM performance against sequential code. We evaluate a state-of-the-art STM system, SwissTM, on a wide range of benchmarks and two different multicore systems. We dissect the inherent costs of synchronization as well as the overheads of compiler instrumentation and transparent privatization. Our results show that an STM with manually instrumented benchmarks and explicit privatization outperforms sequential code by up to 29 times on SPARC with 64 concurrent threads and by up to 9 times on x86 with 16 concurrent threads. The overheads of compiler instrumentation and transparent privatization are indeed substantial, yet they do not prevent STM from generally outperforming sequential code.