We revisit the idea of using small line buffers in-front of caches. We propose ReCast, a tiny tag set cache that filters a significant number of tag probes to the L2 tag array thus reducing power. The key contribution in ReCast is S-Shift, a simple indexing function (no logic involved just wires) that greatly improves the utility of line buffers with no additional hardware cost. S-Shift can be viewed as a technique for emulating larger cache blocks and hence exploiting more spatial locality but without paying the penalties of actually using a larger L2 cache block. Using several SPEC CPU2000 applications and a model of an aggressive, dynamically-scheduled, superscalar processor we demonstrate that a practical ReCast organization can significantly reduce power in the L2. Specifically, a 64-entry ReCast comprising eight sub-banks of eight entries each can filter about 50% of all tag probes for a 1Mbyte L2 cache. A conventional line buffer of the same size filters only 32% of all tag probes. The resulting average reduction in L2 tag power is 38% and 85% with writeback or writethrough L1 caches respectively. This translates to a reduction of 16% or 52% of the overall L2 power respectively. We also analyze a few representative applications explaining why S-Shift works well. © 2005 IEEE.