Abstract

Future parallel computers must efficiently execute not only hand-coded applications but also programs written in high-level, parallel programming languages. Today’s machines limit these programs to a single communication paradigm, either message-passing or shared-memory,which results in uneven performanceC. Thisp apera ddresses this problem by defining an interface, Tempesr. that exposes low-level communication and memory-system mechanisms so programmers and compilers can customize policies for a given application. Typhoon is a proposed hardwarep latform that implementst hesem echanismsw ith a fully-programmable. user-level processor in the network interface. We demonstrate the utility of Tempest with two examples. First. the St&e protocol uses Tempest’s finegrain access control mechanisms to manage part of a processor’s local memory as a large, fully-associative cache for remote data. We simulated Typhoon on the Wisconsin Wind Tunnel and found that Stache running on Typhoon performs comparably (30%) to an all-hardware DirNNB cache-coherence protocol for five shared-memory programs. Second, we illustrate how programmers or compilers can use Tempest’s flexibility to exploit an application’s sharing patterns with a custom protocol. For the EM3D application, the custom protocol improves performance up to 35% over the all-hardware protocol.

Details