Distributed shared memory 8DSM) is an abstraction of shared memory on a distributed memory machine. Hardware DSM systems support this abstraction at the architecture level; software DSM systems support the abstraction within the runtime system. One of the key problems in building an efficient software DSM system is to reduce the amount of communication needed to keep the distributed memories consistent. In this paper we present four techniques for doing so: 1) software release consistency; 2) multiple consistency protocols; (3) write-shared protocols; and (4) an update-with-timeout mechanism. These techniques have been implemented in the Munin DSM system. We compare the performance of seven Munin application programs, first to their performance when implemented using message passing, and then to their performance when running on a conventional software DSM system that does not embody the above techniques. On a 16-processor cluster of workstations, Munin’s performance is within 5% of message passing for four out of the seven applications. For the other three, performance is within 29% to 33%. Detailed analysis of two of these three applications indicates that the addition of a function shipping capability would bring their performance to within 7% of the message passing performance. Compared to a conventional DSM system, Munin achieves performance improvements ranging from a few to several hundred percent, depending on the application.