Conference paper

Network Multicomputing Using Recoverable Distributed Shared Memory

A network multicomputer is a multiprocessor in which the processors are connected by general-purpose networking technology, in contrast to current distributed memory multiprocessors, which use a dedicated special-purpose interconnect. The advent of high-speed general-purpose networks removes the bottleneck of current slow networks and provides the impetus for a new look at the network multiprocessor model. However, major software issues remain unsolved. A convenient machine abstraction must be developed that hides low-level details, such as message passing and machine failures, from the application programmer. The authors use distributed shared memory as the programming abstraction and rollback recovery through consistent checkpointing to provide fault tolerance. Measurements of the authors' implementations of distributed shared memory and consistent checkpointing show that these abstractions can be implemented efficiently.


    • LABOS-CONF-2005-051

    Record created on 2005-10-20, modified on 2016-08-08
