Transparent scalable database replication

Elnikety, Sameh Mohamed

doi:10.5075/epfl-thesis-3925

2008

Abstract

This thesis demonstrates that, despite earlier projections to the contrary, database replication on a cluster of low-cost commodity servers can transparently scale database performance. In a replicated system, providing the standard transactional properties – atomicity, consistency, isolation and durability (ACID) – in an efficient and scalable manner is a challenging task. The problem stems from updates, which have to be synchronized, propagated and reflected at other replicas. Reflecting the updates of one replica at other replicas often interacts poorly with single-machine implementations of the ACID properties. Moreover, because databases are usually part of complex IT infrastructures, replication should ideally be transparent to both clients and database servers, and should therefore be implemented as middleware. This thesis proposes a concurrency control model called generalized snapshot isolation, which is used to replicate snapshot-isolated databases in a transparent and scalable manner. We build a prototype system, Tashkent, and assess its performance experimentally. We develop an analytical model to predict replicated database performance using measurements from a standalone database. We introduce a series of optimizations: uniting ordering with durability, memory-aware load balancing and update filtering. The replication functionality, including these optimizations, is implemented in middleware, making it attractive for deployment in real systems. Tashkent exhibits superlinear speedup: experimental results demonstrate that running the TPC-W ordering mix on a 16-replica cluster provides 37 times the peak throughput of a standalone database.

Details

Title Transparent scalable database replication

Author(s) Elnikety, Sameh Mohamed

Advisor(s) Zwaenepoel, Willy

Pagination 126

Date 2008

Publisher Lausanne, EPFL

Keywords concurrency control; distributed database systems; experimental systems

Language English

DOI https://doi.org/10.5075/epfl-thesis-3925

Other identifier(s) urn: urn:nbn:ch:bel-epfl-thesis3925-8

Laboratories LABOS

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > LABOS - Operating Systems Laboratory
Scientific production and competences > EPFL Theses
Work produced at EPFL
Published
Theses

Record creation date 2007-09-03
