000114767 001__ 114767
000114767 005__ 20190619003135.0
000114767 0247_ $$2doi$$a10.5075/epfl-thesis-3999
000114767 02470 $$2urn$$aurn:nbn:ch:bel-epfl-thesis3999-8
000114767 02471 $$2nebis$$a5472023
000114767 037__ $$aTHESIS
000114767 041__ $$aeng
000114767 088__ $$a3999
000114767 245__ $$aThe complexity of reliable distributed storage
000114767 269__ $$a2008
000114767 260__ $$bEPFL$$c2008$$aLausanne
000114767 300__ $$a131
000114767 336__ $$aTheses
000114767 502__ $$aDejan Kostic, Fernando Pedone, Luis E.T Rodrigues
000114767 520__ $$aDistributed storage systems based on commodity hardware have gained in popularity because they are cheaper, can be made more reliable and offer better scalability than centralized storage systems. However, implementing and managing such systems is more complicated due to their distributed nature. This thesis presents efficient and reliable distributed storage that can be read and written by any number of clients. The focus is on atomic storage, which guarantees that from the clients' perspective, the distributed storage behaves exactly like a centralized one. Three key complexity metrics – time, number of logs and throughput – are considered. For each metric, precise performance bounds together with matching algorithms are provided. Experimental results are used to confirm the theoretical performance analysis wherever necessary. Time-complexity is an indication of the latency of read and write operations, i.e. the time between a client's invocation of an operation and the response of the storage. This thesis presents optimal fast atomic storage implementations, namely, implementations that complete both reads and writes in 1 round-trip between the client and the servers. Interestingly, the existence of fast implementations depends on the maximum number of clients that can read from the storage. More precisely, it is shown that a fast implementation is possible if and only if the number of readers is less than n/f – 2, with n servers out of which f can fail. Furthermore, it is shown that fast implementations are impossible for multiple writers if servers can fail. Log-complexity is an indication of the number of stable storage (hard disk) accesses needed in every read or write operation. Stable storage is used to log data in order to prevent data loss after a crash, in a context where servers can crash and recover. This thesis revises the notion of atomicity for this context, determines a lower bound on log-complexity and introduces an atomic storage matching this bound. The optimality of the storage is also established in terms of resilience, as well as time-complexity. Throughput measures the average number of client requests that can be completed per time unit. In order for a distributed storage to serve a high number of clients concurrently, high throughput is required. This thesis introduces an atomic storage that provides optimal read throughput for homogeneous clusters of servers. The storage organizes servers around a ring and uses only point-to-point communication. It is resilient to the crash failure of any number of readers and writers as well as to the crash failure of all but one server. The storage was evaluated on a cluster of 24 nodes. The same storage is modified to solve the more general uniform total order broadcast problem, which can be used to replicate any application reliably. Thus, the first uniform total order broadcast algorithm that is throughput optimal, regardless of message broadcast patterns, is introduced. The algorithm is based on a ring topology, only relies on point-to-point inter-server communication, and has a linear latency with respect to the number of processes. The implementation was benchmarked against two of the most widely used group communication packages and the results confirm that the algorithm is indeed throughput optimal.
000114767 6531_ $$adistributed storage
000114767 6531_ $$aatomic storage
000114767 6531_ $$acrash-recovery
000114767 6531_ $$atotal order
000114767 6531_ $$acomplexity
000114767 6531_ $$athroughput
000114767 6531_ $$asystème de stockage réparti
000114767 6531_ $$asémantique atomique
000114767 6531_ $$ajournalisation
000114767 6531_ $$aordre total
000114767 6531_ $$adébit
000114767 6531_ $$acomplexité
000114767 700__ $$0(EPFLAUTH)114144$$g114144$$aLevy, Ron
000114767 720_2 $$aGuerraoui, Rachid$$edir.$$g105326$$0240335
000114767 8564_ $$uhttps://infoscience.epfl.ch/record/114767/files/EPFL_TH3999.pdf$$zTexte intégral / Full text$$s904180$$yTexte intégral / Full text
000114767 909C0 $$xU10407$$0252114$$pDCL
000114767 909CO $$pthesis-public$$pDOI$$pIC$$ooai:infoscience.tind.io:114767$$qGLOBAL_SET$$qDOI2$$pthesis$$pthesis-bn2018
000114767 918__ $$dEDHP$$bIC-SSC$$cIIF$$aIC
000114767 919__ $$aLPD
000114767 920__ $$b2008$$a2008-3-7
000114767 970__ $$a3999/THESES
000114767 973__ $$sPUBLISHED$$aEPFL
000114767 980__ $$aTHESIS