Infoscience

doctoral thesis

Practical erasure codes for storage systems: The study of entanglement codes, an approach that propagates redundancy to increase reliability and performance

Estrada-Galiñanes, Vero  
2017

This dissertation deals with the design of practical erasure codes for storage systems. Hardware and logical disk failures are a common source of system failures that may lead to data loss. Nevertheless, spinning disks are predicted to remain the standard storage medium in large datacenters. Cloud storage needs efficient codes to become reliable despite its low-cost components. As systems scale in size and complexity, their properties and requirements may change. When data ages, it is usually moved to dedicated archives. Yet the boundaries between storage systems and archives are blurring as we move toward applications that require low-latency access, such as mining data from large scientific archives. Moreover, the centralized approach of cloud backup services raises privacy and economic concerns. Some studies suggest that cooperative peer-to-peer networks are more sustainable in the long term. But peer-to-peer nodes and spinning disks share an undesirable property: both are unreliable.
The motivation for this study is to design flexible and practical codes that provide high fault tolerance to improve data durability and availability even in catastrophic scenarios. Survivability comes from the strength built with redundancy. It is difficult to devise a solution based on classic codes that considers all aspects of dependability: availability, reliability, safety, integrity and maintainability. Compromises are generally found through a complex combination of many techniques. This thesis argues that codes based exclusively on parallel networks (such as replication) or mainly on serial networks (as seen in the split and expand operations behind classic erasure codes) do not leverage all the resources available in a system. Entanglement codes create redundancy by tangling new data blocks with old ones, building entangled data chains that are woven into a growing mesh of interdependent content.
We propose: 1) open and closed entanglements as more reliable alternatives to mirroring, 2) alpha entanglements to achieve extremely high fault tolerance with low storage overhead and low repair costs, and 3) spigot codes to reduce the space footprint of entangled data without significant loss of the entanglement's properties. These codes can leverage storage and bandwidth resources efficiently by exploiting the combinatorial power of network reliability. Furthermore, their flexible design, based on virtual chains of entangled data, yields a scalable solution that can accommodate future requirements. Finally, the combinatorial power of entangled data boosts overall dependability.
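The idea of tangling new data blocks with old ones can be illustrated with a minimal sketch of a single open chain. This is not the thesis's implementation. It assumes that each parity is the bytewise XOR of the new data block with the previous parity, and that the chain starts from an all-zero block. The names `entangle`, `repair_data`, and `BLOCK_SIZE` are hypothetical and chosen only for this illustration.

```python
from typing import List

BLOCK_SIZE = 4  # hypothetical block size in bytes, for illustration only


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def entangle(data_blocks: List[bytes]) -> List[bytes]:
    """Build one chain where parity[i] = data[i] XOR parity[i-1].

    Assumption: the chain is seeded with an all-zero block.
    """
    parities = []
    previous = bytes(BLOCK_SIZE)  # all-zero seed (assumed)
    for block in data_blocks:
        previous = xor_blocks(block, previous)
        parities.append(previous)
    return parities


def repair_data(index: int, parities: List[bytes]) -> bytes:
    """Rebuild data[index] from the two parities adjacent to it."""
    left = parities[index - 1] if index > 0 else bytes(BLOCK_SIZE)
    return xor_blocks(left, parities[index])


data = [b"aaaa", b"bbbb", b"cccc"]
parities = entangle(data)
recovered = repair_data(1, parities)  # rebuilt without reading data[1]
```

In this sketch a lost data block is rebuilt from only the two parities next to it, and each new block extends the chain rather than forming an isolated group, which is one way to read the "propagating redundancy" in the abstract.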

Type
doctoral thesis
Author(s)
Estrada-Galiñanes, Vero  
Advisors
Felber, Pascal
Jury
Peter Kropf, Pascal Felber, Patrick T. Eugster, Ethan L. Miller
Date Issued
2017
Publisher
University of Neuchâtel
Total pages
191
Subjects
erasure codes • data entanglement • redundancy • replication • combinatorics • network reliability • fault tolerance • survivability • availability • scalable systems • distributed systems • storage systems • long-term storage • archival storage
Note
PDF file with 191 pages
EPFL units
DEDIS  
Available on Infoscience
May 5, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/207798
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved