Repository logo

Infoscience

  • English
  • French
Log In
Logo EPFL, École polytechnique fédérale de Lausanne

Infoscience

  • English
  • French
Log In
  1. Home
  2. Academic and Research Output
  3. Conferences, Workshops, Symposiums, and Seminars
  4. Chaos: Scale-out Graph Processing from Secondary Storage
 
conference paper not in proceedings

Chaos: Scale-out Graph Processing from Secondary Storage

Roy, Amitabha  
•
Bindschaedler, Laurent  
•
Malicevic, Jasmina  
Show more
2015
25th Symposium on Operating Systems Principles

Chaos scales graph processing from secondary storage to multiple machines in a cluster. Earlier systems that process graphs from secondary storage are restricted to a single ma- chine, and therefore limited by the bandwidth and capacity of the storage system on a single machine. Chaos is limited only by the aggregate bandwidth and capacity of all storage devices in the entire cluster. Chaos builds on the streaming partitions introduced by X-Stream in order to achieve sequential access to storage, but parallelizes the execution of streaming partitions. Chaos is novel in three ways. First, Chaos partitions for sequential storage access, rather than for locality and load balance, re- sulting in much lower pre-processing times. Second, Chaos distributes graph data uniformly randomly across the clus- ter and does not attempt to achieve locality, based on the observation that in a small cluster network bandwidth far outstrips storage bandwidth. Third, Chaos uses work steal- ing to allow multiple machines to work on a single partition, thereby achieving load balance at runtime. In terms of performance scaling, on 32 machines Chaos takes on average only 1.66 times longer to process a graph 32 times larger than on a single machine. In terms of capacity scaling, Chaos is capable of handling a graph with 1 trillion edges representing 16 TB of input data, a new milestone for graph processing capacity on a small commodity cluster.

  • Files
  • Details
  • Metrics
Loading...
Thumbnail Image
Name

Chaos_sosp2015.pdf

Type

Publisher's Version

Version

Published version

Access type

openaccess

Size

229.13 KB

Format

Adobe PDF

Checksum (MD5)

c8bee3b701d686282e9429c1e1a266ba

Logo EPFL, École polytechnique fédérale de Lausanne
  • Contact
  • infoscience@epfl.ch

  • Follow us on Facebook
  • Follow us on Instagram
  • Follow us on LinkedIn
  • Follow us on X
  • Follow us on Youtube
AccessibilityLegal noticePrivacy policyCookie settingsEnd User AgreementGet helpFeedback

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, tous droits réservés