Scale-up Graph Processing in the Cloud: Challenges and Solutions

Processing large graphs is an important part of the big-data problem. Recently, a number of scale-up systems such as X-Stream, GraphChi and TurboGraph have been proposed for processing large graphs using secondary storage on a single machine. The design and evaluation of these systems, however, have focused on physical machines. We expect that a natural evolution of such systems is to the cloud, where a virtual machine would run the graph processing algorithm and access the graph from secondary storage remotely connected through the network. We evaluate a state-of-the-art graph processing system called X-Stream in EC2 to identify challenges in this space. Our primary finding is that the network bandwidth between a virtual machine and remote storage becomes the limiting factor for performance. We show that this bottleneck can be somewhat alleviated through the use of VM-local instance storage, network provisioning and compression.


Published in:
Proceedings of the Fourth International Workshop on Cloud Data and Platforms
Presented at:
CloudDP’14: Fourth International Workshop on Cloud Data and Platforms, Amsterdam, Netherlands, April 13-16, 2014
Year:
2014
Publisher:
ACM New York, NY, USA ©2014
ISBN:
978-1-4503-2714-5

 Record created 2014-10-14, last modified 2018-09-13
