CoShare: A Cost-effective Data Sharing System for Data Center Networks
Numerous research groups and other organizations collect data from popular data sources such as online social networks. This leads to the problem of data islands, wherein all this data is isolated and lying idly, without any use to the community at large. Using existing centralized solutions such as Dropbox to replicate data to all interested parties is prohibitively costly, given the large size of datasets. A practical solution is to use a Peer-to-Peer (P2P) approach to replicate data in a self-organized manner. However, existing P2P approaches focus on minimizing downloading time without taking into account the bandwidth cost. In this paper, we present CoShare, a P2P inspired decentralized cost effective sharing system for data replication. CoShare allows users to specify their requirements on data sharing tasks and maps these requirements into resource requirements for data transfer. Through extensive simulations, we demonstrate that CoShare finds the desirable tradeoffs for a given cost and performance while varying user requirements and request arrival rates.