In cloud storage systems, hot data is usually replicated over multiple disks/servers in order to accommodate simultaneous access by multiple users as well as increase the fault tolerance of the system. Recent cloud storage research has proposed using availability codes, which is a special class of erasure codes, as a more storage-efficient way to store hot data. These codes enable data recovery from multiple, small disjoint groups of servers. The number of the recovery groups is referred to as the availability and the size of each group as the locality of the code. Up till now, we have very limited knowledge on how code locality and availability affect data access time. Data download from these systems involves multiple fork-join queues operating in-parallel, making the analysis of access time a very challenging problem. In this paper, we present an analysis of average data access time in storage systems employing simplex codes, which are an important, in certain sense optimal, class of availability codes. We generalize the analysis for codes with locality 2 and any degree of availability. Specifically, using a queueing theoretic approach, we derive bounds and approximations on the average response time for two different Poisson request arrival models. We also compare two scheduling strategies for reduced access time and load balancing.2
main.pdf
openaccess
1.21 MB
Adobe PDF
12aa0e5c4ec1aef8bb7c7b8ac2e029c4