Efficient Indexing and Query Processing of Model-View Sensor Data in the Cloud
As the number of sensors that pervade our lives increases (e.g., environmental sensors and phone sensors), the efficient management of massive amounts of sensor data is becoming increasingly important. The unbounded nature of sensor data poses a serious challenge for query processing, even in a cloud infrastructure. Traditional raw sensor data management systems based on relational databases lack the scalability to accommodate large-scale sensor data efficiently. Thus, distributed key-value stores in the cloud are becoming a prime tool for managing sensor data. Model-view sensor data management, which stores the sensor data in the form of modeled segments, brings the additional advantages of data compression and value interpolation. However, there are currently no techniques for indexing or query optimization of model-view sensor data in the cloud; in the worst case, query processing requires a full table scan. In this paper, we propose an innovative index for modeled segments in key-value stores, namely the KVI-index. The KVI-index consists of two interval indices, one on the time dimension and one on the sensor value dimension, each of which has an in-memory search tree and a secondary list materialized in the key-value store. We then introduce a KVI-index–Scan–MapReduce hybrid approach to perform efficient query processing on modeled data streams. As shown by a series of experiments on a private cloud infrastructure, our approach outperforms both Hadoop-based parallel processing of the raw sensor data and multiple alternative indexing approaches for model-view data in query response time and index-update efficiency.
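To make the idea of indexing modeled segments on two dimensions concrete, the following is a minimal illustrative sketch, not the KVI-index itself. It assumes that each segment is a linear model over a time interval, and it substitutes a simple fixed-width bucketing scheme for the interval index: a sorted list of bucket keys stands in for the in-memory search tree, and a Python dictionary stands in for the lists materialized in the key-value store. All names (`Segment`, `BucketIntervalIndex`, `build_indices`) and the bucket widths are hypothetical.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One modeled segment: a linear model assumed valid over [t_start, t_end]."""
    seg_id: str
    t_start: float
    t_end: float
    slope: float
    intercept: float

    def value_at(self, t):
        # Value interpolation: evaluate the model instead of storing raw points.
        return self.slope * t + self.intercept

    def value_range(self):
        # A linear model attains its extremes at the interval endpoints.
        a, b = self.value_at(self.t_start), self.value_at(self.t_end)
        return (min(a, b), max(a, b))


class BucketIntervalIndex:
    """Hypothetical stand-in for one interval index (not the paper's structure)."""

    def __init__(self, width):
        self.width = width
        self.bucket_keys = []           # sorted keys: stand-in for the in-memory tree
        self.store = defaultdict(list)  # dict: stand-in for the key-value store

    def _bucket(self, x):
        return int(x // self.width)

    def insert(self, lo, hi, seg):
        # Register the interval under every bucket it overlaps.
        for b in range(self._bucket(lo), self._bucket(hi) + 1):
            if b not in self.store:
                insort(self.bucket_keys, b)
            self.store[b].append((lo, hi, seg))

    def query(self, lo, hi):
        # Locate candidate buckets in memory, then filter their stored lists.
        i = bisect_left(self.bucket_keys, self._bucket(lo))
        j = bisect_right(self.bucket_keys, self._bucket(hi))
        hits = set()
        for b in self.bucket_keys[i:j]:
            for s_lo, s_hi, seg in self.store[b]:
                if s_lo <= hi and lo <= s_hi:  # closed-interval overlap test
                    hits.add(seg.seg_id)
        return sorted(hits)


def build_indices(segments, time_width=10.0, value_width=5.0):
    """Build one index per dimension: time and sensor value."""
    time_index = BucketIntervalIndex(time_width)
    value_index = BucketIntervalIndex(value_width)
    for seg in segments:
        time_index.insert(seg.t_start, seg.t_end, seg)
        v_lo, v_hi = seg.value_range()
        value_index.insert(v_lo, v_hi, seg)
    return time_index, value_index


segments = [
    Segment("s1", 0.0, 10.0, 1.0, 0.0),
    Segment("s2", 10.0, 25.0, -0.5, 15.0),
    Segment("s3", 25.0, 40.0, 2.0, -47.5),
]
time_index, value_index = build_indices(segments)
print(time_index.query(12.0, 20.0))   # segments overlapping a time range
print(value_index.query(20.0, 30.0))  # segments overlapping a value range
```

In this sketch, a time-range or value-range query touches only the buckets that overlap the query range rather than scanning every stored segment, which is the kind of saving an interval index is meant to provide. A real design would also need to bound how many buckets a long segment is registered under, which this simplification ignores.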