Abstract

Participatory sensing has become a new data collection paradigm that leverages the wisdom of the crowd for big data applications without spending cost to buy dedicated sensors. It collects data from human sensors by using their own devices such as cell phone accelerometers, cameras, and GPS devices. This benefit comes with a drawback: human sensors are arbitrary and inherently uncertain due to the lack of quality guarantee. Moreover, participatory sensing data are time series that exhibit not only highly irregular dependencies on time but also high variance between sensors. To overcome these limitations, we formulate the problem of validating uncertain time series collected by participatory sensors. In this article, we approach the problem by an iterative validation process on top of a probabilistic time series model. First. we generate a series of probability distributions from raw data by tailoring a state-of-the-art dynamical model, namely Generalised Auto Regressive Conditional Heteroskedasticity (GARCH), for our joint time series setting. Second, we design a feedback process that consists of an adaptive aggregation model to unify the joint probabilistic time series and an efficient user guidance model to validate aggregated data with minimal effort. Through extensive experimentation, we demonstrate the efficiency and effectiveness of our approach on both real data and synthetic data. Highlights from our experiences include the fast running time of a probabilistic model, the robustness of an aggregation model to outliers, and the significant effort saving of a guidance model.

Details