Mining, Modeling and Predicting Mobility

Mobility is a central aspect of our life, and our movements reveal much more about us than simply our whereabouts. In this thesis, we study mobility from three different perspectives: the modeling perspective, the information-theoretic perspective, and the data mining perspective.

For the modeling perspective, we represent mobility as a probabilistic process described by both observable and latent variables, and we formally introduce the notion of individual and collective dimensions in mobility models. Ideally, we should take advantage of both dimensions to learn accurate mobility models, but the nature of the data might limit us. We take a data-driven approach to study three scenarios, which differ in the nature of the mobility data, and present, for each scenario, a mobility model that is tailored to it. The first scenario is individual-specific: we have mobility data about individuals but are unable to cross-reference data across them. In the second scenario, we introduce the collective model, which we use to overcome the sparsity of individual traces and for which we assume that individuals in the same group exhibit similar mobility patterns. Finally, we present the ideal scenario, in which we can take advantage of both the individual and collective dimensions and analyze collective mobility patterns in order to create individual models.

In the second part of the thesis, we take an information-theoretic approach in order to quantify mobility uncertainty and its evolution with location updates. We discretize the user’s world to obtain a map that we represent as a mobility graph. We model mobility as a random walk on this graph (equivalent to a Markov chain) and quantify trajectory uncertainty as the entropy of the distribution over possible trajectories. In this setting, a location update amounts to conditioning on a particular state of the Markov chain, which requires the computation of the entropy of conditional Markov trajectories. Our main result enables us to compute this entropy through a transformation of the original Markov chain. We apply our framework to real-world mobility datasets and show that the influence of intermediate locations on trajectory entropy depends on the nature of these locations. We build on this finding and design a segmentation algorithm that uncovers intermediate destinations along a trajectory.

The final perspective from which we analyze mobility is the data mining perspective: we go beyond simple mobility and analyze geo-tagged data that is generated by online social media and that describes the whole user experience. We postulate that mining geo-tagged data enables us to obtain a rich representation of the user experience and all that surrounds the user's mobility. We propose a hierarchical probabilistic model that enables us to uncover specific descriptions of geographical regions by analyzing the geo-tagged content generated by online social media. By applying our method to a dataset of 8 million geo-tagged photos, we are able to associate with each neighborhood the tags that describe it specifically, and to find the most distinctive neighborhoods in a city.
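As an illustration of the trajectory entropy mentioned in the information-theoretic part, the following is a minimal sketch of the unconditional case only: the entropy of the random trajectory of a Markov chain from each start state until it first reaches a destination state. It is not the thesis's result on conditional trajectories. The function name `trajectory_entropy`, the toy transition matrix, and the choice of bits as the unit are assumptions made for this example. The sketch relies on the chain rule H(s) = H(P[s, :]) + sum over k != dest of P[s, k] * H(k), with H(dest) = 0, and assumes the destination is reachable from every state.

```python
import numpy as np

def trajectory_entropy(P, dest):
    """Entropy (in bits) of the trajectory from each state until the
    first visit to `dest`, for a row-stochastic transition matrix P.

    Hypothetical helper for illustration; assumes `dest` is reachable
    from every state so that the linear system below is non-singular.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]

    # Local entropy of each row: -sum_k P[s, k] * log2 P[s, k],
    # using the convention 0 * log 0 = 0.
    logs = np.zeros_like(P)
    mask = P > 0
    logs[mask] = np.log2(P[mask])
    local = -(P * logs).sum(axis=1)

    # Drop the destination row and column, then solve (I - Q) h = local.
    keep = [i for i in range(n) if i != dest]
    Q = P[np.ix_(keep, keep)]
    h = np.linalg.solve(np.eye(n - 1) - Q, local[keep])

    out = np.zeros(n)  # the entropy at the destination itself is 0
    out[keep] = h
    return out

# Toy mobility graph with three locations; a walker moves to one of the
# two other locations with equal probability.
P = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]
h = trajectory_entropy(P, dest=2)
```

In this toy chain, the trajectory from location 0 or 1 to location 2 is fully determined by its length, which is geometric with parameter 1/2, so the computed entropy is 2 bits from either start.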
