Abstract

We derive a framework for sampling online communities based on the mean hitting time of its members, considering that there are multiple graphs associated with the same vertex set V representing the social network. First, we formulate random walk models on the multi-graph ensemble and define the essential properties of the mean hitting times associated with the corresponding Markov chains on the vertex set V. Then, we design a branch and bound optimization technique for computing the subset of vertices A that exhibits the shortest mean hitting time across the multi-graph, given a constraint on the size of A. We also design a greedy optimization method that computes an approximation to the optimal subset, at lower complexity, and that lends itself to a decentralized implementation, for further complexity reduction. We examine the performance of the sampling framework through a series of simulation experiments involving synthetic and actual samples of online community graphs. We demonstrate substantial improvements in terms of sampling (network) cost reduction and information dissemination speed relative to the state-of-the-art methods of node degree and eigenvector centrality.

Details

Actions