Probabilistic Queries in Large-Scale Networks

Resource location is a fundamental problem for large-scale distributed applications. This paper discusses the problem from a probabilistic perspective. Contrary to deterministic approaches, which strive to produce a precise outcome, probabilistic approaches may sometimes expose users with incorrect results. The paper formalizes the probabilistic resource-location problem with the notion of probabilistic queries. A probabilistic query has a predicate as parameter and returns a set of sites where the predicate is believed to hold. The query is probabilistic because there are some chances that the predicate does not hold in all, or even in any, of the sites returned. To implement probabilistic queries, we introduce psearch, an epidemic-like algorithm that uses basic concepts of Bayesian statistical inference. Among its properties, psearch is able to adapt itself to new system conditions caused, for example, by failures.

Related material