Personalizing Top-k Processing Online in a Peer-to-Peer Social Tagging Network
The rapidly increasing amount of user-generated content in social tagging systems provides a huge source of information. Yet performing effective search in these systems is very challenging, especially when we seek the most appropriate items that match a potentially ambiguous query. Collaborative filtering-based personalization is appealing in this context, as it confines the search to a small network of participants with similar preferences. Offline personalization, which maintains, for every user, a network of similar participants based on their tagging behaviors, is effective for queries that are close to the querying user's tagging profile, but it performs poorly when queries reflect emerging interests and have little correlation with that profile. We present (PTK2)-T-2, the first protocol to personalize query processing in social tagging systems online. (PTK2)-T-2 is completely decentralized. This design choice stems from the observation that evolving social tagging systems naturally resemble P2P systems, where users are both producers and consumers. It exploits the power of the crowd and prevents any central authority from controlling personal information. (PTK2)-T-2 is gossip-based and probabilistic. It dynamically associates each user with social acquaintances sharing similar tagging behaviors. Appropriate users for answering a query are discovered at query time with the help of these social acquaintances. Discovery is driven by the hybrid interest of the querying user, which takes into account both her tagging behavior and her query. Results are iteratively refined and returned to the querying user. We evaluate (PTK2)-T-2 on CiteULike and Delicious traces involving up to 50,000 users. We highlight the advantages of online personalization over offline personalization, as well as the protocol's efficiency, scalability, and inherent ability to cope with user departure and interest evolution in P2P systems.
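To make the notion of hybrid interest more concrete, the following is a minimal sketch of how a querying user might rank candidate peers by combining her tagging profile with her query. It is an illustration only, not the protocol's actual scoring: the function names, the mixing weight `alpha`, the use of cosine similarity, and the toy profiles are all assumptions introduced here.

```python
from collections import Counter
from math import sqrt


def normalize(vec):
    """Scale a tag -> weight mapping to unit Euclidean length."""
    norm = sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def hybrid_interest(profile, query_tags, alpha=0.5):
    """Blend the tagging profile with the query (alpha is a hypothetical weight).

    alpha = 1.0 uses only the profile (offline-style personalization);
    alpha = 0.0 uses only the query.
    """
    p = normalize(profile)
    q = normalize(Counter(query_tags))
    tags = set(p) | set(q)
    return {t: alpha * p.get(t, 0.0) + (1 - alpha) * q.get(t, 0.0) for t in tags}


def cosine(a, b):
    """Cosine similarity between two tag -> weight mappings."""
    a, b = normalize(a), normalize(b)
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def rank_peers(profile, query_tags, candidates, k=2, alpha=0.5):
    """Return the names of the k candidates closest to the hybrid interest."""
    interest = hybrid_interest(profile, query_tags, alpha)
    scored = sorted(
        candidates.items(),
        key=lambda item: cosine(interest, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]


# Toy data (invented for illustration): tag -> number of times the user used it.
user_profile = {"python": 5, "database": 3}
peers = {
    "alice": {"python": 4, "database": 2},
    "bob": {"jazz": 6, "music": 1},
    "carol": {"cooking": 3},
}
```

In this toy setting, a query on `jazz` is unrelated to the user's profile: ranking by profile alone (`alpha=1.0`) favors `alice`, ranking by query alone (`alpha=0.0`) favors `bob`, and an intermediate weight keeps both in the candidate set. In a gossip-based setting, such a ranking would presumably be recomputed as new candidate profiles arrive from acquaintances, which this sketch does not model.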