Privacy in Recommender Systems

Taziki, Mahsa

doi:10.5075/epfl-thesis-8797

doctoral thesis

Privacy in Recommender Systems

2018

The upsurge in the number of web users over the last two decades has resulted in a significant growth of online information. Recommenders are machine learning approach and are becoming one of the main ways to navigate the Internet. They recommend appropriate items to users based on their clicks, i.e., likes, ratings, purchases, etc. These clicks are key to providing relevant recommendations and, in this sense, have a significant utility. Since clicks reflect the preferences of users, they also raise privacy concerns. More generally, the rise of connected personal devices together with privacy concerns call for machine learning algorithms capable of leveraging the data of a large number of agents to learn personalized models under strong privacy requirements.

In this dissertation, we explore how to measure the above-mentioned utility and privacy effects and their inherent relation. We tackle the privacy and utility relation from two perspectives, i.e. from the users' point of view and from the service providers' point of view. First, the analytical measurements of the utility and privacy effects of a click lead to the awareness of users, hence the actionable insights for the users to superintend their behavior over the internet. Additionally, privacy-preserving machine learning algorithms enable the service providers to improve their services while preserving the privacy of their users.
In this regards, we further introduce two privacy-preserving algorithms: first, a recommendation algorithm, and the other one, a stochastic gradient descent (SGD) mechanism.

We first precisely measure the utility and privacy effects of a single click. We determine exactly when utility and privacy are antagonists and when they are not. An appealing application of our metrics is what we call a click-advisor, a visual user-aware clicking platform that helps users decide whether it is actually worth clicking on an item or not (after evaluating its potential utility and privacy effects using our techniques).
Using a game-theoretic approach, we evaluate several user clicking strategies.
We highlight in particular what we define as a \textit{smart} strategy, leading to a Nash equilibrium, where every user reaches the maximum possible privacy while preserving the average overall recommender utility for all users (with respect to the case where the clicks of users are based solely on their genuine preferences, i.e., without consulting the click-advisor).

\indent We then propose two privacy-preserving machine learning algorithms.
We present D2P, a novel protocol that provides a strong form of differential privacy (distance-based differential privacy) while ensuring a good recommendation quality. D2P avoids revealing exact user profiles by creating \emph{altered} user profiles where each item is replaced with another one at some \emph{distance}. % We evaluate D2P analytically and experimentally on MovieLens and Jester datasets. We show that when compared to a recommendation protocol that does not ensure any privacy, the recommendation quality of D2P drops only by $3$% for both datasets.
Lastly, we introduce an efficient SGD algorithm to address the privacy concerns in a fully decentralized (peer-to-peer) and asynchronous fashion, with provable convergence rate.

Name

EPFL_TH8797.pdf

Access type

restricted

Size

4.41 MB

Format

Adobe PDF

Checksum (MD5)

fe6a5bf8349b164ac28386e4aa7a5ab4