Infoscience
EPFL, École polytechnique fédérale de Lausanne
doctoral thesis

Data-Aware Privacy-Preserving Machine Learning

Triastcyn, Aleksei  
2020

In this thesis, we focus on the problem of achieving practical privacy guarantees in machine learning (ML), where classic differential privacy (DP) fails to maintain a good trade-off between user privacy and data utility. The differential privacy guarantee can be heavily influenced by extreme outliers or by samples that lie outside the data distribution. For example, when protecting a classification model for magnetic resonance imaging (MRI), differentially private mechanisms would add enough noise to hide any image in the space of the same dimensionality, including images that do not belong to the intended data distribution (cars, houses, animals, and so on). Such generality inevitably yields poor privacy guarantees.

Based on these observations and the ideas of DP, we propose a data-aware approach to privacy in machine learning. We design two novel privacy notions, Average-Case Differential Privacy (ADP) and Bayesian Differential Privacy (BDP), which take the data distribution into account and significantly improve the privacy-utility balance. First, we present average-case differential privacy, an empirical privacy notion designed for ex post privacy analysis of generative models and privacy-preserving data publishing. It relaxes the worst-case requirement of differential privacy to the average case and relies on empirical estimation to deal with distributions that are not explicitly defined. This notion can be regarded as a statistical sensitivity measure: it quantifies the expected change in the model outcomes given a change in the inputs generated by an observed distribution. Second, we develop a more rigorous privacy notion, Bayesian differential privacy, based on the same high-level principle of a probabilistic sensitivity measure.
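The contrast the abstract draws can be illustrated with a minimal sketch (not the thesis' actual mechanisms): the classic Gaussian mechanism calibrates noise to the worst-case global sensitivity over any possible input, while a data-aware alternative could instead estimate the expected change in a query when one record drawn from the observed distribution is swapped for another. The function names and the simple pairwise-swap estimator below are hypothetical, chosen only to make the idea concrete.

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng):
    """Classic (worst-case) Gaussian mechanism: noise is calibrated to the
    global L2 sensitivity, i.e. the largest change over *any* pair of
    neighbouring inputs, regardless of the actual data distribution."""
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))

def empirical_sensitivity(f, samples, rng, n_pairs=1000):
    """Illustrative data-aware estimate (hypothetical, not the estimator
    from the thesis): average change in f when one record is replaced by
    another record drawn from the same observed sample."""
    diffs = []
    for _ in range(n_pairs):
        i, j = rng.integers(len(samples), size=2)
        d = samples.copy()
        base = f(d)
        d[i] = samples[j]          # swap one record within the distribution
        diffs.append(abs(f(d) - base))
    return float(np.mean(diffs))
```

For a mean query over data concentrated in a narrow range, the empirical estimate comes out far below the worst-case sensitivity bound, so a mechanism calibrated to it would add much less noise; this is the intuition behind the MRI example above, where the worst case covers images (cars, houses) that never occur.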
As the main theoretical contributions of this thesis, we formulate and prove basic properties of Bayesian DP, such as composition, group privacy, and resistance to post-processing, and we develop a novel privacy accounting method for iterative algorithms based on the advanced composition theorem. Furthermore, we show connections between our accountant and the well-known moments accountant, as well as between Bayesian DP and other privacy definitions. Our practical contributions and evaluation branch into three main areas: (1) privacy-preserving data release using generative adversarial networks (GANs); (2) private classification using convolutional neural networks and other ML models; and (3) private federated learning (FL) for both discriminative and generative models. We demonstrate that both notions achieve considerably higher utility than differential privacy, and that Bayesian DP provides a superior trade-off between privacy guarantees and output model quality in all settings.
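The iterative algorithms that privacy accountants (the moments accountant, or the Bayesian accountant developed here) track typically perturb gradients at each step. A minimal sketch of that common step, in the style of DP-SGD, is below; the function name is hypothetical, and the thesis' contribution changes the privacy *analysis* of such steps, not the mechanism itself.

```python
import numpy as np

def noisy_clipped_mean(per_example_grads, clip_norm, noise_multiplier, rng):
    """One gradient-perturbation step in the style of DP-SGD (a sketch):
    clip each per-example gradient to L2 norm <= clip_norm, average the
    clipped gradients, and add Gaussian noise scaled to the clipping bound.
    An accountant then tracks the cumulative privacy loss over many steps."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    clipped = np.stack(clipped)
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=clipped.shape[1:])
    return clipped.mean(axis=0) + noise / len(per_example_grads)
```

Clipping bounds each example's influence on the update (the sensitivity), which is what makes composition-based accounting over training iterations possible.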

Details
Type
doctoral thesis
DOI
10.5075/epfl-thesis-7216
Author(s)
Triastcyn, Aleksei  
Advisors
Faltings, Boi  
Jury
Prof. Emre Telatar (chair); Prof. Boi Faltings (thesis director); Prof. Martin Jaggi, Prof. Han Yu, Prof. Christos Dimitrakakis (examiners)
Date Issued
2020
Publisher
EPFL
Publisher place
Lausanne
Public defense date
2020-10-21
Thesis number
7216
Number of pages
161
Subjects
privacy-preserving machine learning • privacy-preserving data release • differential privacy • deep learning • federated learning • generative adversarial networks
EPFL units
LIA
Faculty
IC
School
IINFCOM
Doctoral School
EDIC
Available on Infoscience
October 9, 2020
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/172377
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.