Synthetic Data - Anonymisation Groundhog Day

Stadler, Theresa; Oprisanu, Bristena; Troncoso, Carmela

conference paper

January 1, 2022

Proceedings of the 31st USENIX Security Symposium

31st USENIX Security Symposium

Synthetic data has been advertised as a silver-bullet solution to privacy-preserving data publishing that addresses the shortcomings of traditional anonymisation techniques. The promise is that synthetic data drawn from generative models preserves the statistical properties of the original dataset but, at the same time, provides perfect protection against privacy attacks. In this work, we present the first quantitative evaluation of the privacy gain of synthetic data publishing and compare it to that of previous anonymisation techniques.

Our evaluation of a wide range of state-of-the-art generative models demonstrates that synthetic data either does not prevent inference attacks or does not retain data utility. In other words, we empirically show that synthetic data does not provide a better tradeoff between privacy and utility than traditional anonymisation techniques. Furthermore, in contrast to traditional anonymisation, the privacy-utility tradeoff of synthetic data publishing is hard to predict. Because it is impossible to predict what signals a synthetic dataset will preserve and what information will be lost, synthetic data leads to a highly variable privacy gain and unpredictable utility loss. In summary, we find that synthetic data is far from the holy grail of privacy-preserving data publishing.

Type

conference paper

Web of Science ID

WOS:000855237502004

Author(s)

Stadler, Theresa

Oprisanu, Bristena

Troncoso, Carmela

Date Issued

2022-01-01

Publisher

USENIX Association

Publisher place

Berkeley

Published in

Proceedings of the 31st USENIX Security Symposium

ISBN of the book

978-1-939133-31-1

Start page

1451

End page

1468

Subjects

Computer Science, Information Systems; Computer Science, Theory & Methods; Computer Science

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

SPRING

Event name	Event place	Event date
31st USENIX Security Symposium	Boston, MA	Aug 10-12, 2022

Available on Infoscience

December 5, 2022

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/192969