Fingerprinting Big Data: The Case of KNN Graph Construction

We propose fingerprinting, a new technique that constructs compact, fast-to-compute and privacy-preserving binary representations of datasets. We illustrate the effectiveness of this technique on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction, and show that fingerprinting can drastically accelerate a wide range of existing KNN algorithms while efficiently obfuscating the original data, with little to no overhead. Our extensive evaluation of the resulting approach (dubbed GoldFinger) on several realistic datasets shows that it delivers speedups of up to 78.9% compared to the use of raw data, while incurring only a negligible to moderate loss in KNN quality.
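As an illustration of the general idea, the following is a minimal sketch of how a binary fingerprint could stand in for raw data when building a KNN graph. The abstract does not specify the construction, so everything here is an assumption made for the example: each profile is treated as a set of items, each item is hashed to a single bit position, and similarity is estimated as a Jaccard ratio over the bit arrays. The names `fingerprint`, `estimated_jaccard` and `knn_graph`, the hash function, and the `n_bits` parameter are hypothetical, and the brute-force neighbor search is only a placeholder for whichever KNN algorithm is being accelerated.

```python
import hashlib


def fingerprint(items, n_bits=1024):
    """Build a binary fingerprint (assumed scheme): hash each item to one
    bit position and set that bit in an integer used as a bit array."""
    fp = 0
    for item in items:
        digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
        fp |= 1 << (int.from_bytes(digest, "big") % n_bits)
    return fp


def estimated_jaccard(fp_a, fp_b):
    """Estimate Jaccard similarity from two fingerprints using bitwise
    operations only: popcount(AND) / popcount(OR)."""
    union = bin(fp_a | fp_b).count("1")
    if union == 0:
        return 0.0
    return bin(fp_a & fp_b).count("1") / union


def knn_graph(profiles, k, n_bits=1024):
    """Brute-force KNN graph over fingerprints (illustrative only).

    `profiles` maps a node id to an iterable of items. The raw items are
    only read once, to compute the fingerprints."""
    fps = {node: fingerprint(items, n_bits) for node, items in profiles.items()}
    graph = {}
    for u, fp_u in fps.items():
        scored = [(estimated_jaccard(fp_u, fp_v), v) for v, fp_v in fps.items() if v != u]
        # Sort on the score alone so node ids never need to be comparable.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        graph[u] = [v for _, v in scored[:k]]
    return graph


if __name__ == "__main__":
    profiles = {
        "alice": {"item1", "item2", "item3"},
        "bob": {"item1", "item2", "item4"},
        "carol": {"item7", "item8", "item9"},
    }
    print(knn_graph(profiles, k=1))
```

In this sketch the similarity computation touches only fixed-size bit arrays, which is where a speedup over raw profiles would come from, and hash collisions mean the original items cannot be read back directly from a fingerprint. Both effects depend on `n_bits`, which trades accuracy of the estimate against compactness.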

Published in:
Proceedings of the 35th International Conference on Data Engineering, 1738-1741
Presented at:
2019 IEEE 35th International Conference on Data Engineering (ICDE), Macao, April 8-11, 2019

 Record created 2019-07-02, last modified 2020-04-20
