Smaller, Faster & Lighter KNN Graph Constructions

We propose GoldFinger, a new compact and fast-to-compute binary representation of datasets to approximate Jaccard’s index. We illustrate the effectiveness of GoldFinger on the emblematic big data problem of K-Nearest-Neighbor (KNN) graph construction and show that GoldFinger can drastically accelerate a large range of existing KNN algorithms with little to no overhead. As a side effect, we also show that the compact representation of the data protects users’ privacy for free by providing k-anonymity and l-diversity. Our extensive evaluation of the resulting approach on several realistic datasets shows that our approach delivers speedups of up to 78.9% compared to the use of raw data while only incurring a negligible to moderate loss in terms of KNN quality. To convey the practical value of such a scheme, we apply it to item recommendation and show that the loss in recommendation quality is negligible.

Published in:
Proceedings of The Web Conference 2020, 1060-1070
Presented at:
The Web Conference 2020, April 2020
This is an open-access article distributed under the terms of the Creative Commons Attribution (CC-BY) Licence.
Other identifiers:

Note: The status of this file is: Anyone

 Record created 2020-06-08, last modified 2020-06-08

Download fulltext

Rate this document:

Rate this document:
(Not yet reviewed)