Distributed Similarity Search in High Dimensions Using Locality Sensitive Hashing

Haghani, Parisa; Michel, Sebastian; Aberer, Karl

conference paper

Haghani, Parisa

•

Michel, Sebastian

•

Aberer, Karl

2009

Proceedings of the 12th International Conference on Extending Database Technology (EDBT'09)

12th International Conference on Extending Database Technology (EDBT)

In this paper we consider distributed K-Nearest Neighbor (KNN) search and range query processing in high dimensional data. Our approach is based on Locality Sensitive Hashing (LSH) which has proven very efficient in answering KNN queries in centralized settings. We consider mappings from the multi-dimensional LSH bucket space to the linearly ordered set of peers that jointly maintain the indexed data and derive requirements to achieve high quality search results and limit the number of network accesses. We put forward two such mappings that come with these salient properties: being locality preserving so that buckets likely to hold similar data are stored on the same or neighboring peers and having a predictable output distribution to ensure fair load balancing. We show how to leverage the linearly aligned data for efficient KNN search and how to efficiently process range queries which is, to the best of our knowledge, not possible in existing LSH schemes. We show by comprehensive performance evaluations using real world data that our approach brings major performance and accuracy gains compared to state-of-the-art.

Type

conference paper

Author(s)

Haghani, Parisa

Michel, Sebastian

Aberer, Karl

Date Issued

2009

Published in

Proceedings of the 12th International Conference on Extending Database Technology (EDBT'09)

Subjects

K-nearest neighbor search

•

similarity search

•

P2P

•

LSH

•

NCCR-MICS

•

NCCR-MICS/CL4

URL

http://www.math.spbu.ru/edbticdt/

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

LSIR

Event name	Event place	Event date
12th International Conference on Extending Database Technology (EDBT)	Saint-Petersburg, Russia	March 23-26 2009

Available on Infoscience

November 17, 2008

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/31220