conference paper

Aggregation of a Term Vocabulary for Peer-to-Peer Information Retrieval: a DHT Stress Test

Klemm, Fabius; Aberer, Karl
2005
LNCS
Third International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2005)

Research interest in full-text retrieval based on peer-to-peer (P2P) technology has been growing. So far, these efforts have largely concentrated on efficiently distributing an index. However, ranking the results retrieved from the index is a crucial part of information retrieval. To determine the relevance of a document to a query, ranking algorithms use collection-wide statistics. Term frequency - inverse document frequency (TFIDF), for example, relies on the number of documents in the whole collection that contain a given term. Such global frequencies are not readily available in a distributed system. In this paper, we study the feasibility of aggregating global frequencies for a large term vocabulary in a P2P setting. We use a distributed hash table (DHT) for our analysis. Traditional applications of DHTs, such as file sharing, index on the order of tens of thousands of keys. Aggregating a vocabulary of millions of terms places extreme demands on a DHT implementation. We study different aggregation strategies and propose optimizations to DHTs for processing large numbers of keys efficiently.
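To make the aggregation problem concrete, the following is a minimal sketch of one naive strategy, not the method studied in the paper: each peer counts document frequencies locally and sends one (term, count) pair per term to the peer responsible for that term's hash. The functions `responsible_peer`, `aggregate_document_frequencies`, and `idf` are hypothetical names chosen for illustration, and a modulo over a SHA-1 digest stands in for real DHT routing.

```python
import hashlib
import math
from collections import Counter, defaultdict


def responsible_peer(term: str, num_peers: int) -> int:
    """Map a term to a peer id by hashing (an assumed stand-in for a DHT key lookup)."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_peers


def aggregate_document_frequencies(local_collections, num_peers):
    """Sum per-peer document frequencies at the peer responsible for each term.

    local_collections holds one list of documents (plain strings) per peer.
    Returns a mapping: peer id -> Counter of global document frequencies.
    """
    stores = defaultdict(Counter)
    for docs in local_collections:
        # Local document frequency: number of local documents containing each term.
        local_df = Counter()
        for doc in docs:
            local_df.update(set(doc.lower().split()))
        # One simulated DHT insert per distinct local term.
        for term, df in local_df.items():
            stores[responsible_peer(term, num_peers)][term] += df
    return stores


def idf(term: str, stores, num_peers: int, total_docs: int) -> float:
    """Inverse document frequency, read from the peer responsible for the term."""
    df = stores[responsible_peer(term, num_peers)][term]
    return math.log(total_docs / df) if df else 0.0


if __name__ == "__main__":
    peers = [
        ["p2p index ranking", "dht index"],
        ["dht stress test", "ranking with tfidf"],
    ]
    stores = aggregate_document_frequencies(peers, num_peers=4)
    print(idf("dht", stores, num_peers=4, total_docs=4))
```

In this sketch every distinct term on every peer costs one insert, which illustrates why a vocabulary of millions of terms stresses a DHT far more than typical file-sharing workloads.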

Type: conference paper
Author(s): Klemm, Fabius; Aberer, Karl
Date Issued: 2005
Published in: LNCS
Subjects: P2P, DHT, IR
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LSIR
Event name: Third International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2005)
Event place: Trondheim, Norway
Event date: August, 2005
Available on Infoscience: October 19, 2005
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/217980