conference paper

Scalable Peer-to-Peer Web Retrieval with Highly Discriminative Keys

Podnar, Ivana
•
Rajman, Martin
•
Luu, Vinh Toan
•
Klemm, Fabius
•
Aberer, Karl
2007
Proceedings of the 23rd International Conference on Data Engineering
IEEE 23rd International Conference on Data Engineering (ICDE 2007)

The suitability of peer-to-peer (P2P) approaches for full-text Web retrieval has recently been questioned because of the claimed unacceptable bandwidth consumption induced by retrieval from very large document collections. In this contribution we formalize a novel indexing/retrieval model that achieves high-performance, cost-efficient retrieval by indexing with highly discriminative keys (HDKs) stored in a distributed global index maintained in a structured P2P network. HDKs correspond to carefully selected terms and term sets that appear in only a small number of collection documents. We provide a theoretical analysis of the scalability of our retrieval model and report experimental results obtained with our HDK-based P2P retrieval engine. These results show that, despite increased indexing costs, the total traffic generated with the HDK approach is significantly smaller than that generated by distributed single-term indexing strategies. Furthermore, our experiments show that the retrieval performance obtained with a random set of real queries is comparable to that of a centralized, single-term solution using the best state-of-the-art BM25 relevance computation scheme. Finally, our scalability analysis demonstrates that the HDK approach can scale to large networks of peers indexing Web-size document collections, thus opening the way towards viable, truly decentralized Web retrieval.
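
To make the idea of keys "appearing in a small number of collection documents" concrete, the following is a minimal, centralized sketch and not the authors' algorithm. It assumes that a key (a term or a term set) counts as discriminative when its document frequency is at most a hypothetical threshold `df_max`, and that larger keys are built only from smaller keys that were too frequent. The function name, the parameters `df_max` and `max_key_size`, and the toy collection are all illustrative assumptions; the abstract does not specify the selection rules or the distributed protocol.

```python
from collections import defaultdict
from itertools import combinations


def select_discriminative_keys(docs, df_max, max_key_size):
    """Return {key: set of doc ids} for keys found in at most df_max documents.

    docs maps a document id to its list of terms. A key is a frozenset of
    terms. The threshold rule is an assumption made for illustration only.
    """
    index = {}
    frequent = set()  # keys of the previous size that were too frequent
    for size in range(1, max_key_size + 1):
        postings = defaultdict(set)
        for doc_id, terms in docs.items():
            vocab = sorted(set(terms))
            for combo in combinations(vocab, size):
                # Expand a key only if every smaller sub-key was too frequent,
                # so an already discriminative key is never extended.
                if size > 1 and any(
                    frozenset(sub) not in frequent
                    for sub in combinations(combo, size - 1)
                ):
                    continue
                postings[frozenset(combo)].add(doc_id)
        frequent = set()
        for key, doc_ids in postings.items():
            if len(doc_ids) <= df_max:
                index[key] = doc_ids  # rare enough: keep as an index key
            else:
                frequent.add(key)  # too common: candidate for expansion
    return index


# Toy collection (hypothetical data, not from the paper).
docs = {
    1: ["p2p", "web", "retrieval"],
    2: ["p2p", "index", "network"],
    3: ["web", "index", "network"],
}
index = select_discriminative_keys(docs, df_max=1, max_key_size=2)
```

In this toy run, a frequent single term such as `p2p` is not kept as a key on its own, while the pair `{p2p, web}` is kept because it occurs in only one document. In the setting the abstract describes, such keys would be stored in a global index distributed over a structured P2P network rather than in a local dictionary.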

Type
conference paper
DOI
10.1109/ICDE.2007.368968
Author(s)
Podnar, Ivana
Rajman, Martin  
Luu, Vinh Toan
Klemm, Fabius  
Aberer, Karl  
Date Issued

2007

Publisher

IEEE

Published in
Proceedings of the 23rd International Conference on Data Engineering
Start page

1096

End page

1105

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LSIR  
LIA  
Event name
IEEE 23rd International Conference on Data Engineering (ICDE 2007)

Event place
Istanbul, Turkey

Event date
April 15-20, 2007

Available on Infoscience
November 18, 2011
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/72666