Finding related pages in the World Wide Web

Dean, Jeff; Henzinger, Monika R.

doi:10.1016/S1389-1286(99)00022-5

1999

Abstract

When using traditional search engines, users have to formulate queries to describe their information need. This paper discusses a different approach to Web searching where the input to the search process is not a set of query terms, but instead is the URL of a page, and the output is a set of related Web pages. A related Web page is one that addresses the same topic as the original page. For example, www.washingtonpost.com is a page related to www.nytimes.com, since both are online newspapers. We describe two algorithms to identify related Web pages. These algorithms use only the connectivity information in the Web (i.e., the links between pages) and not the content of pages or usage information. We have implemented both algorithms and measured their runtime performance. To evaluate the effectiveness of our algorithms, we performed a user study comparing our algorithms with Netscape's "What's Related" service (http://home.netscape.com/escapes/related/). Our study showed that the precision at 10 of our two algorithms is 73% and 51% better, respectively, than that of Netscape, despite the fact that Netscape uses both content and usage pattern information in addition to connectivity information.
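The abstract reports results as "precision at 10" and as a relative improvement over a baseline. The sketch below shows one common way these two quantities are computed. It is an illustration, not the authors' evaluation code: the function names, the example URLs, and the relevance judgments are all hypothetical, and dividing by k even when fewer than k results are returned is an assumed convention.

```python
def precision_at_k(ranked_urls, relevant, k=10):
    """Fraction of the top-k returned pages that were judged relevant.

    Assumption: the denominator is always k, so a result list shorter
    than k is penalised for the missing answers.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_urls[:k]
    hits = sum(1 for url in top_k if url in relevant)
    return hits / k


def relative_improvement(ours, baseline):
    """Relative gain of `ours` over `baseline`, e.g. 0.5 means 50% better."""
    if baseline == 0:
        raise ValueError("baseline precision must be non-zero")
    return (ours - baseline) / baseline


# Hypothetical data: ten pages returned for one input URL, and the subset
# that a (made-up) human judge marked as related to it.
returned = [f"site{i}.example" for i in range(10)]
judged_related = {"site0.example", "site2.example", "site4.example"}

p10 = precision_at_k(returned, judged_related, k=10)
gain = relative_improvement(0.75, 0.5)
```

Read this way, a statement such as "73% better" describes a ratio between two precision values, not a difference of 73 percentage points.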

Details

Title Finding related pages in the World Wide Web

Author(s) Dean, Jeff ; Henzinger, Monika R.

Published in Comput. Networks

Volume 31

Issue 11

Pages 1467-1479

Date 1999

Keywords

Computer systems programming; Query languages; Response time (computer systems); Search engines; Web pages; World Wide Web

DOI https://doi.org/10.1016/S1389-1286(99)00022-5

Laboratories LTAA

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IC Archives > LTAA - Laboratory of Theory and Applications of Algorithms
Peer-reviewed publications
Work outside EPFL
Journal Articles
Published

Record creation date 2007-01-18
