conference paper

Towards better entity resolution techniques for Web document collections

Yerva, Surender Reddy • Miklós, Zoltán • Aberer, Karl
2010
1st International Workshop on Data Engineering meets the Semantic Web (DESWeb'2010) (co-located with ICDE'2010)

Because person names are not unique, the same name on different Web pages may or may not refer to the same real-world person. This entity identification problem is one of the most challenging issues in realizing the Semantic Web or entity-oriented search. We address this disambiguation problem, which closely resembles the entity resolution problem studied in relational databases but differs from it in several respects. Most importantly, Web pages often contain only partial information about the persons they mention, and the available information is very heterogeneous. As a result, similarity functions yield only uncertain evidence about whether two names refer to the same person. Each similarity function captures only some aspects of the similarity between the Web pages where the names occur, so its performance varies considerably from name to name. We analyze several data engineering techniques for coping with the limited accuracy of the similarity functions and for combining multiple functions. Even with these simple techniques, we demonstrate systematic performance improvements and obtain results comparable to state-of-the-art methods.
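To make the idea of combining several imperfect similarity functions concrete, here is a minimal sketch. It is not the authors' method: the page representation (`text` and `entities` fields), the two similarity functions, the weighted average, the threshold, and the union-find clustering are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def token_similarity(p, q):
    """Hypothetical similarity function: overlap of lowercased page tokens."""
    return jaccard(set(p["text"].lower().split()), set(q["text"].lower().split()))


def entity_similarity(p, q):
    """Hypothetical similarity function: overlap of named entities on the pages."""
    return jaccard(set(p["entities"]), set(q["entities"]))


def combined_similarity(p, q, weighted_functions):
    """Weighted average of several similarity functions (an assumed combination rule)."""
    total = sum(weight for _, weight in weighted_functions)
    return sum(weight * f(p, q) for f, weight in weighted_functions) / total


def cluster_pages(pages, weighted_functions, threshold):
    """Group page indices whose combined similarity reaches the threshold.

    Matches are closed transitively with a small union-find structure.
    """
    parent = list(range(len(pages)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(pages)), 2):
        if combined_similarity(pages[i], pages[j], weighted_functions) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(pages)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Illustrative data only: three pages mentioning the same (made-up) name.
pages = [
    {"text": "database systems professor epfl", "entities": ["EPFL", "Lausanne"]},
    {"text": "professor of database systems", "entities": ["EPFL"]},
    {"text": "jazz guitarist tour dates", "entities": ["Montreux"]},
]
functions = [(token_similarity, 1.0), (entity_similarity, 1.0)]
clusters = cluster_pages(pages, functions, threshold=0.5)
```

With this toy data, the first two pages fall into one cluster and the third stays alone. In practice the weights and the threshold would need tuning per data set, since the abstract notes that each function performs very differently for different names.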

Type: conference paper
DOI: 10.1109/ICDEW.2010.5452698
Author(s): Yerva, Surender Reddy; Miklós, Zoltán; Aberer, Karl
Date Issued: 2010
Publisher: IEEE
Published in: 1st International Workshop on Data Engineering meets the Semantic Web (DESWeb'2010) (co-located with ICDE'2010)
Start page: 209
End page: 214
Subjects: Entity Resolution • Semantic Web • okkam • NCCR-MICS/ESDM • NCCR-MICS
Editorial or Peer reviewed: REVIEWED
Written at: EPFL
EPFL units: LSIR
Event name: 1st International Workshop on Data Engineering meets the Semantic Web (DESWeb'2010) (co-located with ICDE'2010)
Event place: Long Beach, California
Event date: 2010
Available on Infoscience: June 17, 2010
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/50902