Information Extraction on the Web with Credibility Guarantee

Nguyen, Thanh Tam

2015

Abstract

The Web has become the central medium providing valuable sources for information extraction applications. However, such user-generated resources are often plagued by inaccuracies and misinformation due to the inherent openness and uncertainty of the Web. In this work, we study the problem of extracting structured information from Web data with a credibility guarantee. The goal is twofold: to extract as much structured information as possible, and to ensure that the extracted information is highly credible. To achieve this goal, we propose a learning process that optimizes the parameters of a probabilistic model capturing the relationships between users, their unstructured content, and the underlying structured information. Our evaluations on real-world datasets show that our approach outperforms the baseline by up to 6 times.

Details

Title Information Extraction on the Web with Credibility Guarantee

Author(s) Nguyen, Thanh Tam

Pagination 8

Date 2015

Keywords

trust management; credibility; information extraction; web data; quality guarantee

Laboratories LSIR

Record appears in: Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > LSIR - Distributed Information Systems Laboratory
Work produced at EPFL
Technical Reports

Record creation date 2016-03-25
