On Leveraging Crowdsourcing Techniques for Schema Matching Networks

Nguyen, Quoc Viet Hung; Nguyen Thanh, Tam; Miklós, Zoltán; Aberer, Karl

doi:10.1007/978-3-642-37450-0_10

2013

Abstract

As the volume of shared datasets is likely to soar, there is high demand for interlinking these public datasets. To achieve this, their schemas must be matched. Moreover, to extract useful pieces of information, one might need to link not only two but several schemas. This leads to a set of attribute correspondences between multiple schemas, which together form a schema matching network. Various commercial and academic schema matching tools have been developed to support this task. However, because matching is inherently uncertain, the heuristic techniques these tools employ produce results that are not completely correct. In practice, post-matching effort by human experts is needed to obtain a correct set of attribute correspondences. To address this problem, our paper demonstrates how to leverage crowdsourcing techniques to validate the generated correspondences. We design validation questions with contextual information that can effectively guide the crowd workers. We analyze how to reduce the overall human effort needed for this validation task. Through theoretical and empirical results, we show that by harnessing natural constraints defined on top of the schema matching network, the necessary human work is reduced significantly.
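
To illustrate how a constraint on the network could reduce validation effort, here is a minimal sketch. The abstract does not name the constraints used, so the one-to-one constraint below (an attribute matches at most one attribute in any other schema) is an assumption, as are all names in the code (`Attr`, `conflicts`, `validate`, `ask`) and the toy schemas. It is not the paper's algorithm, only one way the idea might look.

```python
from typing import Callable, FrozenSet, Iterable, NamedTuple, Set, Tuple


class Attr(NamedTuple):
    """A schema attribute (hypothetical representation)."""
    schema: str
    name: str


# A correspondence is an unordered pair of attributes from two schemas.
Corr = FrozenSet[Attr]


def conflicts(c1: Corr, c2: Corr) -> bool:
    """Assumed one-to-one constraint: two correspondences conflict when they
    share one attribute and their other endpoints lie in the same schema."""
    shared = c1 & c2
    if c1 == c2 or len(shared) != 1:
        return False
    (other1,) = c1 - shared
    (other2,) = c2 - shared
    return other1.schema == other2.schema


def validate(candidates: Iterable[Corr],
             ask: Callable[[Corr], bool]) -> Tuple[Set[Corr], int]:
    """Ask the crowd about candidates in order, skipping any candidate that
    an earlier approval has already ruled out via the constraint."""
    pending = list(candidates)
    approved: Set[Corr] = set()
    rejected: Set[Corr] = set()
    questions = 0
    for corr in pending:
        if corr in rejected:
            continue  # ruled out by the constraint, no question needed
        questions += 1
        if ask(corr):
            approved.add(corr)
            # Reject every candidate that conflicts with the approved one.
            rejected.update(c for c in pending if conflicts(corr, c))
        else:
            rejected.add(corr)
    return approved, questions


# Toy example with two hypothetical schemas, A and B.
c1 = frozenset({Attr("A", "name"), Attr("B", "fullName")})
c2 = frozenset({Attr("A", "name"), Attr("B", "firstName")})
c3 = frozenset({Attr("A", "date"), Attr("B", "releaseDate")})
truth = {c1, c3}  # stands in for the crowd's answers

approved, questions = validate([c1, c2, c3], ask=lambda c: c in truth)
print(f"{questions} questions asked for 3 candidates")
```

In this toy run, approving the first correspondence rules out the second, so one question is saved. The saving depends on the order in which candidates are asked, which this sketch does not attempt to optimize.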