Multi-Label Answer Aggregation for Crowdsourcing
Crowdsourcing has been widely established as a means to enable human computation at large scale, in particular for tasks that require manual labelling of large sets of data items. Answers collected from heterogeneous crowd workers are aggregated to obtain a robust result. However, existing methods for answer aggregation assume that answers are given as a single label per item. Hence, these methods are ineffective for common multi-labelling problems such as image tagging and document annotation, where items are assigned sets of labels. In this paper, we propose a novel Bayesian nonparametric model for multi-label answer aggregation. It enables us to predict labels for non-grounded items, i.e., items for which no ground truth is available, while taking into account dependencies between the labels in different answer sets. We also show how this model is instantiated for incremental learning, incorporating new answers from crowd workers as they arrive. An evaluation of our method on a number of large-scale, real-world crowdsourcing datasets reveals that it consistently outperforms the state of the art in answer aggregation in terms of precision, recall, and robustness against faulty workers and data sparsity.
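To make the problem setting concrete, the following sketch shows a naive per-label majority-vote baseline for multi-label answer aggregation; it is not the Bayesian nonparametric model proposed in the paper (which additionally models label dependencies and worker reliability), and the function name and threshold parameter are illustrative assumptions.

```python
from collections import Counter

def aggregate_multilabel(answers, threshold=0.5):
    """Naive baseline (not the paper's model): keep a label if more than
    `threshold` of the workers included it in their answer set."""
    n_workers = len(answers)
    # Count, for each label, how many workers assigned it to the item.
    counts = Counter(label for answer_set in answers for label in answer_set)
    return {label for label, c in counts.items() if c / n_workers > threshold}

# Three workers tag the same image with (possibly different) label sets:
worker_answers = [
    {"cat", "indoor"},
    {"cat", "indoor", "blurry"},
    {"cat"},
]
print(sorted(aggregate_multilabel(worker_answers)))  # → ['cat', 'indoor']
```

Such a baseline treats every label independently and every worker as equally reliable, which is exactly the limitation the abstract points to: it ignores dependencies between labels within an answer set and degrades under faulty workers and sparse answers.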