Modeling annotator behaviors for crowd labeling

Kara, Yunus Emre; Genc, Gaye; Aran, Oya; Akarun, Lale

doi:10.1016/j.neucom.2014.10.082

research article

2015

Neurocomputing

Machine learning applications can benefit greatly from vast amounts of data, provided that reliable labels are available. Mobilizing crowds to annotate the unlabeled data is a common solution. Although the labels provided by the crowd are subjective and noisy, the wisdom of crowds can be captured by a variety of techniques. Taking the mean or the median of a sample's annotations is a widely used approach for finding the consensus label of that sample. Improving consensus extraction from noisy labels is a very popular topic, with the main focus on binary label data. In this paper, we focus on crowd consensus estimation of continuous labels, which is also adaptable to ordinal or binary labels. Our approach is designed to work in situations where there is no gold standard; it depends only on the annotations and not on the feature vectors of the instances, and it does not require a training phase. To achieve a better consensus, we investigate different annotator behaviors and incorporate them into four novel Bayesian models. Moreover, we introduce a new metric to examine annotator quality, which can be used for finding good annotators to enhance consensus quality and reduce crowd labeling costs. The results show that the proposed models outperform the commonly used methods. With our annotator scoring mechanism, we are able to sustain consensus quality with far fewer annotations.
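The mean and median baselines named in the abstract can be illustrated with a minimal sketch. This is not the paper's Bayesian models, only the commonly used methods they are compared against. The function name `baseline_consensus`, the use of `None` for a missing annotation, and the ratings themselves are hypothetical and chosen purely for illustration.

```python
from statistics import mean, median


def baseline_consensus(annotations):
    """Return (mean, median) consensus per sample.

    annotations: list of rows, one row per sample, one entry per annotator.
    A missing annotation is marked with None (an assumption of this sketch).
    """
    consensus = []
    for labels in annotations:
        # Keep only the annotations that were actually provided.
        present = [x for x in labels if x is not None]
        consensus.append((mean(present), median(present)))
    return consensus


# Hypothetical continuous ratings: 2 samples, 3 annotators.
ratings = [
    [3.0, 3.5, 9.0],   # the third annotator disagrees strongly
    [1.0, None, 1.5],  # the second annotator skipped this sample
]

result = baseline_consensus(ratings)
for i, (m, med) in enumerate(result):
    print(f"sample {i}: mean={m:.3f}, median={med:.3f}")
```

In the first sample the single outlying rating pulls the mean well above the median, which is one reason a model that accounts for annotator behavior could be expected to do better than either baseline.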

Type

research article

DOI

10.1016/j.neucom.2014.10.082

Authors

Kara, Yunus Emre; Genc, Gaye; Aran, Oya; Akarun, Lale

Publication date

2015

Published in

Neurocomputing

Volume

160

Start page

141

End page

156

Peer reviewed

REVIEWED

EPFL units

LIDIAP

Available on Infoscience

December 19, 2014

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/109460