Sequence Effects in Crowdsourced Annotations (Mathur et al., 2017)

Annotator sequence bias, where the label for one item affects the label for the next, occurs across a range of datasets. Avoid it by separately randomising the order of items for each annotator.

Getting high-quality annotations from crowdsourcing requires careful design. This paper looks at how one annotation a worker produces can influence their next one, for example:

  • When scoring translations, a good example may make the next one look worse in comparison
  • For labeling tasks, an annotator may judge a long run of the same label to be unlikely and switch labels to compensate (the gambler’s fallacy)

To investigate this, they fit a linear model whose inputs are the previous label, the gold label, and random noise, and inspect the coefficients. Across multiple tasks, the coefficient on the previous label is consistently non-zero. Interestingly, there also seems to be a learning effect for good workers: over time they become calibrated and show less sequence bias. Fortunately, there is a simple solution - give each annotator their items in a different random order! With that change, averaging over annotations should wash out the bias.
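As a rough illustration of this kind of probe (not the authors' code), the sketch below generates toy per-worker score sequences with an injected sequence effect, then regresses each score on the previous score, the gold quality, and a noise control; a clearly non-zero weight on the previous score is the signature of sequence bias. All variable names and the data-generating numbers are made up for illustration.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: per-worker score sequences with an injected sequence effect.
n_workers, n_items = 50, 40
gold = rng.normal(size=(n_workers, n_items))    # true item quality
noise = rng.normal(size=(n_workers, n_items))   # pure noise control
scores = np.empty((n_workers, n_items))
scores[:, 0] = gold[:, 0] + 0.3 * noise[:, 0]
for t in range(1, n_items):
    # each score depends on the gold quality and (negatively) on the previous score
    scores[:, t] = gold[:, t] - 0.2 * scores[:, t - 1] + 0.3 * noise[:, t]

# Regression: current score ~ previous score + gold + noise + intercept
y = scores[:, 1:].ravel()
X = np.column_stack([
    scores[:, :-1].ravel(),   # previous label
    gold[:, 1:].ravel(),      # gold label
    noise[:, 1:].ravel(),     # random noise (should get ~0 weight)
    np.ones(y.size),          # intercept
])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("prev-label coefficient:", coef[0])   # non-zero => sequence bias
print("gold coefficient:      ", coef[1])
print("noise coefficient:     ", coef[2])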
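And a minimal sketch of the fix itself, assuming every worker annotates the same pool of items: shuffle the item order independently per annotator, so any sequence effect points in a different direction for each worker and averages out. The worker and item identifiers here are hypothetical.

import random

items = [f"item-{i}" for i in range(100)]
workers = [f"worker-{w}" for w in range(10)]

assignments = {}
for worker in workers:
    order = items[:]          # copy the shared item pool
    random.shuffle(order)     # independent random order per annotator
    assignments[worker] = order

# Each worker sees the same items, but in a different order.
print(assignments["worker-0"][:5])
print(assignments["worker-1"][:5])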



@inproceedings{mathur-etal-2017-sequence,
  author    = {Mathur, Nitika and Baldwin, Timothy and Cohn, Trevor},
  title     = {Sequence Effects in Crowdsourced Annotations},
  booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing},
  month     = {September},
  year      = {2017},
  address   = {Copenhagen, Denmark},
  publisher = {Association for Computational Linguistics},
  pages     = {2860--2865},
  url       = {}
}