When we crowdsource data for tasks like SRL and sentiment analysis we only care about accuracy. For tasks where workers write new content, such as paraphrasing and creating questions, we also care about data diversity. If our data is not diverse then models trained on it will not be robust in the real world. The core idea of this paper is to encourage creativity by constraining workers.
My [previous post](https://www.jkk.name/post/2020-09-25_crowdqasrl/) discussed work on crowdsourcing QA-SRL, a way of capturing semantic roles in text by asking workers to answer questions. This post covers a paper I contributed to that also considers crowdsourcing SRL, but collects the more traditional form of annotation used in resources like Propbank.
Semantic Role Labeling captures the content of a sentence by labeling the word sense of the verbs and identifying their arguments. Over the last few years, [Luke Zettlemoyer's Group](https://www.cs.washington.edu/people/faculty/lsz/) has been exploring using question-answer pairs to represent this structure. This approach has the big advantage that it is easier to explain than the sense inventory and role types of more traditional SRL resources like PropBank. However, even with that advantage, crowdsourcing this annotation is difficult, as this paper shows.
A range of services exist for collecting annotations from paid workers. This post gives an overview of a bunch of them.
For a more flexible dialogue system, use the crowd to propose and vote on responses, then introduce agents and a model for voting, gradually learning to replace the crowd.
By dividing a task up among multiple annotators carefully we can achieve high-quality real-time annotation of data, in this case transcription of audio.
By using a generative model to explain worker annotations, we can more effectively predict the correct label, and which workers are spamming.