data

Controlled Crowdsourcing for High-Quality QA-SRL Annotation (Roit, et al., ACL 2020)

Semantic Role Labeling captures the content of a sentence by labeling the sense of each verb and identifying its arguments. Over the last few years, [Luke Zettlemoyer's Group](https://www.cs.washington.edu/people/faculty/lsz/) has explored using question-answer pairs to represent this structure. This approach has the big advantage that it is easier to explain to annotators than the sense inventory and role types of more traditional SRL resources like PropBank. However, even with that advantage, crowdsourcing this annotation is difficult, as this paper shows.

ChartDialogs: Plotting from Natural Language Instructions (Shao and Nakashole, ACL 2020)

Natural language interfaces to computer systems are an exciting area with new workshops ([WNLI](https://www.aclweb.org/anthology/volumes/2020.nli-1/) at ACL and [IntEx-SemPar](https://intex-sempar.github.io/) at EMNLP), a range of datasets (including my own work on [text-to-SQL](/publication/acl18sql/)), and many papers. Most work focuses on either (1) commands for simple APIs, (2) generating a database query, or (3) generating general purpose code. This paper considers an interesting application: interaction with data visualisation tools.

Beyond Accuracy: Behavioral Testing of NLP Models with CheckList (Ribeiro, et al., ACL 2020 Best Paper)

It is difficult to predict how well a model will work in the real world. Carefully curated test sets provide some signal, but only if they are large, representative, and have not been overfit to. This paper builds on two ideas for this problem: constructing challenge datasets and breaking performance down into subcategories. Together, these become a process of designing specific tests that measure how well a model handles certain types of variation in data.
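To make the idea of a targeted test concrete, here is a minimal sketch of one kind of check: an invariance test, where a small change to the input should not change the prediction. This is not the CheckList library's API. The `toy_sentiment` model, the `invariance_failures` helper, and the example sentences are all hypothetical, made up purely for illustration.

```python
from typing import Callable, List, Tuple


def toy_sentiment(text: str) -> str:
    """Hypothetical stand-in model: a keyword lookup, not a real classifier."""
    negative_words = {"bad", "terrible", "awful"}
    tokens = text.lower().split()
    return "negative" if any(t in negative_words for t in tokens) else "positive"


def invariance_failures(
    model: Callable[[str], str], pairs: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return the (original, perturbed) pairs where the prediction changed."""
    return [(a, b) for a, b in pairs if model(a) != model(b)]


# Each pair differs in a way that should not affect sentiment
# (a swapped city name, added punctuation).
pairs = [
    ("The flight to Chicago was terrible", "The flight to Dallas was terrible"),
    ("The food was bad", "The food was bad!"),
]

failures = invariance_failures(toy_sentiment, pairs)
failure_rate = len(failures) / len(pairs)
print(f"{len(failures)} of {len(pairs)} invariance tests failed")
for original, perturbed in failures:
    print(f"  {original!r} vs {perturbed!r}")
```

The toy model passes the city-swap test but fails the punctuation test, because it matches whole whitespace-separated tokens. Reporting a failure rate per test type, rather than one overall accuracy number, is what lets this kind of breakdown point at a specific weakness.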

A Large-Scale Corpus for Conversation Disentanglement (Kummerfeld et al., 2019)

This post is about my own paper to appear at ACL later this month. What is interesting about this paper will depend on your research interests, so that’s how I’ve broken down this blog post. A few key points first: Data and code are available on GitHub. The paper is also available.

PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution (Chen et al., 2018)

The OntoNotes dataset, which is the focus of almost all coreference resolution research, had several compromises in its development (as is the case for any dataset). Some of these are discussed in...

Frames: a corpus for adding memory to goal-oriented dialogue systems (El Asri et al., 2017)

A new dialogue dataset that has annotations of multiple plans (frames) and dialogue acts that indicate modifications to them.