PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution (Chen et al., 2018)

The OntoNotes dataset, which is the focus of almost all coreference resolution research, had several compromises in its development (as is the case for any dataset). Some of these are discussed in…

Evaluating the Utility of Hand-crafted Features in Sequence Labelling (Wu et al., 2018)

A common argument in favour of neural networks is that they do not require ‘feature engineering’, manually defining functions that produce useful representations of the input data (e.g. a function…

Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples (Joshi et al., 2018)

Virtually all systems trained using data have trouble when applied to datasets that differ even slightly from their training data; even switching from Wall Street…

The Fine Line between Linguistic Generalization and Failure in Seq2Seq-Attention Models (Weber et al., 2018)

We know that training a neural network involves optimising over a non-convex space, but using standard evaluation methods we see that our models…

An Analysis of Neural Language Modeling at Multiple Scales (Merity et al., 2018)

Assigning a probability distribution over the next word or character in a sequence (language modeling) is a useful component of many systems…

Provenance for Natural Language Queries (Deutch et al., 2017)

Being able to query a database in natural language could help make data accessible…

Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning (Tsvetkov et al., 2016)

Reordering training sentences for word vectors may impact their usefulness for downstream tasks.

Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations (Wieting et al., 2017)

With enough training data, the best vector representation of a sentence is the concatenation of an average over word vectors and an average over character trigram vectors.
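
As a rough illustration of that representation, here is a minimal sketch of averaging word vectors, averaging character trigram vectors, and concatenating the two. It is not the authors' code: `toy_vector` is a hypothetical stand-in for learned embedding tables, the 4-dimensional size is arbitrary, and padding the sentence with one space on each side before extracting trigrams is an assumption.

```python
import random

DIM = 4  # toy dimensionality (assumption); real models use far more dimensions


def toy_vector(key, dim=DIM):
    """Hypothetical stand-in for a learned embedding lookup.

    Returns a deterministic pseudo-random vector for each distinct key.
    """
    rng = random.Random(key)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def average(vectors, dim=DIM):
    """Element-wise mean of a list of vectors; all zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def char_trigrams(sentence):
    """Character trigrams of the sentence, padded with one space on each side."""
    padded = f" {sentence} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def embed(sentence):
    """Concatenate the word-vector average and the trigram-vector average."""
    text = sentence.lower()
    word_avg = average([toy_vector("w:" + w) for w in text.split()])
    trigram_avg = average([toy_vector("t:" + t) for t in char_trigrams(text)])
    return word_avg + trigram_avg  # list concatenation, length 2 * DIM


vector = embed("The cat sat on the mat")
print(len(vector))  # 8: four word-average dimensions plus four trigram-average dimensions
```

In a real system both lookup tables would be trained, and the two halves need not have the same dimensionality.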

Evorus: A Crowd-powered Conversational Assistant Built to Automate Itself Over Time (Huang et al., 2018)

For a more flexible dialogue system, use the crowd to propose and vote on responses, then introduce agents and a model for voting, gradually learning to replace the crowd.
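
The voting step could be sketched roughly as below. This is only an illustration under stated assumptions, not the system's actual scoring: the function name `pick_response`, the fixed `model_weight` given to automated votes, and the acceptance `threshold` are all hypothetical.

```python
from collections import Counter


def pick_response(candidates, crowd_votes, model_votes=(),
                  model_weight=0.5, threshold=2.0):
    """Return the top-scoring candidate if its score reaches the threshold.

    Each crowd vote counts as 1.0 and each automated vote as model_weight
    (both weights are assumptions). Votes for unknown candidates are ignored.
    Returns None when no candidate has enough support yet.
    """
    allowed = set(candidates)
    scores = Counter()
    for choice in crowd_votes:
        if choice in allowed:
            scores[choice] += 1.0
    for choice in model_votes:
        if choice in allowed:
            scores[choice] += model_weight
    if not scores:
        return None
    best, score = max(scores.items(), key=lambda item: item[1])
    return best if score >= threshold else None


candidates = ["Try restarting it.", "Which version are you using?"]
reply = pick_response(
    candidates,
    crowd_votes=["Which version are you using?"],
    model_votes=["Which version are you using?", "Which version are you using?"],
)
print(reply)
```

Raising `model_weight` over time is one way to picture the automated voter gradually taking over from the crowd.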

A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings (Yang et al., 2017)

To leverage out-of-domain data, learn multiple sets of word vectors but with a loss term that encourages them to be similar.
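
A minimal sketch of such a loss term, assuming a squared Euclidean distance between the two vectors of each word that appears in both vocabularies, scaled by a single `strength` hyperparameter. The function names and the uniform weighting are assumptions for illustration; the paper's exact regulariser may differ.

```python
def similarity_penalty(source_vecs, target_vecs, strength=1.0):
    """Penalise disagreement between two embedding sets on their shared words.

    Sums the squared Euclidean distance between the source-domain and
    target-domain vector of every word present in both dictionaries.
    """
    shared = source_vecs.keys() & target_vecs.keys()
    total = 0.0
    for word in shared:
        total += sum((s - t) ** 2
                     for s, t in zip(source_vecs[word], target_vecs[word]))
    return strength * total


def combined_loss(source_loss, target_loss, source_vecs, target_vecs,
                  strength=1.0):
    """Per-domain losses (computed elsewhere) plus the similarity penalty."""
    return (source_loss + target_loss
            + similarity_penalty(source_vecs, target_vecs, strength))


# Toy vectors: only "cell" appears in both domains, so only it is penalised.
source = {"cell": [1.0, 0.0], "bank": [0.0, 1.0]}
target = {"cell": [0.0, 0.0], "phone": [1.0, 1.0]}
print(combined_loss(2.0, 3.0, source, target, strength=0.5))
```

Words that occur in only one domain are left unconstrained here, so each domain can still learn its own vectors for them.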