A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings (Yang et al., 2017)

To leverage out-of-domain data, learn multiple sets of word vectors but with a loss term that encourages them to be similar.
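As a rough illustration of such a loss term, the sketch below penalises the squared distance between the two domains' vectors for every word they share. The function name, the `strength` parameter, and the choice of squared Euclidean distance are assumptions made for illustration, not the paper's exact formulation; in training, this penalty would be added to each domain's usual embedding loss.

```python
def similarity_penalty(source_vecs, target_vecs, strength=1.0):
    """Penalty that grows as shared words' vectors drift apart across domains.

    Hypothetical helper: both arguments map word -> list of floats.
    """
    shared_words = source_vecs.keys() & target_vecs.keys()
    total = 0.0
    for word in shared_words:
        # Squared Euclidean distance between the two vectors for this word.
        total += sum(
            (a - b) ** 2 for a, b in zip(source_vecs[word], target_vecs[word])
        )
    return strength * total


# Toy vectors: only "cell" appears in both domains, so only it is penalised.
source = {"cell": [1.0, 0.0], "phone": [0.5, 0.5]}
target = {"cell": [0.0, 1.0], "membrane": [0.2, 0.9]}
penalty = similarity_penalty(source, target, strength=0.5)
print(penalty)
```

Words that occur in only one domain contribute nothing, so each domain remains free to place its own vocabulary wherever its data suggests.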

The strange geometry of skip-gram with negative sampling (Mimno et al., 2017)

Surprisingly, word2vec (skip-gram with negative sampling) produces vectors that point in a consistent direction, a pattern not seen in GloVe (though it doesn’t seem to cause problems for downstream tasks).
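One simple way to look for this pattern in a set of vectors is to measure what share of them have a positive dot product with the mean vector. The sketch below uses made-up two-dimensional vectors and hypothetical function names; it is a diagnostic in the spirit of the observation, not necessarily the analysis used in the paper.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def share_aligned_with_mean(vectors):
    """Fraction of vectors whose dot product with the mean vector is positive."""
    centre = mean_vector(vectors)
    aligned = sum(
        1 for v in vectors if sum(x * c for x, c in zip(v, centre)) > 0
    )
    return aligned / len(vectors)


# Toy vectors that all lean towards the positive first axis.
toy = [[1.0, 0.2], [0.8, -0.1], [0.9, 0.4], [1.1, -0.3]]
share = share_aligned_with_mean(toy)
print(share)
```

A share close to 1 means the vectors occupy a narrow cone around their mean; vectors spread evenly around the origin would give a much lower value.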

Sequence Effects in Crowdsourced Annotations (Mathur et al., 2017)

Annotator sequence bias, where the label for one item affects the label for the next, occurs across a range of datasets. Avoid it by separately randomising the order of items for each annotator.
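A minimal sketch of per-annotator randomisation, assuming items and annotators are identified by simple strings. The function name and the way a seed is derived for each annotator are illustrative choices, made so that the assignment is reproducible.

```python
import random


def per_annotator_orders(items, annotators, seed=0):
    """Give every annotator an independently shuffled copy of the item list."""
    orders = {}
    for annotator in annotators:
        # A separate generator per annotator keeps the orders independent
        # of one another while remaining reproducible for a given seed.
        rng = random.Random(f"{seed}-{annotator}")
        order = list(items)
        rng.shuffle(order)
        orders[annotator] = order
    return orders


items = [f"item{i}" for i in range(10)]
orders = per_annotator_orders(items, ["ann_a", "ann_b", "ann_c"], seed=42)
for annotator, order in orders.items():
    print(annotator, order)
```

Because each annotator sees a different sequence, any carry-over effect from one item to the next is spread across different item pairs rather than hitting the same pairs every time.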

High-risk learning: acquiring new word vectors from tiny data (Herbelot et al., 2017)

The simplest way to learn word vectors for rare words is to average the vectors of the words in their contexts. Tweaking word2vec to make greater use of the context may do slightly better, but it’s unclear.
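The averaging baseline can be sketched as follows, assuming a dictionary of already-trained vectors and a handful of tokenised contexts for the rare word. All names and numbers here are made up for illustration.

```python
def average_context_vector(contexts, known_vectors):
    """Average the known vectors of all context words; None if none are known."""
    total = None
    count = 0
    for context in contexts:
        for word in context:
            vec = known_vectors.get(word)
            if vec is None:
                continue  # skip context words that have no trained vector
            if total is None:
                total = [0.0] * len(vec)
            total = [t + x for t, x in zip(total, vec)]
            count += 1
    if count == 0:
        return None
    return [t / count for t in total]


# Toy background vectors and two contexts in which the rare word was seen.
known = {"drink": [1.0, 0.0], "hot": [0.0, 1.0], "cup": [2.0, 2.0]}
contexts = [["drink", "hot"], ["cup", "of"]]
vector = average_context_vector(contexts, known)
print(vector)
```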

Revisiting Selectional Preferences for Coreference Resolution (Heinzerling et al., 2017)

It seems intuitive that a coreference system could benefit from information about what nouns a verb selects for, but explicitly adding a representation of these selectional preferences to a neural system does not lead to gains, implying that the system is already learning them or that they are not useful.

A causal framework for explaining the predictions of black-box sequence-to-sequence models (Alvarez-Melis et al., 2017)

To explain structured outputs in terms of which inputs have the most impact, treat the problem as identifying components in a bipartite graph linking inputs to outputs, where edge weights are determined by perturbing the input and observing the impact on the outputs.
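To make the perturb-and-observe step concrete, the sketch below builds input-to-output edge weights for a toy black-box model. It uses the crudest possible perturbation (deleting one input token at a time) and a binary weight (did the output token disappear?), both of which are simplifying assumptions; it does not attempt to reproduce the paper's perturbation or scoring method, and the component-finding step is omitted.

```python
def dependency_weights(model, tokens):
    """Map (input index, output index) -> 1.0 when deleting that input
    token removes that output token from the model's prediction."""
    baseline = model(tokens)
    weights = {}
    for i in range(len(tokens)):
        perturbed = tokens[:i] + tokens[i + 1:]
        remaining = set(model(perturbed))
        for j, out_token in enumerate(baseline):
            if out_token not in remaining:
                weights[(i, j)] = 1.0
    return weights


# Hypothetical black box: a word-by-word lookup standing in for a real model.
toy_lexicon = {"the": "le", "cat": "chat", "sleeps": "dort"}


def toy_model(tokens):
    return [toy_lexicon.get(token, token) for token in tokens]


weights = dependency_weights(toy_model, ["the", "cat", "sleeps"])
print(weights)
```

For this toy model each input only influences its own translation, so the graph falls apart into three one-to-one components; a real sequence-to-sequence model would give denser and graded weights.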

Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog (Kottur et al., 2017)

Constraining the language of a dialogue agent can improve performance by encouraging the use of more compositional language.

Mr. Bennet, his coachman, and the Archbishop walk into a bar but only one of them gets recognized: On The Difficulty of Detecting Characters in Literary Texts (Vala et al., 2015)

With some tweaks (domain-specific heuristics), coreference systems can be used to identify the set of characters in a novel, which in turn can be used to do large scale tests of hypotheses from literary analysis.

A Factored Neural Network Model for Characterizing Online Discussions in Vector Space (Cheng et al., EMNLP 2017)

A proposal for how to improve vector representations of sentences by using attention over (1) fixed vectors, and (2) a context sentence.

Getting the Most out of AMR Parsing (Wang and Xue, EMNLP 2017)

Two ideas for improving AMR parsing: (1) take graph distance into consideration when generating alignments, (2) during parsing, for concept generation, generate individual concepts in some cases and frequently occurring subgraphs in other cases.