Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning (Tsvetkov et al., 2016)

Reordering training sentences for word vectors may impact their usefulness for downstream tasks.

Neural Semantic Parsing over Multiple Knowledge-bases (Herzig et al., 2017)

Training a single parser on multiple domains can improve performance, and sharing more parameters (encoder and decoder as opposed to just one) seems to help more.

A Local Detection Approach for Named Entity Recognition and Mention Detection (Xu et al., 2017)

Effective NER can be achieved without sequence prediction: a feedforward network labels every candidate span, using a fixed-size encoding of the surrounding context to capture contextual information.
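A minimal sketch of the local-detection idea: enumerate all candidate spans, turn each span plus its left/right context into a fixed-size feature vector, and let a feedforward classifier label each one independently. The forgetting-factor encoding below (FOFE-style) is one simple choice of fixed-size encoder and is illustrative, not the paper's exact architecture.

```python
import numpy as np

def enumerate_spans(n, max_len=3):
    # every candidate span (start, end), end exclusive, up to max_len tokens
    return [(i, j) for i in range(n)
            for j in range(i + 1, min(i + max_len, n) + 1)]

def fixed_encode(vecs, alpha=0.7):
    # fixed-size encoding of a variable-length sequence via a
    # forgetting-factor sum: z_t = alpha * z_{t-1} + e_t (FOFE-style)
    z = np.zeros(vecs.shape[1])
    for v in vecs:
        z = alpha * z + v
    return z

def span_features(emb, start, end):
    # concatenate fixed-size encodings of left context, the span itself,
    # and right context (reversed so the nearest token is weighted most)
    left = fixed_encode(emb[:start])
    span = fixed_encode(emb[start:end])
    right = fixed_encode(emb[end:][::-1])
    return np.concatenate([left, span, right])

# toy usage: 5 tokens with 4-dim embeddings; each span becomes one
# fixed-size vector a feedforward net would classify (entity type or NONE)
emb = np.random.RandomState(0).randn(5, 4)
spans = enumerate_spans(5, max_len=3)
feats = np.stack([span_features(emb, s, e) for s, e in spans])
```

Because every span is scored independently, no Viterbi-style decoding is needed; overlapping positive spans can be resolved by a simple post-hoc heuristic.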

Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme (Zheng et al., 2017)

By encoding the relation type and the role of each word in its tag, joint entity and relation extraction can be reduced to sequence tagging and handled by an LSTM with great success.
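To make the tagging scheme concrete, here is a hedged sketch: each tag combines a BIES position marker, a relation type, and the word's role (entity 1 or 2), so triples can be read straight off the tag sequence. The relation names ("BI" for Born-In) and the greedy nearest-match pairing are illustrative assumptions, not the paper's exact decoding procedure.

```python
def decode_triples(tokens, tags):
    """Read entities off BIES-relation-role tags, then greedily pair each
    role-1 entity with a role-2 entity of the same relation type."""
    entities = []  # (relation, role, text)
    i = 0
    while i < len(tags):
        if tags[i].startswith(("B-", "S-")):
            _, rel, role = tags[i].split("-")
            j = i + 1
            while j < len(tags) and tags[j].startswith(("I-", "E-")):
                j += 1
            entities.append((rel, role, " ".join(tokens[i:j])))
            i = j
        else:
            i += 1
    triples = []
    for rel, role, text in entities:
        if role == "1":
            for rel2, role2, text2 in entities:
                if rel2 == rel and role2 == "2":
                    triples.append((text, rel, text2))
                    break
    return triples

tokens = ["Trump", "was", "born", "in", "Queens", "New", "York"]
tags   = ["S-BI-1", "O", "O", "O", "S-BI-2", "O", "O"]
# decode_triples(tokens, tags) → [("Trump", "BI", "Queens")]
```

The appeal of the scheme is that a single sequence model produces both entities and their relations in one pass, with no separate pairing classifier.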

Abstractive Document Summarization with a Graph-Based Attentional Neural Model (Tan et al., 2017)

Neural abstractive summarisation can be dramatically improved with a beam search that favours output matching the source document, and further improved with sentence-level attention weights derived from PageRank, modified to avoid attending to the same sentence more than once.
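The PageRank component can be sketched as power iteration over a sentence-similarity graph, yielding salience scores that bias the attention distribution. This is a minimal numpy sketch of that one piece, assuming a precomputed symmetric similarity matrix; the full model combines these scores with decoder-state attention.

```python
import numpy as np

def pagerank_salience(sim, damping=0.85, iters=50):
    """Sentence salience by power iteration over a similarity graph
    (a simplified view of graph-based attention)."""
    n = sim.shape[0]
    np.fill_diagonal(sim, 0.0)                    # no self-links
    trans = sim / sim.sum(axis=1, keepdims=True)  # row-stochastic transitions
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * trans.T @ r
    return r / r.sum()

# toy similarity matrix over 3 sentences: sentence 0 is similar to both
# others, so it should receive the highest salience
sim = np.array([[0.0, 0.9, 0.8],
                [0.9, 0.0, 0.1],
                [0.8, 0.1, 0.0]])
sal = pagerank_salience(sim.copy())
```

The "avoid repeated attention" modification can then be implemented by downweighting the salience of sentences already attended to at earlier decoding steps.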

Error-repair Dependency Parsing for Ungrammatical Texts (Sakaguchi et al., 2017)

Grammatical error correction can be improved by jointly parsing the sentence being corrected.

Attention Strategies for Multi-Source Sequence-to-Sequence Learning (Libovicky et al., 2017)

To apply attention across multiple input sources, it is best to attend to each source independently and then apply a second phase of attention over the resulting per-source summary vectors.
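The two-phase (hierarchical) strategy can be sketched in a few lines: a within-source attention produces one summary vector per source, and a second attention combines those summaries. Dot-product scoring is an assumption here for brevity; the paper's models use learned scoring functions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def hierarchical_attention(query, sources):
    """Phase 1: attend within each source independently.
    Phase 2: attend over the per-source summary vectors."""
    summaries = []
    for H in sources:              # H: (len_i, d) encoder states of one source
        w = softmax(H @ query)     # within-source attention weights
        summaries.append(w @ H)    # per-source context (summary) vector
    S = np.stack(summaries)        # (n_sources, d)
    w2 = softmax(S @ query)        # second-level attention over sources
    return w2 @ S                  # final combined context vector

# toy usage: a decoder query attending over two sources of different lengths
rng = np.random.RandomState(0)
d = 8
ctx = hierarchical_attention(rng.randn(d),
                             [rng.randn(5, d), rng.randn(3, d)])
```

The design keeps each source's attention distribution properly normalised, which a single flat attention over concatenated sources would not guarantee.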

Robust Incremental Neural Semantic Graph Parsing (Buys et al., 2017)

A neural transition based parser with actions to create non-local links can perform well on Minimal Recursion Semantics parsing.

A Two-Stage Parsing Method for Text-Level Discourse Analysis (Wang et al., 2017)

Breaking discourse parsing into separate relation identification and labeling stages can boost performance by mitigating the limited training data.

A Transition-Based Directed Acyclic Graph Parser for UCCA (Hershcovich et al., 2017)

Parsing performance on the semantic structures of UCCA can be boosted by using a transition system that combines ideas from discontinuous and constituent transition systems, covering the full space of structures.