Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples (Vidur Joshi et al., 2018)
Virtually all systems trained on data struggle when applied to datasets that differ even slightly from their training data. Even switching from Wall Street Journal text to New York Times text can hurt parsing performance. Extensive work has explored how to adapt to new domains (including some of my own), but these approaches have generally closed only a fraction of the performance gap.
This paper shows two interesting new approaches to this issue:
- Use ELMo, a type of word representation trained on massive amounts of text.
- Train a span-based parser with partial annotations.
The first is straightforward, and further demonstrates the effectiveness of ELMo. To give a sense of how much this helps, the Charniak parser goes from 92 on the WSJ to 85 on the Brown corpus (a 7-point drop), while this model goes from 94 to 90 (a 4-point drop). The second idea takes advantage of a recent parsing model with a simple approach:
- Independently assign a score to every span of a sentence, indicating whether it is part of the parse.
- Find the maximum scoring set of spans using a dynamic program.
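The two steps above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes unlabeled spans, binary trees, and a hypothetical `span_score(i, j)` function standing in for the neural scorer (the names `best_parse`, `span_score`, and `toy_scores` are invented for this example).

```python
from functools import lru_cache

def best_parse(n, span_score):
    """Return (total score, spans) for the highest-scoring binary bracketing.

    n: number of words in the sentence.
    span_score(i, j): hypothetical scorer for the span covering words i..j-1.
    """
    @lru_cache(maxsize=None)
    def best(i, j):
        score = span_score(i, j)
        if j - i == 1:
            # A single word has no internal structure to choose.
            return score, ((i, j),)
        # Try every split point and keep the best pair of sub-parses.
        candidates = []
        for k in range(i + 1, j):
            left_score, left_spans = best(i, k)
            right_score, right_spans = best(k, j)
            candidates.append((left_score + right_score, left_spans + right_spans))
        sub_score, sub_spans = max(candidates, key=lambda c: c[0])
        return score + sub_score, ((i, j),) + sub_spans

    return best(0, n)

# Toy scores for a 3-word sentence: the scorer prefers grouping words 0-1.
toy_scores = {(0, 2): 2.0, (1, 3): 0.5}
total, spans = best_parse(3, lambda i, j: toy_scores.get((i, j), 0.0))
```

Because each span is scored on its own, the dynamic program only has to pick the best split point for each span, which is what makes the model easy to train from span-level supervision.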
The structure of the scoring step allows for a convenient form of partial annotations. Simply label the tricky spans in a sentence (e.g. to indicate where a prepositional phrase attaches / does not attach). During training on partially annotated sentences, only the labeled spans are used to update the model. This gives dramatic gains across multiple datasets.
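To make the training idea concrete, here is a rough sketch of a loss that only looks at annotated spans. It is an assumption-laden simplification rather than the paper's actual objective: it treats each span as a yes/no decision with a model probability, and the names `partial_annotation_loss`, `span_probs`, and `annotations` are hypothetical.

```python
import math

def partial_annotation_loss(span_probs, annotations):
    """Average negative log-likelihood over the annotated spans only.

    span_probs: {(i, j): model probability that the span is in the parse}
    annotations: {(i, j): True if marked as a constituent, False if marked as not one}
    Spans without an annotation contribute nothing to the loss.
    """
    total = 0.0
    for span, is_constituent in annotations.items():
        p = span_probs[span]
        total -= math.log(p if is_constituent else 1.0 - p)
    return total / len(annotations)

# The annotator marked one span as correct and one as incorrect;
# the span (0, 3) is left unannotated and is ignored.
span_probs = {(0, 2): 0.5, (1, 3): 0.5, (0, 3): 0.9}
annotations = {(0, 2): True, (1, 3): False}
loss = partial_annotation_loss(span_probs, annotations)

# Changing the probability of an unannotated span leaves the loss unchanged.
loss_after_change = partial_annotation_loss({**span_probs, (0, 3): 0.1}, annotations)
```

The point of the design is that an annotator can mark just the spans that matter (for example, a prepositional phrase attachment) without having to produce a full tree.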
Citation
@InProceedings{Joshi:2018:ACL,
author = {Joshi, Vidur and Peters, Matthew and Hopkins, Mark},
title = {Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples},
booktitle = {ACL},
year = {2018},
url = {https://arxiv.org/abs/1805.06556},
}