Reordering the training sentences used to learn word vectors can change how useful the resulting vectors are for downstream tasks.
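A sketch of the experiment this implies, using gensim (the toy corpus is a stand-in, and `vector_size` assumes the gensim ≥ 4 API): training the same model on the same sentences in two different orders yields measurably different vectors.

```python
import random
from gensim.models import Word2Vec

# Toy corpus; a real experiment would use a large corpus and compare
# downstream-task scores, not just the raw vectors.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["dogs", "chase", "cats"],
    ["word", "vectors", "capture", "similarity"],
] * 200

def train(sentences, seed=0):
    # workers=1 plus a fixed seed keeps each run reproducible, so any
    # difference between runs comes from sentence order alone.
    return Word2Vec(sentences, vector_size=50, sg=1, negative=5,
                    min_count=1, workers=1, seed=seed)

original = train(corpus)
shuffled_corpus = corpus[:]
random.Random(0).shuffle(shuffled_corpus)
shuffled = train(shuffled_corpus)

# Same data, different order: the learned similarities differ.
print(original.wv.similarity("cat", "dogs"))
print(shuffled.wv.similarity("cat", "dogs"))
```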
To leverage out-of-domain data, learn a separate set of word vectors for each domain, with a loss term that encourages the sets to be similar.
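A minimal PyTorch sketch of one way to set this up, assuming a squared-L2 agreement penalty with weight `lam` (the penalty form and all names are assumptions, not the paper's exact formulation):

```python
import torch
import torch.nn as nn

vocab_size, dim, lam = 10_000, 100, 0.1  # lam: agreement strength (assumed)

emb_in = nn.Embedding(vocab_size, dim)   # vectors trained on in-domain data
emb_out = nn.Embedding(vocab_size, dim)  # vectors trained on out-of-domain data

def agreement_penalty():
    # Mean squared L2 distance between the two embedding tables,
    # pulling the two sets of vectors toward each other.
    return (emb_in.weight - emb_out.weight).pow(2).sum(dim=1).mean()

def total_loss(loss_in, loss_out):
    # Each domain keeps its own training loss (e.g. skip-gram);
    # the penalty ties the two vector sets together.
    return loss_in + loss_out + lam * agreement_penalty()

print(agreement_penalty().item())  # nonzero until the two tables agree
```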
Surprisingly, word2vec (skip-gram with negative sampling) produces vectors that all point in a consistent shared direction, a pattern not seen in GloVe (though one that doesn’t seem to cause problems for downstream tasks).
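One way to check for this shared direction, sketched in numpy: measure the cosine of every vector against the mean vector (the matrix `emb` below is a random stand-in; real embeddings would be loaded from a trained model).

```python
import numpy as np

def shared_direction_stats(vectors):
    """Cosine similarity of each word vector with the mean vector.

    For word2vec (skip-gram with negative sampling) these cosines tend
    to be consistently positive; for GloVe they center near zero.
    """
    mean = vectors.mean(axis=0)
    mean /= np.linalg.norm(mean)
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    cosines = normed @ mean
    return cosines.mean(), cosines.std()

# Random baseline: mean cosine ~ 0. A (vocab_size, dim) matrix of
# trained word2vec vectors would show a clearly positive mean.
emb = np.random.randn(1000, 100)
print(shared_direction_stats(emb))
```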
By introducing a new loss that encourages sparsity, an auto-encoder can map existing word vectors to new ones that are sparser and more interpretable, though the impact on downstream tasks is mixed.
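A minimal PyTorch sketch of the idea, assuming an overcomplete ReLU auto-encoder with an L1 penalty as the sparsity-encouraging loss (the dimensions and `l1_weight` are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Map dense word vectors to wider, sparser codes.

    The L1 penalty on the hidden code is the sparsity-encouraging loss;
    ReLU keeps codes nonnegative, which helps interpretability.
    """
    def __init__(self, dim=300, hidden=1000):
        super().__init__()
        self.encoder = nn.Linear(dim, hidden)
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x):
        code = torch.relu(self.encoder(x))
        return self.decoder(code), code

def loss_fn(x, recon, code, l1_weight=1e-3):
    # Reconstruction keeps the information; L1 pushes codes toward zero.
    return nn.functional.mse_loss(recon, x) + l1_weight * code.abs().mean()

# Usage sketch: `vectors` would be a batch of pretrained embeddings
# (e.g. GloVe); random tensors stand in here.
model = SparseAutoencoder()
vectors = torch.randn(64, 300)
recon, code = model(vectors)
loss = loss_fn(vectors, recon, code)
loss.backward()
```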
Vectors for words and entities can be learned by trying to model the text written about the entities. This leads to word vectors that score well on similarity tasks and entity vectors that produce excellent results on entity linking and question answering.
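A sketch of how such an objective might look, assuming a skip-gram-style negative-sampling loss in PyTorch where an entity's vector is trained to score words from its description above randomly sampled words (all names and sizes are illustrative, not the paper's model):

```python
import torch
import torch.nn as nn

n_words, n_entities, dim = 50_000, 10_000, 200

word_emb = nn.Embedding(n_words, dim)
entity_emb = nn.Embedding(n_entities, dim)

def description_loss(entity_ids, word_ids, neg_word_ids):
    """An entity's vector should give higher dot-product scores to
    words from text written about it than to sampled negatives."""
    e = entity_emb(entity_ids)                               # (batch, dim)
    pos = (e * word_emb(word_ids)).sum(-1)                   # (batch,)
    neg = (e.unsqueeze(1) * word_emb(neg_word_ids)).sum(-1)  # (batch, k)
    return -(nn.functional.logsigmoid(pos).mean()
             + nn.functional.logsigmoid(-neg).mean())

# Usage with stand-in ids: entity 3 described by word 17, 5 negatives.
loss = description_loss(torch.tensor([3]), torch.tensor([17]),
                        torch.randint(0, n_words, (1, 5)))
loss.backward()
```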
By switching from representing words as points in a vector space to multiple Gaussian regions, we get a better model, one that scores higher on multiple word similarity metrics than a range of other techniques.
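A numpy sketch of the similarity such a model typically uses, assuming a single diagonal Gaussian per word and the expected likelihood kernel (the multi-region case sums this kernel over mixture components; the parameters below are stand-ins):

```python
import numpy as np

def log_energy(mu_i, var_i, mu_j, var_j):
    """Log expected likelihood kernel between two diagonal Gaussians:
    log N(mu_i; mu_j, Sigma_i + Sigma_j). Higher means more similar."""
    var = var_i + var_j
    diff = mu_i - mu_j
    d = mu_i.shape[0]
    return -0.5 * (d * np.log(2 * np.pi)
                   + np.log(var).sum()
                   + (diff ** 2 / var).sum())

# Stand-in parameters for two words: a mean and a diagonal variance each.
# A word with broader meaning would learn a larger variance.
rng = np.random.default_rng(0)
mu_a, var_a = rng.normal(size=50), np.full(50, 0.5)
mu_b, var_b = rng.normal(size=50), np.full(50, 0.8)
print(log_energy(mu_a, var_a, mu_b, var_b))
```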
A proposal for improving vector representations of sentences by using attention over (1) a set of fixed vectors and (2) a context sentence.
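A minimal PyTorch sketch of both variants, assuming dot-product attention and using the context sentence's mean vector as its query (all of this is illustrative rather than the paper's exact model):

```python
import torch
import torch.nn as nn

def attend(query, word_vecs):
    """Weight a sentence's word vectors by dot-product attention
    against a query, then return the weighted average."""
    scores = word_vecs @ query              # (len,)
    weights = torch.softmax(scores, dim=0)  # attention distribution
    return weights @ word_vecs              # (dim,)

dim, sent_len, ctx_len = 100, 8, 12
sentence = torch.randn(sent_len, dim)  # stand-in word vectors
context = torch.randn(ctx_len, dim)    # a neighbouring sentence

# (1) attention over a fixed, learned query vector
fixed_query = nn.Parameter(torch.randn(dim))
rep_fixed = attend(fixed_query, sentence)

# (2) attention conditioned on the context sentence, via its mean vector
rep_context = attend(context.mean(dim=0), sentence)
```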