Distributional Semantics and Word Vectors (2/6/2020)
Content:
- Describing a word by the company it keeps
- Counting and predicting (count-based and prediction-based sketches follow this list)
- Skip-grams and CBOW
- Evaluating/Visualizing Word Vectors (a t-SNE visualization sketch also follows this list)
- Advanced Methods for Word Vectors
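As a concrete illustration of the count-based side of the counting-vs.-predicting contrast, the sketch below builds a word-by-context co-occurrence matrix over a toy corpus and reweights it with positive PMI. The corpus, window size, and variable names are illustrative choices, not taken from the lecture.

```python
import numpy as np

# Toy corpus; in practice this would be a large tokenized collection.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
w2i = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Count co-occurrences within a symmetric window of +/-2 words.
window = 2
counts = np.zeros((V, V))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[w2i[w], w2i[sent[j]]] += 1

# Positive PMI: PPMI(w, c) = max(0, log p(w, c) / (p(w) p(c))).
total = counts.sum()
p_w = counts.sum(axis=1, keepdims=True) / total
p_c = counts.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore"):
    pmi = np.log((counts / total) / (p_w * p_c))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Each row of `ppmi` is now a distributional vector for one word.
print(ppmi[w2i["cat"]].round(2))
```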
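For the prediction-based side, here is a minimal skip-gram sketch with negative sampling in plain NumPy. The (center, context) pairs, hyperparameters, and uniform negative sampler are simplifying assumptions; word2vec itself samples negatives from a smoothed unigram distribution and adds many engineering details.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy data: (center, context) index pairs as produced by a sliding window.
pairs = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
V, dim, k, lr = 4, 8, 3, 0.05  # vocab size, embedding dim, negatives, step size

W_in = rng.normal(scale=0.1, size=(V, dim))   # center-word ("input") vectors
W_out = rng.normal(scale=0.1, size=(V, dim))  # context-word ("output") vectors

for epoch in range(200):
    for center, context in pairs:
        v = W_in[center]
        # One positive context plus k negatives (uniform sampling for brevity).
        targets = [context] + list(rng.choice(V, size=k, replace=False))
        labels = np.array([1.0] + [0.0] * k)
        U = W_out[targets]                        # (k+1, dim)
        scores = sigmoid(U @ v)                   # predicted P(label = 1)
        grad = scores - labels                    # logistic-loss gradient
        W_in[center] -= lr * grad @ U             # update center vector
        W_out[targets] -= lr * np.outer(grad, v)  # update context vectors

# After training, the rows of W_in are the learned word embeddings.
print(W_in.round(2))
```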
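And for visualization, a short sketch using scikit-learn's TSNE and matplotlib, assuming you already have an embedding matrix and a parallel word list (random vectors stand in here). The perplexity and other settings are placeholder values; see Wattenberg et al. (2016) below for why such plots should be read with care.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Stand-ins for a trained model's vocabulary and embedding matrix.
rng = np.random.default_rng(0)
words = ["cat", "dog", "mat", "log", "sat", "chased"]
vectors = rng.normal(size=(len(words), 50))

# Perplexity must be smaller than the number of points; tune it on real data.
coords = TSNE(n_components=2, perplexity=3, init="pca",
              random_state=0).fit_transform(vectors)

plt.figure(figsize=(5, 5))
plt.scatter(coords[:, 0], coords[:, 1])
for (x, y), w in zip(coords, words):
    plt.annotate(w, (x, y))
plt.title("t-SNE projection of word vectors")
plt.savefig("tsne_words.png")
```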
Reading Material
- Required Reading (for quiz): Goldberg Book Chapters 10-10.4
- Recommended Reading: Goldberg Book Chapters 10.5-11
- Reference: WordNet
- Reference: Linguistic Regularities in Continuous Space Word Representations (Mikolov et al. 2013)
- Reference: t-SNE (van der Maaten and Hinton 2008)
- Reference: Visualizing w/ PCA vs. t-SNE (Derksen 2016)
- Reference: How to use t-SNE effectively (Wattenberg et al. 2016)
- Reference: Evaluating Word Embeddings (Schnabel et al. 2015)
- Reference: Morphology-based Embeddings (Luong et al. 2013)
- Reference: Character-based Embeddings (Ling et al. 2015)
- Reference: Subword-based Embeddings (Wieting et al. 2016)
- Reference: fastText Toolkit (Bojanowski et al. 2017)
- Reference: Multi-prototype Embeddings (Reisinger and Mooney 2010)
- Reference: Non-parametric Multi-prototype Embeddings (Neelakantan et al. 2014)
- Reference: Cross-lingual Embeddings (Faruqui and Dyer 2014)
- Reference: Unsupervised Cross-lingual Embeddings w/ Common Words (Artetxe et al. 2017)
- Reference: Unsupervised Cross-lingual Embeddings w/ Distribution Matching (Zhang et al. 2017)
- Reference: Retrofitting to Lexicons (Faruqui et al. 2015)
- Reference: Sparse Word Embeddings (Murphy et al. 2012)
- Reference: De-biasing Word Embeddings (Bolukbasi et al. 2016)
Slides: Word Embedding Slides
Sample Code: Word Embedding Code Examples
Links to Word Embedding Toolkits
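As one example of a toolkit workflow (gensim 4.x API assumed), the snippet below trains a small word2vec model and queries nearest neighbors; the corpus and hyperparameters are placeholders, and the analogy query only gives sensible results with embeddings trained on much larger corpora.

```python
from gensim.models import Word2Vec  # gensim 4.x API assumed

# Placeholder corpus; any iterable of token lists works.
sentences = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]

# sg=1 selects skip-gram; sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=50, window=2,
                 min_count=1, sg=1, epochs=50)

# Nearest neighbors in embedding space (toy corpus, so results are noisy).
print(model.wv.most_similar("cat", topn=3))

# Analogy query in the style of Mikolov et al. (2013): a - b + c ~ ?
# Only meaningful with vectors trained on large corpora:
# print(model.wv.most_similar(positive=["king", "woman"], negative=["man"]))
```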