Efficiency Tricks for Neural Nets (1/28/2020)
Content:
- Softmax Approximations: Negative Sampling, Hierarchical Softmax
- Parallel Training
- Tips for Training on GPUs
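As a concrete illustration of the first topic, here is a minimal NumPy sketch of the negative sampling objective (in the spirit of Goldberg and Levy 2014): rather than normalizing over the full vocabulary, the true context word is scored against k sampled "negative" words. All sizes, names, and the toy data below are illustrative assumptions, not code from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, k = 1000, 50, 5  # toy sizes (assumptions)

# Input (center) and output (context) embedding tables.
V_in = rng.normal(scale=0.1, size=(vocab_size, dim))
V_out = rng.normal(scale=0.1, size=(vocab_size, dim))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(center, context, noise_dist):
    """Binary logistic loss over 1 positive and k sampled negatives."""
    negatives = rng.choice(vocab_size, size=k, p=noise_dist)
    v_c = V_in[center]
    pos = -np.log(sigmoid(V_out[context] @ v_c))
    neg = -np.log(sigmoid(-(V_out[negatives] @ v_c))).sum()
    return pos + neg  # O(k * dim) work instead of O(vocab_size * dim)

# Unigram noise distribution raised to the 3/4 power, as in word2vec.
counts = rng.integers(1, 100, size=vocab_size).astype(float)
noise = counts ** 0.75
noise /= noise.sum()

print(negative_sampling_loss(center=3, context=7, noise_dist=noise))
```

The cost per training example depends only on k and the embedding dimension, which is the point of the approximation.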
Reading Material:
- Highly Recommended Reading: Notes on Noise Contrastive Estimation and Negative Sampling (Dyer 2014)
- Reference: Importance Sampling (Bengio and Senécal 2003)
- Reference: Noise Contrastive Estimation (Mnih and Teh 2012)
- Reference: Negative Sampling (Goldberg and Levy 2014)
- Reference: Mini-batching Sampling-based Softmax Approximations (Zoph et al. 2015)
- Reference: Class-based Softmax (Goodman 2001)
- Reference: Hierarchical Softmax (Morin and Bengio 2005)
- Reference: Error Correcting Codes (Dietterich and Bakiri 1995)
- Reference: Binary Code Prediction for Language (Oda et al. 2017)
- Reference: Seq2seq w/ Continuous Outputs (Kumar and Tsvetkov 2019)
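Hierarchical softmax (Morin and Bengio 2005), listed above, can also be sketched in a few lines: place the vocabulary at the leaves of a binary tree and compute a word's probability as a product of binary decisions along its root-to-leaf path, replacing an O(|V|) softmax with O(log |V|) sigmoid evaluations. The complete-tree layout, sizes, and data below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 8, 4                 # a complete binary tree with 7 inner nodes
depth = int(np.log2(vocab_size))

# One parameter vector per inner node of the tree.
inner = rng.normal(scale=0.1, size=(vocab_size - 1, dim))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_prob(word, h):
    """log P(word | h) via binary decisions in an implicit complete tree."""
    node, lp = 0, 0.0
    for bit in format(word, f"0{depth}b"):
        s = inner[node] @ h
        # Go left with prob sigmoid(s), right with prob 1 - sigmoid(s) = sigmoid(-s).
        lp += np.log(sigmoid(s) if bit == "0" else sigmoid(-s))
        node = 2 * node + 1 + int(bit)  # heap-style child index
    return lp

h = rng.normal(size=dim)
probs = np.exp([log_prob(w, h) for w in range(vocab_size)])
print(probs.sum())  # sums to 1: the tree defines a valid distribution
```

Because sigmoid(s) + sigmoid(-s) = 1 at every inner node, the leaf probabilities sum to one without ever normalizing over the whole vocabulary.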
Slides (from 2019): Efficiency Slides
Sample Code: Efficiency Code Examples