Learning 1 - Modeling Long Sequences (11/11/2021)
- Extracting Features from Long Sequences
- Reference: RNN Language Models (Mikolov et al 2010)
- Reference: Larger Context RNNLMs (Mikolov and Zweig 2012)
- Reference: Self Attention over Previous Sentence (Voita et al. 2018)
- Reference: Self Attention over Previous Vectors (Dai et al. 2019)
- Reference: Compressive Transformer (Rae et al. 2019)
- Reference: Sparse Transformers (Child et al. 2019)
- Reference: Adaptive Span Transformer (Sukhbaatar et al. 2019)
- Reference: Adaptively Sparse Transformers (Correia et al. 2019)
- Reference: Reformer (Kitaev et al. 2020)
- Reference: Linformer (Wang et al. 2020)
- Reference: Nystromformer (Xiong et al. 2021)
- Reference: Evaluation: Sentence Scrambling (Barzilay and Lapata 2008)
- Reference: Evaluation: Final Sentence Prediction (Mostafazadeh et al. 2016)
- Reference: Evaluation: Final Word Prediction (Paperno et al. 2016)
- Reference: Long Range Arena (Tay et al. 2020)
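Several of the efficient-attention papers above (e.g. Sparse Transformers, Adaptive Span Transformer) share one core idea: restrict each position to a limited set of attendable positions so cost grows with the window size rather than quadratically in sequence length. A minimal NumPy sketch of that idea, using a simple banded causal window (an illustration, not any single paper's exact pattern; names like `local_attention` and the `window` parameter are my own):

```python
import numpy as np

def local_attention(q, k, v, window=2):
    """Single-head self-attention restricted to a local causal window.

    Toy illustration of the sparse-attention idea: position i attends
    only to positions j in [i - window, i], so with a blocked kernel
    the work is O(n * window) instead of O(n^2).
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)              # (n, n) raw attention scores
    # Banded causal mask: position i sees j in [i - window, i].
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    mask = (j <= i) & (j >= i - window)
    scores = np.where(mask, scores, -np.inf)   # block everything else
    # Numerically stable softmax over the allowed positions only.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))
out = local_attention(x, x, x, window=2)       # out.shape == (6, 4)
```

Because position 0 can only attend to itself, its output equals its own value vector; the dense-mask version here is only for clarity, since real implementations never materialize the full n×n score matrix.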
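The evaluation references above test whether a model actually uses long-range context; final-word prediction (Paperno et al. 2016) does this by asking the model to produce the last word of a passage chosen so that the local sentence alone is insufficient. A hedged sketch of that scoring setup, where `predict_fn` is a hypothetical stand-in for a trained language model:

```python
def final_word_accuracy(examples, predict_fn):
    """Toy scorer for the final-word-prediction setup: the model sees
    all but the last word of each passage and must produce exactly
    that word; the metric is plain accuracy."""
    correct = 0
    for passage in examples:
        words = passage.split()
        context, target = words[:-1], words[-1]
        if predict_fn(context) == target:
            correct += 1
    return correct / len(examples)

# Trivial baseline: always repeat the most recent context word
# (expected to fail, which is the point of the benchmark design).
examples = ["the cat sat on the mat", "dogs bark and cats meow"]
acc = final_word_accuracy(examples, lambda ctx: ctx[-1])
```

Sentence scrambling (Barzilay and Lapata 2008) and final-sentence prediction (Mostafazadeh et al. 2016) follow the same pattern at coarser granularity: hold out structure that only document-level context can recover, then measure how often the model recovers it.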
Slides: Document-level Processing