Representation 2 - Multi-task, Multi-domain, and Multi-lingual Learning (9/22/2022)
- Multi-task Learning
- Domain Adaptation and Robustness
- Multi-lingual Learning
- Recommended Reading: Understanding Dataset Shift
- Reference: Multi-task Learning (Caruana 1997)
- Reference: Multilingual NMT (Neubig and Hu 2018)
- Reference: Multilingual BERT (Wu and Dredze 2019)
- Reference: Domain Tag (Chu et al. 2017)
- Reference: Language Tag (Johnson et al. 2017), illustrated together with the domain tag in the tagging sketch below
- Reference: Cross-lingual Model Capacity (Conneau et al. 2020)
- Reference: Separate Encoders and Decoders (Firat et al. 2016)
- Reference: Adapters (Houlsby et al. 2019), illustrated in the adapter sketch below
- Reference: Regularization for Fine-tuning (Barone et al. 2017)
- Reference: Soft Parameter Tying (Duong et al. 2015), illustrated in the parameter-tying sketch below
- Reference: Selective Parameter Adaptation (Zoph et al. 2016)
- Reference: Parameter Sharing in Transformers (Sachan et al. 2018)
- Reference: Domain Adversarial Training (Ganin et al. 2015), illustrated in the gradient-reversal sketch below
- Reference: Temperature-based Weighting (Arivazhagan et al. 2019), illustrated in the sampling sketch below
- Reference: Multi-task Weighting (Kendall et al. 2018), illustrated in the uncertainty-weighting sketch below
- Reference: Optimized Task Weighting (Dery et al. 2021)
- Reference: Choosing Transfer Languages (Lin et al. 2019)
- Reference: Transferability Across Tasks (Vu et al. 2020)
- Reference: Distributionally Robust Language Modeling (Oren et al. 2019)
- Reference: Distributionally Robust Machine Translation (Zhou et al. 2021)
- Reference: Dealing with Script Differences (Rijhwani et al. 2019)
- Reference: Annotation Projection (Yarowsky et al. 2001)
- Reference: Multilingual Language Modeling (Tsvetkov et al. 2016)
- Reference: Multilingual Structured Prediction (Meng et al. 2019)
Slides: Multitask Slides
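
The domain-tag (Chu et al. 2017) and language-tag (Johnson et al. 2017) techniques steer a single shared model by prepending a control token to each input. A minimal tagging sketch, using Johnson et al.'s `<2xx>` target-language token convention; the domain tag strings are invented for illustration:

```python
def add_tag(source_sentence: str, tag: str) -> str:
    """Prepend a control token; the model learns to condition on it."""
    return f"<2{tag}> {source_sentence}"

# Route one multilingual model to different target languages ...
print(add_tag("How are you?", "es"))   # <2es> How are you?
print(add_tag("How are you?", "ja"))   # <2ja> How are you?
# ... or, analogously, to different domains (Chu et al. 2017).
print(add_tag("The patient presented with fever.", "medical"))
```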
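
Adapters (Houlsby et al. 2019) insert small bottleneck layers into an otherwise frozen pre-trained network, so each new task or language adds only a few trainable parameters. A minimal PyTorch adapter sketch; the bottleneck width and the placement after a sub-layer are illustrative choices, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project,
    with a residual connection around the bottleneck."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

# Applied to the output of a (frozen) transformer sub-layer:
hidden = torch.randn(8, 128, 768)       # (batch, seq_len, hidden)
out = Adapter(hidden_dim=768)(hidden)   # same shape, small trainable delta
```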
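
Regularization for fine-tuning (Barone et al. 2017) and soft parameter tying (Duong et al. 2015) both replace hard parameter sharing with a penalty on the distance between corresponding parameters of two networks. A parameter-tying sketch of the L2 version, assuming the two models have identical architectures; the penalty strength is an illustrative value:

```python
import torch.nn as nn

def soft_tying_penalty(model_a: nn.Module, model_b: nn.Module,
                       strength: float = 1e-3):
    """Sum of squared differences between corresponding parameters.
    With model_b frozen at its pre-trained values, the same term
    regularizes fine-tuning toward the original model."""
    return strength * sum(
        (pa - pb).pow(2).sum()
        for pa, pb in zip(model_a.parameters(), model_b.parameters()))

# Added to the task loss during training:
# loss = task_loss(model_a, batch) + soft_tying_penalty(model_a, model_b)
```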
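
Domain adversarial training (Ganin et al. 2015) trains a domain classifier on top of the feature extractor through a gradient reversal layer: identity on the forward pass, negated (and optionally scaled) gradient on the backward pass, pushing the features toward domain invariance. A minimal PyTorch gradient-reversal sketch:

```python
import torch

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; flips and scales the gradient
    on the backward pass."""

    @staticmethod
    def forward(ctx, x, lambda_):
        ctx.lambda_ = lambda_
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambda_ * grad_output, None

features = torch.randn(16, 256, requires_grad=True)
flipped = GradientReversal.apply(features, 1.0)
# Training a domain classifier on `flipped` now pushes `features`
# away from encoding domain identity.
```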
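
Temperature-based weighting (Arivazhagan et al. 2019) samples language i with probability proportional to (n_i / N)^(1/T): T = 1 recovers proportional-to-data sampling, and large T approaches uniform, upsampling low-resource languages. A sampling sketch with invented corpus sizes:

```python
def temperature_weights(sizes: dict, T: float = 5.0) -> dict:
    """Sampling probability per corpus: (n_i / N) ** (1 / T),
    renormalized to sum to one."""
    total = sum(sizes.values())
    unnorm = {k: (n / total) ** (1.0 / T) for k, n in sizes.items()}
    z = sum(unnorm.values())
    return {k: w / z for k, w in unnorm.items()}

# High-resource French vs. low-resource Gujarati (sizes are made up):
print(temperature_weights({"fr": 40_000_000, "gu": 160_000}, T=1.0))
print(temperature_weights({"fr": 40_000_000, "gu": 160_000}, T=5.0))
```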
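
Kendall et al. (2018) weight task losses by learned homoscedastic uncertainty; in a commonly used simplified form the combined objective is sum_i exp(-s_i) * L_i + s_i, where s_i is a learned per-task log-variance. An uncertainty-weighting sketch in PyTorch:

```python
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    """Learned per-task log-variances s_i: high-uncertainty tasks are
    down-weighted, and the +s_i term blocks the trivial solution of
    inflating every variance to zero out the loss."""

    def __init__(self, num_tasks: int):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, task_losses):
        losses = torch.stack(task_losses)
        return (torch.exp(-self.log_vars) * losses + self.log_vars).sum()

# Two task losses on very different scales are balanced automatically:
weighting = UncertaintyWeighting(num_tasks=2)
total = weighting([torch.tensor(2.3), torch.tensor(0.04)])
total.backward()   # gradients also update the log-variances
```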