Sequence Labeling (1/20/2022)
Lecture (by Graham Neubig):
- Example Sequence Classification/Labeling Tasks
- Overall Framework of Sequence Classification/Labeling
- Sequence Featurization Models (BiRNNs, Self-Attention, CNNs)
Language in 10: None yet?
Slides: Sequence Labeling Slides
Discussion: There will be a code walkthrough of Assignment 1
References:
- Software: PyTorch
- Software: langid.py
- Software: Udify
- Software: Stanza
- Data: LTI Language Identification Corpus
- Data: MLDoc
- Data: PAWS-X
- Data: XNLI
- Data: Cross-lingual Sentence Classification
- Data: Universal Dependencies/POS Tags
- Data: CoNLL NER Tasks
- Data: XTREME
- Data: XGLUE
- Reference: Automatic Language Identification in Texts: A Survey (Jauhiainen et al. 2018)
- Reference: Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus (Caswell et al. 2020)
- Reference: WikiAnn (Pan et al. 2017)