Graham Neubig

Associate Professor, Language Technologies Institute, Carnegie Mellon University
Affiliated Faculty, Machine Learning Department, Carnegie Mellon University
Chief Scientist, All Hands AI
My research is concerned with language and its role in human communication. In particular, my long-term research goal is to break down barriers in human-human or human-machine communication through the development of natural language processing (NLP) technologies. This includes technology for machine translation, which helps break down barriers in communication between people who speak different languages, and natural language understanding, which helps computers understand and respond to human language. Within this overall goal of breaking down barriers to human communication, I have focused on several aspects of language that both make it interesting as a scientific subject and hold potential for the construction of practical systems. Specific areas of interest include:
- Multilingual Language Processing
- Machine Translation
- Syntactic and Semantic Analysis
- Cross-lingual Learning
- Natural Language Interfaces to Computers
- Natural Language to Code Generation
- Question Answering and Information Extraction
- Modeling Human-Computer or Human-Human Interaction
- Machine Learning for NLP
- Explainability and Interpretable Evaluation
- Neural Network Models for NLP
- Unsupervised and Semi-supervised Learning
Academic/Career History
- 7/2020-onward Carnegie Mellon University (CMU): Associate Professor
- 9/2016-7/2020 Carnegie Mellon University (CMU): Assistant Professor
- 4/2012-8/2016 Nara Institute of Science and Technology (NAIST): Assistant Professor
- 4/2010-3/2012 Kyoto University: Doctoral course in Intelligent Information Systems
- 4/2008-3/2010 Kyoto University: Master's course in Intelligent Information Systems
- 8/2006-3/2008 Hyogo Prefectural Government: Coordinator for International Relations
- 9/2005-7/2006 Tajima Agricultural High School: Assistant Language Teacher
- 8/2001-5/2005 University of Illinois, Urbana-Champaign: B.S. Computer Science
Papers
Here is a list of a few of my current favorite papers:
- Jiaxin Ge, Zora Zhiruo Wang, Xuhui Zhou, Yi-Hao Peng, Sanjay Subramanian, Qinyue Tan, Maarten Sap, Alane Suhr, Daniel Fried, Graham Neubig, Trevor Darrell.
AutoPresent: Designing Structured Visuals From Scratch (BibTex, Code/Data)
IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR). Nashville, USA. June 2025 (To Appear).
- Lindia Tjuatja, Graham Neubig, Tal Linzen, Sophie Hao.
What Goes Into a LM Acceptability Judgment? Rethinking the Impact of Frequency and Length (BibTex, Code/Data)
Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL). Albuquerque, USA. April 2025.
- Junpeng Liu, Tianyue Ou, Yifan Song, Yuxiao Qu, Wai Lam, Chenyan Xiong, Wenhu Chen, Graham Neubig, Xiang Yue.
Harnessing Webpage UIs for Text-Rich Visual Understanding (BibTex, Code/Data)
International Conference on Learning Representations. Singapore. April 2025.
- Shota Onohara, Atsuyuki Miyai, Yuki Imajuku, Kazuki Egashira, Jeonghun Baek, Xiang Yue, Graham Neubig, Kiyoharu Aizawa.
JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation (BibTex, Code/Data)
Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL). Albuquerque, USA. April 2025.
- Ian Wu, Patrick Fernandes, Amanda Bertsch, Seungone Kim, Sina Khoshfetrat Pakazad, Graham Neubig.
Better Instruction-Following Through Minimum Bayes Risk (BibTex)
International Conference on Learning Representations (ICLR). Singapore. April 2025.
All of my publications can be found on my publications page, and my most highly cited papers can be found on Google Scholar.
Other Links
- Slides for tutorials and classes can be found on my teaching page.
- Software and resources that I've developed can be found on my software page.
- Tools for Natural Language Processing
- The Kyoto Free Translation Task: A task that can be used for evaluation of English-Japanese translation systems
- Japanese Parallel Data: A list of various data that can be used to create machine translation systems to/from Japanese