CS 11-747: Neural Networks for NLP

Course Description

CS 11-747
Language Technologies Institute, School of Computer Science
Carnegie Mellon University
Tuesday/Thursday 4:30-5:50PM, Doherty Hall 1212

Instructors/TAs:

Instructor: Graham Neubig (gneubig@cs.cmu.edu)
  Office hours: Monday 4:00-5:00PM (GHC5409)
TAs: (cs11-747-fa2017-tas@cs.cmu.edu)
  Zhengzhong (Hector) Liu (Mon. 1:00-2:00PM, GHC5517)
  Xuezhe (Max) Ma (Tue. 12:00-1:00PM, GHC5517)
  Daniel Clothiaux (Fri. 9:00-10:00AM, GHC5505)
Questions and Discussion: Ideally in class or through Piazza so we can share information with the class, but email and office hours are also OK.

Course Description

Neural networks provide powerful new tools for modeling language, and have been used both to improve the state-of-the-art in a number of tasks and to tackle new problems that were not easy in the past. This class will start with a brief overview of neural networks, then spend the majority of the class demonstrating how to apply neural networks to natural language problems. Each section will introduce a particular problem or phenomenon in natural language, describe why it is difficult to model, and demonstrate several models that were designed to tackle this problem. In the process of doing so, the class will cover different techniques that are useful in creating neural network models, including handling variably sized and structured sentences, efficient handling of large data, semi-supervised and unsupervised learning, structured prediction, and multilingual modeling.

Pre-requisites: 11-711 "Algorithms for NLP" or equivalent background is required. If you have not taken 11-711, I expect that you have enough NLP background to be able to complete its assignments (e.g. on n-gram language modeling, CKY parsing, and word alignment).

Class format: As the class aims to provide the practical skills necessary to implement cutting-edge neural network models for NLP, the classes and assignments will be implementation-focused. In general, classes will take the following format:

Reading: You will be given a reading assignment that you should complete before coming to class that day.
Quiz: At the beginning of class, there will be a short quiz that tests your knowledge of the reading assignment. (These quizzes should be easy if the reading assignment has been completed and understood.)
Summary/Elaboration/Questions: The instructor will summarize the important points of the reading material, elaborate on details that were not included in the reading, and field any questions.
Code Walk: The TAs (or instructor) will walk through some demonstration code in DyNet that implements a simple version of the main concepts presented in the reading material.

Grading:

Quizzes: Worth 20% of the grade. Your lowest 2 quiz grades will be dropped. If you are sick or traveling on business (e.g. to a conference), send a doctor's note or conference registration confirmation to the TA list, and you will be excused.
Checkpoints: There will be 2 "checkpoint" assignments, each worth 20% of the grade. In case there are unforeseen circumstances that don't let you turn in your assignment on time, 5 late days total over the two assignments will be allowed. Note that other than these late days, we will not be making individual exceptions and extending deadlines, so please try to be frugal with your late days and use them only if necessary.
Project: The final course project will be worth 40%.
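
As a rough illustration of how the weights above combine, here is a minimal sketch in Python. It assumes every score is a fraction between 0 and 1 and that the remaining quizzes are averaged with equal weight; neither assumption is stated in the policy, and the function name `course_grade` is hypothetical.

```python
def course_grade(quiz_scores, checkpoint_scores, project_score, dropped=2):
    """Combine scores (fractions in [0, 1]) using the stated weights.

    Assumption: quizzes are averaged with equal weight after dropping
    the lowest `dropped` scores. The syllabus does not specify this.
    """
    if len(checkpoint_scores) != 2:
        raise ValueError("expected exactly two checkpoint scores")
    ordered = sorted(quiz_scores)
    # Drop the lowest quizzes, but keep at least one if any were taken.
    kept = ordered[dropped:] or ordered[-1:]
    quiz_avg = sum(kept) / len(kept) if kept else 0.0
    # Quizzes 20%, each checkpoint 20%, project 40%.
    return 0.20 * quiz_avg + 0.20 * sum(checkpoint_scores) + 0.40 * project_score


if __name__ == "__main__":
    total = course_grade([0.0, 0.5, 1.0, 1.0], [0.5, 0.5], 0.5)
    print(f"weighted total: {total:.2f}")
```

Late days and excused quizzes are not modeled here; an excused quiz would simply be left out of `quiz_scores` under this sketch.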

Checkpoint Assignments/Project

The aim of the assignments and project is to build the skills needed to do cutting-edge research, culminating in a final project that demonstrates these skills by carrying out a piece of research.

Assignment/Project Policies: Assignments and the final project should be done in groups of 2-3. For the assignments and the final project, note who did which part of the work. Grades will be assigned based on contribution, so it is possible for people in the same group to get different grades. Send to the TAs:

your names and Andrew IDs
a report: This should be 2-4 pages for the assignments, 6-8 pages for the project (not including references). Submit this as an attachment to your email.
a link to a GitHub repository containing your code: Your GitHub repository must be viewable to the TAs and instructor by the submission deadline. If your repository is private, make it accessible to us (GitHub IDs neubig, dbc148, XuezheMax, hunterhector). If your repository is not visible to us, your assignment will not be considered complete, so if you are worried please submit well in advance of the deadline so we can confirm the submission is visible.

Assignment details: Each of the assignments will consist of the following:

Checkpoint 1 (10/16) will involve performing a survey of your task of interest, collecting data, performing a simple implementation, and getting a number (e.g. accuracy) to quantify how well this implementation is doing. In the survey, explain the task that you would like to tackle in concrete terms, and also cover all of the relevant recent research on the topic and explain upon which piece of work you would like to base your second checkpoint assignment.
Checkpoint 2 (11/13) will involve reproducing the results of a state-of-the-art baseline model for the task of interest with code that you have implemented from scratch (i.e., you are not allowed to simply run existing code, nor copy large chunks from an existing implementation of the particular model of interest). In your report, also perform an analysis of what remaining errors this model makes (ideally with concrete examples of failure cases), and describe how you plan to create a new model for the final project that will address these error cases.
Project (12/14) work will be expected to be a novel research contribution that either (1) introduces new techniques for one of the existing tasks in the assignment using a significant amount of technical sophistication utilizing one of the more advanced techniques introduced in the class, or (2) tackles a new NLP task with a neural network model that is motivated by the unique problems posed by the application domain.

All assignments are expected to be conducted under the CMU policy for academic integrity. All rules here apply and violations will be subject to penalty including zero credit on the assignment, failing the course, or other disciplinary measures. In particular, in your implementation:

Pseudo-code provided by the TAs or instructor may be used freely without restriction.
You may not just re-use an existing implementation. The implementation should basically be your own.
Fragments of code found online can be used (assuming the license so permits), but if they are significant, please cite these in your report.
Code written by other students in the class not in your assignment group cannot be used.

Suggested Tasks

Below is a list of suggested NLP tasks that you may use for your assignments and projects. It is completely fine, and highly encouraged, to tackle other tasks, but please confirm with the instructor/TAs first so they can give you advice (ideally before you start Checkpoint 1).
