CS 11-731: Machine Translation and Sequence-to-sequence Models

Assignments

The aim of the assignments and project is to build the skills needed to build cutting-edge systems or do cutting-edge research, culminating in a final project that demonstrates these abilities.

Read all the instructions on this page carefully.
You are responsible for reading these instructions and following them carefully. If you do not, you may be marked down as a result.

Assignment Policies

Working in Teams: Assignment 1 will be done individually, while assignment 2 and the final project must be done in teams of 2-3 (individual submissions will not be accepted for these). If you are having trouble finding a group after an initial survey of groups, the instructors will help you find one.

Submission Information: To submit your assignment, please submit the following through Canvas:

your names and Andrew IDs
a report: This should be up to 5 pages for the assignments, 9 pages for the projects. References are not included in the page count, and it is OK to submit appendices that include supplementary information such as hyperparameter settings or additional output examples, although there is no guarantee that the TAs will read them. Submissions that exceed the page count will be penalized one half grade.
a link to a GitHub repository containing your code: Your GitHub repository must be viewable to the TAs and instructor by the submission deadline. If your repository is private, make it accessible to us (GitHub IDs: neubig, antonisa, cindyxinyiwang, pmichel31415, shrutirij, xiamengzhou). If your repository is not visible to us, your assignment will not be considered complete, so if you are worried, please submit well in advance of the deadline so we can confirm the submission is visible.

Prompt Submission of Assignments: Assignments that are late will be graded down one half-grade per day late (e.g. A to A-, A- to B+). Exceptions will be granted only for unforeseen health-related issues or family emergencies that prevent you from working, in which case we will ask you to submit a doctor's note or other evidence of the emergency. Note that these are strict deadlines, and we will not be making exceptions for other reasons (including, but not limited to, conflicting deadlines with other classes, conference travel, server crashes, etc.). You are strongly encouraged to finish the assignments several days early so that you have some leeway to buffer against these eventualities.

Plagiarism/Code Reuse Policy: All assignments are expected to be conducted under the CMU policy for academic integrity. All rules here apply and violations will be subject to penalty including zero credit on the assignment, failing the course, or other disciplinary measures. In particular, in your implementation:

Pseudo-code provided by the TAs or instructor may be used freely without restriction.
You may not just re-use an existing implementation written by someone else. The implementation should basically be your own.
Fragments of code found online can be used (assuming the license so permits), but if they are significant, please cite these in your report. Failure to do so will be treated as being in violation of the assignment rules.
Code written by students in the class who are not in your assignment group cannot be used.

Individual contribution policy: For the group projects, each group member must make a contribution. We will be checking your submitted GitHub repository for these contributions. If contributions are uneven, we reserve the right to assign different team members different grades. Also, because this is a class testing your ability to build MT and seq2seq models, team members who contribute solely to pre-processing, analysis, or paper writing may have their grades discounted as well. If you believe it might not be clear who contributed what, please clarify in your submitted report.

Consulting w/ Instructors/TAs: For assignments and projects, you are free to consult as much as you want, any time you want with the instructors and TAs. That is what we're here for, and in no way is this considered cheating. In fact, if you don't have much experience with MT/seq2seq models previously, it may be necessary to liberally consult with the instructors and TAs to learn about how to do the implementation and finish the assignments. So please do so.

Because this is a project-based course, I assume that many of the students taking the course will be interested in turning their assignments or project into research papers. In this case, if you have received useful advice from the instructor or TAs that made the project significantly better, consider inviting them to be co-authors on the paper. Of course, you do not need to do so just because the paper is a result of the class, only if you feel that their advice or help made a contribution.

Details of Each Assignment

Assignment 1 (Original Deadline: 9/30, Extended Deadline: 10/3)

Assignment 1 will involve starting with some template code and implementing a sequence-to-sequence model and attempting to achieve the best accuracy possible on a specific benchmark. You can download the assignment package, with specific instructions, below.

Please turn in a report that describes: (1) what you did, (2) what the numerical results were, and (3) an analysis of the outputs: what problems did you see in the outputs, and if you made any improvements, what improved? Also, please make sure that your GitHub repository is accessible to the TAs, and please turn in your translations for the development and test sets. All of the above elements must be turned in by the deadline for full credit. The deadline has been extended due to logistics problems. For people who nonetheless turn in the homework by the original deadline, 2% extra credit will be awarded on the assignment.

Assignment 1 Package (v1)

A+: Exceptional or surprising. Goes far beyond most other submissions.
A: A complete implementation that improves above the baseline by adding some twist described in the class, along with corresponding analysis of why the results are improving.
A-: A complete implementation with result numbers that more or less match the baseline, and an analysis of the results.
B+: A working implementation exists but the numbers aren't able to quite match the baseline, or there is very little analysis or insight about the results.
B or below: The implementation is not working properly, or the report is severely incomplete (grade will depend on the level of incompleteness).

Assignment 2 (Due: 11/5)

See the separate page for assignment 2 instructions.

Final Project (Due: 12/10)

The final project is expected to be a novel research contribution that either (1) introduces new techniques for an existing task, with significant technical sophistication and using one of the more advanced techniques introduced in the class, or (2) tackles a new MT or sequence-to-sequence task with a neural network model that is motivated by the unique problems posed by the application domain. The grading rubric is as follows:

A+: Exceptional or surprising. Goes far beyond most other submissions.
A: A respectable research contribution that is novel and effective, and could be submitted largely as-is as a paper to an academic conference.
A-: A respectable research contribution that has some small incomplete parts, but is largely complete and promising.
B+: An idea that is novel, but the results may not be there yet, or the analysis is short.
B or B-: Results, analysis, or novelty are lacking.
C+ or below: Clear lack of effort or incompleteness.
