Assignments
The assignments and shared tasks aim to develop the basic understanding and advanced implementation skills needed to build cutting-edge inference systems for language models, culminating in a shared task that demonstrates these abilities.
Read all the instructions on this page carefully
You are responsible for reading these instructions and following them carefully. If you do not, you may be marked down as a result.
Assignment Policies
Working in Teams: There are four assignments in the class. All assignments should be submitted individually.
Submission Information: To submit an assignment, upload a zip file to Canvas containing:
- your code: This should be in a directory “code” in the top directory unless specified otherwise.
- your report: This should be a PDF file named “report.pdf” in the top directory.
Late Policy: Late submissions will be penalized at 10% per day late, up to a maximum of 5 days. Each student has 3 late days to use throughout the semester without penalty.
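As a rough illustration of the late-policy arithmetic, here is a minimal sketch. The function name and parameters are hypothetical, and the sketch makes assumptions the policy text does not spell out: that free late days are spent first (one per day late), that the 5-day limit counts all days late, and that a submission more than 5 days late earns no credit. Check with the course staff for the authoritative interpretation.

```python
def late_penalty(days_late: int,
                 free_late_days_left: int = 3,
                 rate_per_day: float = 0.10,
                 max_days_late: int = 5) -> float:
    """Return the fraction of the score deducted (0.0 to 1.0).

    Hypothetical helper: the rules encoded here are one possible
    reading of the late policy, not an official calculator.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > max_days_late:
        # Assumption: past the 5-day window, no credit is given.
        return 1.0
    # Assumption: free late days are used before any penalty applies.
    penalized_days = max(0, days_late - free_late_days_left)
    return round(penalized_days * rate_per_day, 2)


# Two days late with all three free late days available: no penalty.
print(late_penalty(2))
# Four days late with no free late days left: 40% deducted.
print(late_penalty(4, free_late_days_left=0))
```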
Homework Assignments (60%)
There will be four homework assignments throughout the semester, each worth 15% of your final grade.
Homework 1: Language Model Basics and Probability
- Released: Thursday 8/28
- Due: Tuesday 9/9
- Topics:
  - Math homework with probability review
  - Small coding section to run inference on models with Hugging Face / vLLM
Homework 2: Decoding Methods Implementation
- Released: Thursday 9/11
- Due: Tuesday 10/7
- Topics:
  - Implementation of beam search
  - Implementation of sampling methods (Mirostat, temperature, top-p, top-k)
  - Comparison on shared tasks
Homework 3: LLM System Building
- Released: Thursday 10/9
- Due: Tuesday 11/11
- Topics:
  - Build an LLM system with code interpreter
  - Implement reward model
  - System visualization
  - Benchmark variants on shared tasks
Homework 4: Hardware and Optimization
- Released: Tuesday 11/11
- Due: Tuesday 12/2
- Topics:
  - Transformer hardware mathematics
  - Implementation of speculative decoding
  - Implementation of KV caching
Shared Tasks (30%)
Performance Component (15%)
Performance will be evaluated relative to two baselines:
- Baseline A: Reasonably high-performance system
- Baseline B: Basic implementation
Grading scale:
- Best submission: 100%
- Above Baseline A: 90-100%
- Between Baseline A and B: 80-90%
- Below Baseline B: C-level or lower
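The grading scale above can be read as a simple band lookup. The sketch below is illustrative only: the function name is hypothetical, it assumes a metric where higher is better, and the handling of ties with a baseline is an assumption, since the scale does not say which band an exact tie falls into. It also does not attempt to place a score within a band.

```python
def performance_band(score: float, baseline_a: float,
                     baseline_b: float, best_score: float) -> str:
    """Map a shared-task score to the grade band listed in the scale.

    Hypothetical helper; assumes higher scores are better and that
    Baseline A scores at least as high as Baseline B.
    """
    if baseline_b > baseline_a:
        raise ValueError("expected baseline_a >= baseline_b")
    if score >= best_score:
        return "100%"
    if score > baseline_a:
        return "90-100%"
    # Assumption: a tie with either baseline counts as "between A and B".
    if score >= baseline_b:
        return "80-90%"
    return "C-level or lower"


# Example with made-up validation scores.
print(performance_band(0.70, baseline_a=0.80, baseline_b=0.60, best_score=0.95))
```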
The scores of Baselines A and B on the validation set will be released partway through the semester so that you can calibrate how well your system is performing.
You must also submit clearly documented code to reproduce your results.
Written Report (15%)
- Format: COLM-format, eight-page paper
- Due: Thursday 12/4
- Requirements:
  - Introduction to tasks
  - Related work discussion
  - System description
  - Original diagram/visualization of system
Quizzes (10%)
- One quiz per class
- Due before the next class starts
- Lowest 6 scores dropped
- 1-2 questions based on class content or readings
- No makeup quizzes (use the drop policy instead)
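To see how the drop policy affects a quiz grade, here is a minimal sketch. The function name is hypothetical, and it assumes that all quizzes are weighted equally, that a missed quiz is recorded as 0, and that the quiz grade is the plain average of the scores that remain after the lowest six are dropped.

```python
def quiz_average(scores: list[float], num_dropped: int = 6) -> float:
    """Average quiz scores after dropping the lowest `num_dropped`.

    Hypothetical helper; equal weighting and simple averaging are
    assumptions, not stated course policy.
    """
    kept = sorted(scores)[num_dropped:]
    if not kept:
        # Fewer quizzes recorded than drops available: nothing to average.
        return 0.0
    return sum(kept) / len(kept)


# Six missed quizzes (scored 0) are absorbed entirely by the drop policy.
scores = [0, 0, 0, 0, 0, 0, 0.5, 1.0]
print(quiz_average(scores))
```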