Assignments

The assignments and shared tasks aim to develop the basic understanding and advanced implementation skills needed to build cutting-edge inference systems for language models, culminating in a shared task that demonstrates these abilities.

Read all the instructions on this page carefully
You are responsible for reading and following these instructions. If you do not, you may be marked down.

Assignment Policies

Working in Teams: There are four assignments in the class. All assignments should be submitted individually.

Submission Information: To submit your assignment, upload to Canvas a zip file containing:

- Your code: This should be in a directory “code” in the top directory unless specified otherwise.
- Your report: This should be a PDF file named “report.pdf” in the top directory.
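
The layout described above can be produced with a short script. This is only a minimal sketch using Python's standard zipfile module: the function name build_submission and the example paths are hypothetical, and an individual assignment may specify a different layout.

```python
import zipfile
from pathlib import Path


def build_submission(code_dir, report_pdf, out_zip):
    """Write a zip with code/ and report.pdf at the top level; return entry names."""
    code_dir, report_pdf = Path(code_dir), Path(report_pdf)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        # The report sits in the top directory under the name report.pdf.
        zf.write(report_pdf, arcname="report.pdf")
        # Every source file is stored under the top-level "code" directory.
        for path in sorted(code_dir.rglob("*")):
            if path.is_file():
                rel = path.relative_to(code_dir).as_posix()
                zf.write(path, arcname=f"code/{rel}")
        return zf.namelist()
```

For example, build_submission("src", "writeup.pdf", "hw1.zip") (hypothetical paths) stores the report as report.pdf regardless of its local file name, and the returned list of entry names can be checked before uploading.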

Late Policy: Late submissions will be penalized at 10% per day late, up to a maximum of 5 days. Each student has 3 late days to use throughout the semester without penalty.
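
To make the arithmetic concrete, here is a sketch of one possible reading of the late policy. It rests on three assumptions that the policy text does not confirm: each free late day cancels one day of penalty, the 10% is taken as a fraction of the raw score, and work submitted more than 5 days late earns no credit. The function name and parameters are hypothetical. Check with the course staff before relying on any of this.

```python
def late_adjusted_score(raw_score, days_late, free_days_used=0,
                        penalty_per_day=0.10, max_days_late=5):
    """Return the score after the late penalty, under the assumptions above."""
    if days_late < 0 or not 0 <= free_days_used <= days_late:
        raise ValueError("need 0 <= free_days_used <= days_late")
    # ASSUMPTION: work more than max_days_late days late earns no credit.
    if days_late > max_days_late:
        return 0.0
    # ASSUMPTION: each free late day cancels one day of penalty.
    penalized_days = days_late - free_days_used
    # ASSUMPTION: the penalty is a fraction of the raw score.
    return raw_score * (1.0 - penalty_per_day * penalized_days)
```

Under this reading, a submission that is two days late with no free late days applied keeps 80% of its raw score, and applying two free late days to the same submission removes the penalty entirely.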

Homework Assignments (60%)

There will be four homework assignments throughout the semester, each worth 15% of your final grade.

Homework 1: Language Model Basics and Probability

Released: Thursday 8/28
Due: Tuesday 9/9
Topics:
- Math homework with probability review
- Small coding section to run inference on models with Hugging Face / vLLM

Homework 2: Decoding Methods Implementation

Released: Thursday 9/11
Due: Tuesday 10/7
Topics:
- Implementation of beam search
- Implementation of sampling methods (Mirostat, temperature, top-p, top-k)
- Comparison on shared tasks

Homework 3: LLM System Building

Released: Thursday 10/9
Due: Tuesday 11/11
Topics:
- Build an LLM system with a code interpreter
- Implement a reward model
- System visualization
- Benchmark variants on shared tasks

Homework 4: Hardware and Optimization

Released: Tuesday 11/11
Due: Tuesday 12/2
Topics:
- Transformer hardware mathematics
- Implementation of speculative decoding
- Implementation of KV caching

Shared Tasks (30%)

Performance Component (15%)

Performance will be evaluated relative to two baselines:

Baseline A: Reasonably high-performance system
Baseline B: Basic implementation

Grading scale:

Best submission: 100%
Above Baseline A: 90-100%
Between Baseline A and B: 80-90%
Below Baseline B: C-level or lower

The validation-set scores of Baselines A and B will be released partway through the semester so that you can calibrate how well your system is performing.

You must also submit clearly documented code that reproduces your results.

Written Report (15%)

Format: Eight-page paper in COLM format
Due: Thursday 12/4
Requirements:
- Introduction to the tasks
- Related work discussion
- System description
- Original diagram/visualization of system

Quizzes (10%)

One quiz per class
Due before the next class starts
Lowest 6 scores dropped
1-2 questions based on class content or readings
No makeup quizzes (use the drop policy instead)
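
The drop policy can be illustrated with a small sketch. It assumes, without confirmation from the policy text, that all quizzes are weighted equally, that a missed quiz is recorded as 0, and that the quiz component is the mean of the scores left after the drop. The function name quiz_component is hypothetical.

```python
def quiz_component(scores, num_dropped=6):
    """Mean of the quiz scores that remain after dropping the lowest ones."""
    if len(scores) <= num_dropped:
        raise ValueError("need more quizzes than dropped scores")
    # Sort ascending so the lowest num_dropped scores come first, then skip them.
    kept = sorted(scores)[num_dropped:]
    # ASSUMPTION: equal weighting, so the component is a plain average.
    return sum(kept) / len(kept)
```

Under these assumptions, up to six missed quizzes (each recorded as 0) fall among the dropped scores and do not lower the average of the rest.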