CDSS 94 · Building Thoughtful AI Systems

Module 1 · Fundamentals

What We Owe Machines

January 26, 2026

The hard problem in AI isn't making machines smarter; it's teaching them to handle problems without right answers. This lecture traces how we learned to teach machines at all, and where that project stands today.

Slides · Lecture Notes · Recording

Lecture notes by Riccardo Colletti

The Lifecycle of a Language Model

February 2, 2026

Understanding the full training pipeline from pretraining to deployment.

Slides · Lecture Notes · Recording

Module 2 · Post-Training

Post-Training Foundations

February 9, 2026 · Final Project A Due

RLHF and Reward Learning

February 16, 2026 · Project 1 Released

Alignment Methods & Model Behavior

February 23, 2026 · Final Project B Due

Evals as Research

March 2, 2026

Module 3 · Reasoning & Agents

Search, Planning, Memory

March 9, 2026 · Project 1 Due

Tool Use and Verification

March 16, 2026 · Project 2 Released

Spring Break

March 23, 2026

No class this week. Enjoy your break!

Multi-Agent Systems

March 30, 2026 · Final Project C Due

Module 4 · Product & Research

Product Design & Development Workshop

April 6, 2026 · Project 2 Due

Product Workshop (Continued)

April 13, 2026

Guest Lecture

April 20, 2026

Guest Lecture

April 27, 2026

Demo Day 🎉

May 4, 2026 · RRR Week · Final Project Due

| Date | Lecture | Technical Work | Materials | Assignments Due |
|---|---|---|---|---|
| Jan | Week 1 · Fundamentals: What We Owe Machines | T1a: Inference Playground; T1b: Scaling Laws | Slides, Notes | — |
| Feb | Week 2 · Fundamentals: The Lifecycle of a Language Model | T2a: Build a Context-Aware Assistant; T2b: Inference Deep-Dive | Slides | — |
| Feb | Week 3 · Post-Training: Post-Training Foundations | T3a: Datagen Pipelines & Quality; T3b: SFT from Scratch | — | Final Project A Due |
| Feb | Week 4 · Post-Training: RLHF and Reward Learning | T4a: DPO in Practice; T4b: Toy Reward Modeling | — | Project 1 Released |
| Feb | Week 5 · Post-Training: Alignment Methods & Model Behavior | T5a: Refusal Classifier; T5b: Constitutional AI Loop | — | Final Project B Due |
| Mar | Week 6 · Post-Training: Evals as Research | T6a: Evaluation Design; T6b: Benchmark Hacking | — | — |
| Mar | Week 7 · Reasoning & Agents: Search, Planning, Memory | T7a: Planning Agent; T7b: AlphaEvolve | — | Project 1 Due |
| Mar | Week 8 · Reasoning & Agents: Tool Use and Verification | T8a: Coding Agent with Verification; T8b: Computer Use Lab | — | Project 2 Released |
| Mar | Spring Break | — | — | — |
| Mar | Week 9 · Reasoning & Agents: Multi-Agent Systems | T9a: Agent Telemetry; T9b: Multi-Agent Collaboration | — | Final Project C Due |
| Apr | Week 10 · Product: Product Design & Development Workshop | T10: Self-Play Experiment | — | Project 2 Due |
| Apr | Week 11 · Product: Product Design & Development Workshop (Ct'd) | — | — | — |
| Apr | Week 12 · Guest Lecture: TBA | — | — | — |
| Apr | Week 13 · Guest Lecture: TBA | — | — | — |
| May 4 | RRR Week: Demo Day 🎉 | — | — | Final Project Due |

Final Project

Design and build an AI product prototype that demonstrates mastery of course concepts. Teams of 2-4 students will propose, build, and present a demo.

Team Formation: Feb 9

Proposal Due: Feb 23

WIP Check-in: Mar 30

Demo Day: May 4

The final project is your opportunity to explore a topic of your choice in depth. Projects should demonstrate technical sophistication and original thinking. Successful projects might include novel alignment techniques, agent architectures, evaluation frameworks, and creative applications of post-training methods.

Milestones

Feb 9 · Checkpoint A: Team formation and initial idea submission

Feb 23 · Checkpoint B: Detailed proposal with methodology and timeline

Mar 30 · Checkpoint C: Work-in-progress presentation and feedback session

May 4 · Final submission and Demo Day presentation

About the Course

This project-based course teaches you to design, build, post-train, and evaluate agentic AI systems. We connect core technical foundations (model architecture, retrieval-augmented generation, tool use, multi-agent coordination) with hands-on labs and a team capstone project.

You'll gain practical experience with model deployment frameworks, post-training pipelines (SFT, RL, context engineering), evaluation systems, and model-context protocols. The course culminates in building an end-to-end minimum viable AI product.

Emphasis is placed on engineering rigor, creative problem-solving, and deployment at scale.

Learning Outcomes

By the end of this course, you will be able to:

Design AI systems that are reliable, maintainable, and cost-effective
Debug and align models in production using modern evaluation and feedback frameworks
Post-train open-source LLMs using SFT, RL, and context engineering via TinkerAPI
Translate research into practice, whether founding an AI venture or shipping at scale
Navigate ethical considerations in deploying autonomous systems
Prototype and iterate on applications using modern LLM and agentic architectures

Who This Course Is For

Upper-division undergraduates and early graduate students who want to found or join early-stage AI startups, or work on applied ML engineering at major technology companies.

Expected background: Solid Python programming experience (CS 61A/88 + CS 61B or equivalent). Prior ML exposure recommended (CS 182, CS 189, CS 185, or Data 100) but not required. Strong motivation to apply AI research to real-world systems.

Course Structure

Module 1 · Fundamentals: Foundations of LLMs, architectures, inference
Module 2 · Post-Training: SFT, RLHF, alignment techniques, TinkerAPI
Module 3 · Reasoning and Agents: Chain-of-thought, tool use, multi-agent systems
Module 4 · Product and Research: Deployment, evaluation, scaling, ethics

Navigating the Course

We keep things simple: one channel for communication, one for submissions, one site for all course content, and in-person lectures. Everything you need lives in one of these four places:

Communication: Slack (see welcome email for invite link)
Course website: posttraining.ai
Assignments: Gradescope (enroll with code 6X7K5K)
Lectures: In-person, recordings available on course site

Grade Breakdown

50% Final Project (semester-long, 3 checkpoints, Demo Day May 4th)

20% Project 1 & Project 2 (10% each)

20% Attendance (in-person required, 2 excused absences allowed)

10% Documentation (weekly progress sharing, starting week 2)
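
As a rough illustration of how these weights combine, here is a minimal Python sketch. It assumes each component is scored on a 0 to 100 scale and that the result is a simple weighted sum; the function name and the sample scores are hypothetical, and the course staff's actual computation (rounding, curves, absence handling) may differ.

```python
# Weights come from the grade breakdown above (Project 1 and Project 2 are 10% each).
# Everything else here is an illustrative assumption, not official course policy.
WEIGHTS = {
    "final_project": 0.50,
    "project_1": 0.10,
    "project_2": 0.10,
    "attendance": 0.20,
    "documentation": 0.10,
}


def course_grade(scores: dict[str, float]) -> float:
    """Return the weighted course score, given per-component scores in [0, 100]."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student scores, for illustration only.
example = {
    "final_project": 90.0,
    "project_1": 80.0,
    "project_2": 100.0,
    "attendance": 100.0,
    "documentation": 100.0,
}

print(round(course_grade(example), 2))
```

With these sample scores the weighted sum is 45 + 8 + 10 + 20 + 10 = 93.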

Assignments

Attendance (20%)

We believe you learn best by being in class and discussing ideas with your peers. In-person attendance is required, with 2 excused absences allowed.

Documentation (10%)

Each week you'll share something you learned: a blog post, tweet, TikTok, LinkedIn post, or YouTube video. The format is yours; the goal is building in public.

Labs (ungraded)

Weekly technical work (T1–T10) is released alongside lectures. These hands-on labs reinforce lecture topics and prepare you for the graded projects. They are ungraded but strongly recommended.

Projects (graded)

Project 1 focuses on post-training pipelines. Project 2 focuses on agentic systems. Both are graded on completion and quality.

The Final Project is a semester-long endeavor where you ideate, design, prototype, and productionize an AI system. Three checkpoints (A, B, and C) are spread throughout the semester, culminating in Demo Day on May 4th.

Acknowledgements

Cognition for providing compute credits
Thinking Machines for TinkerAPI access