Introduces fundamentals of information theory and its applications to contemporary problems in statistics, machine learning, and computer science. A thorough study of information measures, including Fisher information, f-divergences, their convex duality, and variational characterizations. Covers information-theoretic treatment of inference, hypothesis testing and large deviations, universal compression, channel coding, lossy compression, and strong data-processing inequalities.
Methods are applied to derive PAC-Bayes bounds, GANs, tokenization and quantization of LLMs, and regret inequalities in machine learning; metric and non-parametric estimation in statistics; and communication complexity and computation with noisy gates in computer science. The course is a fast-paced journey through a recent textbook with the same title. For a communication-focused version, consider 6.7470.
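As an illustrative example of the variational characterizations mentioned above (drawn from standard material, not from this syllabus), the Donsker–Varadhan representation of KL divergence, which also underlies PAC-Bayes bounds, reads

$$
D(P \,\|\, Q) \;=\; \sup_{f} \Big\{ \mathbb{E}_{P}[f(X)] \;-\; \log \mathbb{E}_{Q}\big[e^{f(X)}\big] \Big\},
$$

where the supremum is over suitably bounded measurable functions $f$.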
By appointment, email yp@mit.edu
Tuesdays, 9-10am (room: see weekly announcement)
Tuesdays, 4-5pm (room: see weekly announcement)
| Component | Weight | Details |
|---|---|---|
| Weekly Problem Sets | 50% | Due Wednesdays at 10pm (unless stated otherwise) |
| Midterm Exam | 45% | In-class exam |
| Participation Bonus | 5% | Class interaction, office hours, independent reading |
| Final Exam | None | Last PSet is larger and worth double points |
This is a graduate class, so interaction in class and at office hours, independent reading, and discussing research projects all contribute to the grade via the participation bonus.