Taught by Herlock Rahimi
This graduate mini-course explores the differential geometric structure of probability distributions. Key topics include information geometry, exponential families, optimal transport, statistical estimation, and their applications to learning, optimization, and reinforcement learning.
Introduces entropy, KL divergence, and estimation techniques (MLE, KL, Wasserstein). Covers differential geometry basics such as tangent vectors, Riemannian metrics, and optimization on curved spaces.
Download Slides (PDF)
Covers two-Gaussian EM and non-EM estimation, smooth manifolds, connections, Bregman divergences, and dual coordinate systems, including their application to mirror descent and RL.
Download Slides (PDF)
Explores α-connections, divergence-induced metrics, Bregman divergence, and Legendre duality. Canonical divergences and the Pythagorean theorem in dually flat spaces are introduced.
Download Slides (PDF)
Describes the Fisher metric, dual connections, KL divergence geometry, the EM algorithm as alternating projections, and natural gradient methods for optimization and learning.
Download Slides (PDF)
Introduces the Monge and Kantorovich formulations of optimal transport, Wasserstein distances, Otto calculus, and displacement geodesics, and contrasts these with Fisher geometry to develop a unified view.
Download Slides (PDF)