Herlock Rahimi

Information Geometry

Taught by Herlock Rahimi

Spring 2025 · Yale University

About the Course

This graduate mini-course explores the differential-geometric structure of families of probability distributions. Key topics include exponential families, optimal transport, and statistical estimation, together with their applications to learning, optimization, and reinforcement learning.

Course Sessions

Session 1: Information, Estimation, and Geometry

Introduces entropy, KL divergence, and estimation techniques (maximum likelihood, KL minimization, and Wasserstein-based estimation). Covers differential geometry basics such as tangent vectors, Riemannian metrics, and optimization on curved spaces.

Download Slides (PDF)

Session 2: Gaussian Mixture Models and Geometric Tools

Covers EM and non-EM estimation for a two-component Gaussian mixture, smooth manifolds, connections, Bregman divergences, and dual coordinate systems, with applications to mirror descent and reinforcement learning (RL).

Download Slides (PDF)

Session 3: Dual Connections, Divergences, and Dually Flat Spaces

Explores α-connections, divergence-induced metrics, Bregman divergences, and Legendre duality. Introduces canonical divergences and the Pythagorean theorem in dually flat spaces.

Download Slides (PDF)

Session 4: EM Algorithm, Natural Gradient, and Statistical Geometry

Describes the Fisher metric, dual connections, the geometry of KL divergence, the EM algorithm as alternating projections, and natural gradient methods for optimization and learning.

Download Slides (PDF)

Session 5: Optimal Transport Meets Information Geometry

Introduces the Monge and Kantorovich formulations, Wasserstein distances, Otto calculus, and displacement geodesics, then contrasts these with Fisher geometry to develop a unified view.

Download Slides (PDF)