Causal Inference with Modern Machine Learning Methods
A doctoral-level introduction to methods at the intersection of causal inference and machine learning, focusing on theoretical foundations and recent research developments.
Course Description
Across disciplines, causal inference is a cornerstone of science, engineering, economics, and public policy. In medicine, we would like to understand how a patient might have responded had we offered a different treatment. In engineering design and optimization, we would like to understand how a system would behave under different design choices. In public policy, we constantly ask whether different taxes, laws, regulations, or programs might improve (or hurt) society at large. Correctly answering such questions helps us make better-informed decisions. Data are frequently collected and analyzed in seeking statistically quantified answers to these questions, and in contemporary applications these data are decidedly large-scale, complex, and high-dimensional. This creates an urgent need for modern statistical/machine learning methods designed for causal inference.
Recently, an exciting set of tools at the intersection of causal inference and machine learning has emerged to tackle these questions in such settings. This course is a doctoral-level introduction to these tools. Our emphasis is on the statistical/machine learning theory needed to formally analyze such methods, with the goal of giving students an in-depth understanding of cutting-edge research in this area and enabling them to contribute their own new (theoretically justified) methods to the field.
The first 2.5 weeks of the course will focus on basic concepts and methods in causal inference at the level of the Imbens and Rubin (2015) book. For the rest of the semester, we will move to recent developments in causal inference using modern machine learning methods, with content based primarily on recent research papers.
Learning Objectives
Upon successful completion of this course, students will be able to:
- Explain the core concepts and challenges in causal inference under the potential outcomes framework;
- Apply recent developments in modern statistical/machine learning methods to core causal inference problems;
- Develop and justify statistical/machine learning methods with mathematical rigor when applying or adapting them to causal inference questions.
Project Updates
- Potential Outcome Framework and Matching Estimators. An introduction to the Neyman-Rubin causal model and matching methods.
- Paper review: Improving randomized controlled trial analysis via data-adaptive borrowing. A deep dive into how machine learning and the adaptive lasso can enhance RCTs by selectively borrowing information from external controls.
- Lecture 16: Canonical Gradient and Efficient Influence Curve. Notes by Rachael Phillips for PB HLTH 290, Spring 2019.
- Augmented IPW and Double Robustness. The AIPW estimator, double robustness, and cross-fitting.
- FWL Theorem. The Frisch-Waugh-Lovell theorem.
- Deriving the IPW Estimator: From One RCT to Infinite RCTs under Unconfoundedness. An intuitive derivation of the inverse probability weighting (IPW) estimator, moving from a single RCT to multiple RCTs and observational data.
- When Does Linear Regression Yield Causal Inference? Intuition on ATE estimation under unconfoundedness and overlap.
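As a companion to the FWL Theorem entry above, here is a minimal numerical check of the Frisch-Waugh-Lovell theorem (the data-generating process below is an illustrative assumption, not taken from the post): the coefficient on a regressor in a full OLS fit equals the coefficient from regressing the residualized outcome on the residualized regressor.

```python
import numpy as np

# Synthetic data: a "treatment" d correlated with covariates X (illustrative DGP)
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))                        # covariates
d = X.sum(axis=1) + rng.normal(size=n)             # regressor of interest
y = 2.0 * d + X @ np.array([0.5, -0.3, 0.8]) + rng.normal(size=n)

# (1) Coefficient on d from the full regression y ~ 1 + d + X
Z = np.column_stack([np.ones(n), d, X])
beta_full = np.linalg.lstsq(Z, y, rcond=None)[0][1]

# (2) FWL: residualize y and d on [1, X], then regress residual on residual
W = np.column_stack([np.ones(n), X])
resid_y = y - W @ np.linalg.lstsq(W, y, rcond=None)[0]
resid_d = d - W @ np.linalg.lstsq(W, d, rcond=None)[0]
beta_fwl = resid_d @ resid_y / (resid_d @ resid_d)

# The two coefficients agree up to floating-point error
assert np.isclose(beta_full, beta_fwl)
```

This is why "partialling out" the covariates first gives exactly the same answer as including them in one big regression.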
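The IPW derivation entry above can be illustrated with a short simulation (all simulation settings here are assumptions for illustration): when the true propensity score is known, inverse probability weighting recovers the average treatment effect that a naive difference in means gets wrong because of confounding.

```python
import numpy as np

# Synthetic observational study with a single confounder x (illustrative DGP)
rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)                   # confounder
e = 1.0 / (1.0 + np.exp(-x))             # true propensity score P(D=1 | X=x)
d = rng.binomial(1, e)
y = 1.0 * d + x + rng.normal(size=n)     # true ATE = 1

# Naive contrast is confounded: x drives both treatment and outcome
naive = y[d == 1].mean() - y[d == 0].mean()

# Horvitz-Thompson style IPW estimator using the true propensity
ate_ipw = np.mean(d * y / e - (1 - d) * y / (1 - e))
```

Here `naive` is biased upward (treated units tend to have larger x), while `ate_ipw` is close to the true effect of 1.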
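To accompany the AIPW entry above, a small sketch of double robustness under an assumed synthetic design: with a deliberately misspecified propensity model but a correctly specified outcome model, the AIPW estimator remains consistent for the ATE.

```python
import numpy as np

# Same style of synthetic confounded design as before (illustrative DGP)
rng = np.random.default_rng(2)
n = 200_000
x = rng.normal(size=n)
e_true = 1.0 / (1.0 + np.exp(-x))        # true propensity, not given to AIPW
d = rng.binomial(1, e_true)
y = 1.0 * d + x + rng.normal(size=n)     # true ATE = 1

def fit_mu(mask):
    """OLS of y on [1, x] within one arm; returns predictions at every x."""
    W = np.column_stack([np.ones(mask.sum()), x[mask]])
    a, b = np.linalg.lstsq(W, y[mask], rcond=None)[0]
    return a + b * x

mu1, mu0 = fit_mu(d == 1), fit_mu(d == 0)    # correctly specified outcome models

e_wrong = np.full(n, 0.5)                    # propensity misspecified on purpose
ate_aipw = np.mean(
    mu1 - mu0
    + d * (y - mu1) / e_wrong
    - (1 - d) * (y - mu0) / (1 - e_wrong)
)
```

Because the outcome models are consistent, the augmentation terms average to roughly zero even though `e_wrong` is wrong, so `ate_aipw` stays near the true value of 1; the symmetric case (correct propensity, wrong outcome model) works too, which is the "double" in double robustness.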