🔎 What This Paper Shows
Propensity score matching (PSM), a widely used preprocessing tool for causal inference, frequently does the opposite of its intended goal: it can increase imbalance, reduce efficiency, heighten model dependence, and introduce bias.
📊 Why PSM Fails
- The core problem is methodological: PSM attempts to approximate a completely randomized experiment. Other matching methods instead approximate a fully blocked randomized experiment, which is typically more efficient.
- Because PSM targets complete randomization, it is uniquely blind to the large portion of covariate imbalance that can be removed by approximating full blocking with alternative matching approaches.
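The blindness described above can be sketched with a toy example: two units can share the same propensity score while sitting far apart in covariate space. The coefficients `beta` and the three data points below are hypothetical values chosen for illustration, and plain Euclidean distance stands in for the covariate-based distances (such as Mahalanobis) that blocking-style matching methods typically use.

```python
import numpy as np

def propensity(x, beta):
    """Logistic propensity score for covariate rows x under assumed coefficients beta."""
    return 1.0 / (1.0 + np.exp(-(x @ beta)))

# Hypothetical logistic coefficients (an assumption, not taken from the paper).
beta = np.array([1.0, 1.0])

# One treated unit and two candidate controls (made-up values).
treated = np.array([[2.0, -2.0]])
controls = np.array([
    [-2.0, 2.0],   # identical propensity score, very different covariates
    [1.8, -1.5],   # slightly different score, very similar covariates
])

ps_treated = propensity(treated, beta)
ps_controls = propensity(controls, beta)

# Distance on the scalar score (what PSM sees) versus distance on the
# covariates themselves (what covariate-based matching sees).
ps_gap = np.abs(ps_controls - ps_treated[0])
cov_gap = np.linalg.norm(controls - treated[0], axis=1)

psm_choice = int(np.argmin(ps_gap))   # control picked by score matching
cov_choice = int(np.argmin(cov_gap))  # control picked by covariate matching

print("PSM picks control", psm_choice, "with covariate distance", cov_gap[psm_choice])
print("Covariate matching picks control", cov_choice, "with covariate distance", cov_gap[cov_choice])
```

In this toy setup the score-based match looks perfect on the propensity score yet pairs the treated unit with the control that is farthest away on the covariates, which is the kind of imbalance a scalar score cannot register.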
✅ Key Findings
- PSM often increases imbalance rather than reducing it.
- PSM can worsen statistical efficiency and increase model dependence, meaning that estimates vary more with the choice of outcome model.
- In data already balanced enough to resemble complete randomization, whether originally or after pruning, PSM behaves like random matching and can leave the matched sample less balanced than the raw data.
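The last finding can be explored with a small simulation. The following is a minimal sketch, not the paper's procedure: it assumes simulated standard-normal covariates, treatment assigned by complete randomization, a propensity model fitted with a hand-written Newton solver (`fit_logit` is a hypothetical helper), 1-to-1 nearest-neighbor matching with replacement, and Euclidean distance as a stand-in for Mahalanobis distance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data under complete randomization: treatment is independent of X.
n, k = 400, 3
X = rng.normal(size=(n, k))
t = rng.permutation(np.repeat([1, 0], n // 2))

def fit_logit(X, t, iters=25):
    """Fit a logistic regression by Newton's method and return fitted scores."""
    Xd = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        grad = Xd.T @ (t - p)
        hess = (Xd * (p * (1.0 - p))[:, None]).T @ Xd
        beta += np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-(Xd @ beta)))

ps = fit_logit(X, t)

Xt, Xc = X[t == 1], X[t == 0]
ps_t, ps_c = ps[t == 1], ps[t == 0]

# Nearest-neighbor matching with replacement: one control per treated unit.
psm_idx = np.abs(ps_t[:, None] - ps_c[None, :]).argmin(axis=1)
cov_idx = np.linalg.norm(Xt[:, None, :] - Xc[None, :, :], axis=2).argmin(axis=1)
rand_idx = rng.integers(0, len(Xc), size=len(Xt))  # purely random pairing

def mean_pair_distance(idx):
    """Average covariate distance between each treated unit and its match."""
    return np.linalg.norm(Xt - Xc[idx], axis=1).mean()

psm_dist = mean_pair_distance(psm_idx)
cov_dist = mean_pair_distance(cov_idx)
rand_dist = mean_pair_distance(rand_idx)

print(f"propensity score matching: {psm_dist:.3f}")
print(f"covariate matching:        {cov_dist:.3f}")
print(f"random matching:           {rand_dist:.3f}")
```

Because treatment here is unrelated to the covariates, the fitted scores carry mostly estimation noise, so score-based pairs are not chosen for covariate closeness. The covariate-matched pairs are the closest of the three by construction; how near the PSM figure sits to the random-matching figure depends on the sample size and the number of covariates assumed.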
🔍 What This Means for Practice
- These results indicate that researchers should prefer other matching methods that approximate full blocking when the goal is to reduce imbalance and improve causal estimates.
- Propensity scores still have value, however: they have productive uses other than matching.
📌 Takeaway
Rethinking the default use of PSM is crucial: matching strategies that target blocked designs typically deliver better balance and more reliable causal inferences than propensity score matching.