🔍 What's at stake
Dropping subjects after random assignment because they fail a manipulation check is common across the social sciences, usually with the intent of restricting estimates to those who understood the prompt. This practice, however, can introduce serious bias and obscure what the experiment actually reveals.
🧪 How the problem was analyzed
- Extends theoretical results from Zhang and Rubin (2003) and Lee (2009) to settings with multiple treatments.
- Derives sharp bounds on potential outcomes for the subpopulation that would pass the manipulation check regardless of treatment assignment (the always-pass group); a minimal computational sketch follows this list.
- Shows how these bounds can be used to characterize the limits of inference when subjects are discarded based on manipulation-check outcomes.
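To fix ideas, here is a minimal sketch of how such bounds can be computed in the simplest case: a binary treatment, a binary pass indicator, and a Lee (2009)-style monotonicity assumption. The function name and implementation are illustrative only; they are not the paper's multi-treatment derivation.

```python
import numpy as np

def lee_style_bounds(y, d, s):
    """Trimming bounds on the average effect for always-pass units.

    y: outcomes, d: binary treatment indicator, s: binary pass indicator.
    Assumes random assignment and monotonicity (treatment never causes a
    unit that would otherwise pass to fail). Illustrative sketch only.
    """
    y, d, s = map(np.asarray, (y, d, s))

    # Under monotonicity, the share of always-passers among treated passers.
    q = s[d == 0].mean() / s[d == 1].mean()

    y1 = np.sort(y[(d == 1) & (s == 1)])     # treated units that passed
    y0_mean = y[(d == 0) & (s == 1)].mean()  # control passers are all always-passers

    k = max(int(np.floor(q * len(y1))), 1)   # number of treated passers kept after trimming
    lower = y1[:k].mean() - y0_mean          # keep the lowest q-share of treated passers
    upper = y1[-k:].mean() - y0_mean         # keep the highest q-share of treated passers
    return lower, upper
```

When monotonicity is doubtful or the outcome is unbounded, the corresponding worst-case bounds can widen dramatically, which is the intuition behind the findings below.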
📌 Key findings
- Dropping subjects who fail a manipulation check after treatment assignment can produce substantial and systematic bias in estimated treatment effects.
- The sharp bounds on potential outcomes among always-pass units may be very wide or even infinite, indicating that the inferential target of "effects among those who would always pass" often cannot be identified from the data.
- An applied replication of Press, Sagan, and Valentino (2013) that retains subjects who failed the manipulation check finds effects that are, if anything, stronger than those originally reported under the discard-based analysis.
🔧 Practical recommendations
- Emphasize estimation and reporting strategies that use all randomized subjects or that transparently bound effects rather than discarding data post-randomization.
- Consider design changes to reduce reliance on post-treatment exclusions (for example, pre-treatment comprehension checks or designs that measure understanding without conditioning on treatment outcomes).
- Report bounds and sensitivity to exclusion rules so readers can assess how much inferences depend on discarding subjects; a brief illustration follows this list.
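As a concrete illustration of the last point, the snippet below uses simulated data with hypothetical names (and reuses the `lee_style_bounds` helper from the earlier sketch) to report the drop-based estimate alongside the all-subjects estimate and the always-pass bounds, so the dependence on the exclusion rule is visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated experiment: an unobserved trait u drives both passing the check
# and the outcome, so discarding failers conditions on a post-treatment variable.
u = rng.normal(size=n)                        # latent attentiveness
d = rng.integers(0, 2, n)                     # random assignment
s = (u + 0.5 * d > 0).astype(int)             # pass indicator (monotone in d)
y = 1.0 * d + u + rng.normal(size=n)          # true effect is 1.0 for everyone

naive_drop = y[(s == 1) & (d == 1)].mean() - y[(s == 1) & (d == 0)].mean()
all_subjects = y[d == 1].mean() - y[d == 0].mean()
lower, upper = lee_style_bounds(y, d, s)      # helper from the sketch above

print(f"Drop failers:       {naive_drop: .3f}")    # biased away from 1.0
print(f"All randomized:     {all_subjects: .3f}")  # unbiased for the ATE
print(f"Always-pass bounds: [{lower: .3f}, {upper: .3f}]")
```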
Why it matters: Dropping subjects on the basis of manipulation checks can give a false sense of precision and restrict conclusions to a subpopulation that cannot be directly identified in the data; transparent bounding and design adjustments provide safer paths to credible causal claims.