🧭 Problem Addressed
Survey experiments routinely include placebo conditions, yet little systematic guidance exists for choosing or constructing them. This paper documents inconsistent use of placebos in published work and clarifies what placebos are actually meant to adjust for: nonspecific effects (NSEs), the incidental impacts of ancillary features of experiments.
🔎 What Current Practice Looks Like
A review of published survey experiments finds that placebos are used inconsistently, leaving unresolved when, and by how much, the choice of placebo affects treatment-effect estimates.
🧠 Why Placebos Matter—and Why Choice Is Difficult
Placebos are intended to account for NSEs, but researchers typically lack precise knowledge of which NSEs matter. When the relevant NSEs are unknown, choosing a single placebo risks arbitrarily adjusting for some ancillary features while ignoring others.
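To see why the single-placebo choice matters, here is a minimal sketch of the logic; the notation (τ for the substantive effect, ν for a condition's NSE) is introduced here for illustration and is not the paper's formal framework:

```latex
% Decompose expected outcomes: \mu baseline, \tau substantive treatment effect,
% \nu_c the NSE of condition c's ancillary features (notation is illustrative).
E[Y(t)] = \mu + \tau + \nu_t, \qquad E[Y(p)] = \mu + \nu_p

% A single-placebo contrast therefore depends on the arbitrary choice of p:
\widehat{\Delta}_p = E[Y(t)] - E[Y(p)] = \tau + (\nu_t - \nu_p)

% Averaging over a corpus P of K placebos replaces \nu_p with the corpus mean:
\widehat{\Delta}_P = E[Y(t)] - \frac{1}{K}\sum_{p \in P} E[Y(p)]
                   = \tau + (\nu_t - \bar{\nu})
```

The arbitrary ν_p drops out in favor of a corpus average, which is the sense in which averaging over many placebos is "agnostic" about which ancillary features matter.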
🧪 How Placebos Were Generated and Tested
- A generative language model (GPT-2), trained on a corpus of over 1 million internet news pages, was used to create placebo text vignettes (a generation sketch follows this list).
- Using this model, 5,000 distinct placebo vignettes were generated.
- Two survey experiments were administered to evaluate the approach (total N = 2,975).
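The paper's exact fine-tuning and sampling setup is not reproduced here; the following is a minimal Python sketch of how one might bulk-generate vignettes with the off-the-shelf `gpt2` checkpoint via Hugging Face `transformers`. The prompt, sampling parameters, and batch sizes are illustrative assumptions, not the paper's configuration.

```python
# Sketch: bulk-generating placebo vignettes with GPT-2.
# Assumes the Hugging Face `transformers` library; the checkpoint, prompt,
# and sampling parameters below are illustrative.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "In local news today,"  # hypothetical neutral, news-style prompt
inputs = tokenizer(prompt, return_tensors="pt")

placebos = []
n_batches = 10  # 10 batches x 50 sequences = 500 vignettes; scale up toward 5,000
with torch.no_grad():
    for _ in range(n_batches):
        outputs = model.generate(
            **inputs,
            do_sample=True,           # sample rather than greedy-decode, for diversity
            top_k=50,                 # restrict sampling to the 50 likeliest tokens
            top_p=0.95,               # nucleus sampling
            max_length=120,           # roughly vignette-sized output
            num_return_sequences=50,  # many distinct vignettes per batch
            pad_token_id=tokenizer.eos_token_id,
        )
        placebos.extend(
            tokenizer.decode(seq, skip_special_tokens=True) for seq in outputs
        )

placebos = list(dict.fromkeys(placebos))  # drop accidental exact duplicates
```

Sampling (rather than greedy decoding) is what makes thousands of distinct vignettes possible from a single prompt; any deduplication or quality filtering step is a design choice left to the researcher.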
📊 Key Findings
- When precise knowledge of NSEs is absent, averaging over a large corpus of placebos provides a principled, agnostic way to account for unknown ancillary effects (see the estimator sketch after this list).
- Automated generation (via GPT-2) makes it feasible to produce thousands of placebos and thereby minimize researcher discretion in placebo selection.
- The experimental results illustrate the practical viability of this agnostic averaging strategy for survey experiments.
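As a concrete illustration of the averaging strategy, here is a hypothetical analysis sketch in Python. The DataFrame `df`, its column names, and both helper functions are assumptions for illustration, not the paper's replication code; each placebo-arm respondent is assumed to have seen one vignette drawn at random from the corpus, so the pooled placebo mean marginalizes over vignettes.

```python
# Sketch: placebo-averaged treatment effect estimate.
# `df` is a hypothetical respondent-level DataFrame with columns:
#   arm      -- "treatment" or "placebo"
#   vignette -- id of the randomly assigned placebo vignette (placebo arm only)
#   y        -- the outcome measure
import pandas as pd

def placebo_averaged_effect(df: pd.DataFrame) -> float:
    """Treated mean minus pooled placebo mean. Because placebo vignettes were
    assigned at random from the corpus, the pooled placebo mean averages over
    the corpus of ancillary features rather than any single placebo's NSE."""
    treated = df.loc[df["arm"] == "treatment", "y"].mean()
    placebo = df.loc[df["arm"] == "placebo", "y"].mean()
    return treated - placebo

def single_placebo_effects(df: pd.DataFrame) -> pd.Series:
    """For contrast: the estimate a researcher would obtain from each single
    placebo vignette alone; the spread across vignettes reflects how much an
    arbitrary single-placebo choice can move the estimate."""
    treated = df.loc[df["arm"] == "treatment", "y"].mean()
    by_vignette = df.loc[df["arm"] == "placebo"].groupby("vignette")["y"].mean()
    return treated - by_vignette
```

Comparing the single output of `placebo_averaged_effect` with the distribution returned by `single_placebo_effects` makes the paper's point concrete: the former removes researcher discretion that the latter leaves in play.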
💡 Practical Tools and Recommendations
The paper concludes with concrete tools for incorporating computer-generated placebo text vignettes into survey experiments, along with best-practice recommendations for automated, agnostic placebo construction.