FIND DATA: By Journal | Sites   ANALYZE DATA: Help with R | SPSS | Stata | Excel   WHAT'S NEW? US Politics | IR | Law & Courts🎵

Split Samples Improve Power and Replication for Causal Studies, With Limits

Split-Sample | Preanalysis-Plan | Replication | Statistical Power | Causal Inference | Methodology | Pol. An. | 3 Stata files | Dataverse

🔧 How the split-sample procedure works

Researchers send their dataset to an independent third party that randomly creates a training sample and a withheld testing sample. All model building, hypothesis selection, and revisions occur using the training sample, allowing feedback from colleagues, editors, and referees. Once the paper is accepted, the pre-specified analysis is applied to the testing sample, and those testing-sample results are the ones published.
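The third-party split described above can be sketched as follows (a minimal illustration with hypothetical helper names; the paper's actual replication materials are Stata files):

```python
# Sketch of the split-sample workflow: an independent third party randomly
# partitions the data, releases the training half, and withholds the testing
# half until the paper is accepted.
import numpy as np

def split_sample(n, train_frac=0.5, seed=0):
    """Randomly partition observation indices into a training set and a
    withheld testing set, as the independent third party would."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    cut = int(n * train_frac)
    return idx[:cut], idx[cut:]  # training indices, withheld testing indices

# Example: 1,000 observations split in half.
train_idx, test_idx = split_sample(1000)
# All exploration, model revision, and referee feedback use train_idx only;
# the pre-specified final analysis is run once on test_idx after acceptance.
print(len(train_idx), len(test_idx))
```

The key design point is that the testing indices never touch the model-building loop, so the final published estimates come from data the researcher has not yet analyzed.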

📊 What the simulations show

  • Under empirically relevant settings, the split-sample method yields greater statistical power than a conventional preanalysis plan (PAP).
  • The power gain arises mainly because relevant hypotheses are less likely to go untested, as they can when the full analysis must be specified before seeing any data.
  • The advantage is strongest in settings where outcomes of interest are uncertain and exploration is common.
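The mechanism behind the bullets above can be reproduced in a toy simulation (illustrative numbers of my own choosing, not the paper's exact design): with several candidate outcomes and only one truly affected, a rigid PAP may pre-register the wrong outcome, while the split-sample researcher identifies the promising outcome on the training half and confirms it on the withheld half.

```python
# Toy simulation: K candidate outcomes, only outcome 0 responds to treatment.
# Compare a PAP that pre-registers one outcome blindly against the
# split-sample procedure (explore on training half, confirm on testing half).
import numpy as np

def t_stat(y, d):
    """Two-sample t statistic for a difference in means."""
    y1, y0 = y[d == 1], y[d == 0]
    se = np.sqrt(y1.var(ddof=1) / len(y1) + y0.var(ddof=1) / len(y0))
    return (y1.mean() - y0.mean()) / se

rng = np.random.default_rng(1)
n, K, effect, z_crit, sims = 400, 5, 0.35, 1.96, 2000
hits_pap = hits_split = 0
for _ in range(sims):
    d = rng.integers(0, 2, n)               # random treatment assignment
    y = rng.standard_normal((n, K))
    y[:, 0] += effect * d                   # only outcome 0 is affected
    # Conventional PAP: one outcome pre-registered before seeing any data.
    k_pap = rng.integers(K)
    hits_pap += abs(t_stat(y[:, k_pap], d)) > z_crit
    # Split sample: pick the strongest outcome on the training half,
    # then test it once on the withheld half.
    half = n // 2
    k_best = max(range(K), key=lambda k: abs(t_stat(y[:half, k], d[:half])))
    hits_split += abs(t_stat(y[half:, k_best], d[half:])) > z_crit
print(f"power, blind PAP:    {hits_pap / sims:.2f}")
print(f"power, split sample: {hits_split / sims:.2f}")
```

Even though the split-sample test uses only half the observations, it rejects far more often here, because the blind PAP usually registers an outcome with no effect. This matches the stated intuition that the advantage is largest when the relevant outcome is uncertain.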

⚖️ When this approach is most and least appropriate

  • Well-suited for exploratory analyses with substantial uncertainty about outcomes and hypotheses.
  • Not recommended when treatments are very costly and available sample size is severely limited, because withholding a testing sample can make inference underpowered.
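The small-sample caveat above amounts to simple power arithmetic (a back-of-the-envelope sketch with assumed effect sizes, not figures from the paper): halving the analysis sample to withhold a testing set costs little power when n is large, but can be decisive when n is already small.

```python
# Normal-approximation power for a difference in means (half the sample
# treated), comparing the full sample to the withheld testing half.
from math import erf, sqrt

def norm_cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

def power(n, effect=0.5, sigma=1.0, z_crit=1.96):
    """Approximate power to detect a standardized mean difference `effect`
    with n observations, half treated, at the 5% level (upper tail only)."""
    se = 2 * sigma / sqrt(n)
    return norm_cdf(effect / se - z_crit)

for n in (60, 200, 1000):
    print(f"n={n:4d}: full-sample power {power(n):.2f}, "
          f"testing-half power {power(n // 2):.2f}")
```

With 1,000 observations the testing half remains well powered; with 60, withholding half the data pushes power well below conventional targets, which is exactly the costly-treatment scenario where the authors advise against the method.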

🔍 How to interpret the method

  • The procedure can be seen as enabling direct replication: the testing sample functions as an independent confirmation of results developed on the training sample.

🛠️ Practical considerations for implementation

  • Requires an independent third party to perform the random split and hold the testing data until acceptance.
  • Allows iterative improvement and feedback on analyses without compromising the credibility of final published estimates.
  • Feasibility issues and implementation logistics (data transfer, pre-specification of analysis on the training set, and journal workflows) are discussed in detail.

Why it matters

This split-sample protocol offers a pragmatic middle ground between exploratory work and strict preanalysis plans: it preserves opportunities for refinement and feedback while producing published results that come from an independent test, improving credibility and—under many realistic conditions—statistical power.

Article Card
Using Split Samples to Improve Inference on Causal Effects was authored by Marcel Fafchamps and Julien Labonne. It was published by Cambridge University Press in Political Analysis in 2017.
Find on Google Scholar
Find on JSTOR
Find on CUP