Recovering Time-Varying Population Distributions From Sparse Marginal Data
Insights from the Field
ecological inference
poststratification
Dirichlet
time-series
estsubpop
Methodology
Pol. An.
1 archives
Dataverse
"Dynamic Ecological Inference for Time-Varying Population Distributions Based on Sparse, Irregular, and Noisy Marginal Data" was authored by Devin Caughey and Mallory Wang. It was published by Cambridge University Press in Political Analysis in 2019.

🔍 Problem and Motivation

Social scientists often need time-varying joint distributions—for example, to construct poststratification weights—but population data across time are typically sparse, irregular, and noisy. When different variables are observed on different schedules or only margins (not full joint distributions) are available, survey weights are frequently limited to the small subset of auxiliary variables with regularly observed joint data, leaving other useful information unused.

🧰 Model and Approach

A dynamic Bayesian ecological inference model is developed to estimate multivariate categorical distributions from sparse, irregular, and noisy marginal (or partially joint) data. The method combines three core components:

  • A Dirichlet sampling model for the observed margins conditional on the unobserved cell proportions.
  • A set of equations that encode the logical relationships among different population quantities.
  • A Dirichlet transition model for period-specific proportions that pools information across time periods.
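To make the three components concrete, here is a minimal forward simulation in Python with NumPy. It is only a sketch of the data-generating story as summarized above, not the paper's estimator and not the estsubpop interface: it draws cell proportions, derives margins from them, and then adds noise, whereas the actual method works in the opposite direction, inferring the cells from the observed margins. The table size, the concentration values, the observation schedule, and all names are assumptions made for illustration.

```python
import numpy as np

# All numeric settings below are hypothetical, chosen only for illustration.
N_PERIODS = 6
N_ROWS, N_COLS = 2, 3      # categories of two variables, i.e. a 2 x 3 table
TRANSITION_CONC = 200.0    # larger values mean smoother change over time
SAMPLING_CONC = 500.0      # larger values mean less noisy observed margins
EPS = 1e-8                 # keeps Dirichlet parameters strictly positive


def simulate(rng):
    """Forward-simulate cell proportions and noisy, irregular margins."""
    cells = np.empty((N_PERIODS, N_ROWS, N_COLS))
    # First period: a flat Dirichlet over all cells.
    cells[0] = rng.dirichlet(np.ones(N_ROWS * N_COLS)).reshape(N_ROWS, N_COLS)
    for t in range(1, N_PERIODS):
        # Dirichlet transition: centered on the previous period's proportions,
        # which is what ties adjacent periods together.
        alpha = TRANSITION_CONC * cells[t - 1].ravel() + EPS
        cells[t] = rng.dirichlet(alpha).reshape(N_ROWS, N_COLS)

    # Logical relationships: each margin is a sum of interior cells.
    row_margins = cells.sum(axis=2)   # shape (N_PERIODS, N_ROWS)
    col_margins = cells.sum(axis=1)   # shape (N_PERIODS, N_COLS)

    # Dirichlet sampling model: observed margins scatter around true margins.
    obs_rows = np.array([rng.dirichlet(SAMPLING_CONC * m + EPS) for m in row_margins])
    obs_cols = np.array([rng.dirichlet(SAMPLING_CONC * m + EPS) for m in col_margins])

    # Irregular observation schedules: blank out periods with no data.
    obs_rows[[1, 2, 4]] = np.nan      # row variable seen in periods 0, 3, 5
    obs_cols[[0, 3]] = np.nan         # column variable seen in periods 1, 2, 4, 5
    return cells, row_margins, col_margins, obs_rows, obs_cols


cells, row_margins, col_margins, obs_rows, obs_cols = simulate(np.random.default_rng(0))
```

In this toy setup the two variables are never observed jointly, and in periods 0 and 3 only one of them is observed at all, so any statement about interior cells has to borrow strength from the transition model and from the summation constraints.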

📊 Illustration and Implementation

The method is illustrated by estimating annual U.S. phone-ownership rates by race and region using population data irregularly available between 1930 and 1960. An R package, estsubpop, implements the method to facilitate applied use and replication.

💡 Why It Matters

  • Enables dynamic ecological inferences about interior cells when only marginal or partially joint observations exist.
  • Pools information across time to improve estimates of period-specific multivariate categorical distributions.
  • Allows fuller use of auxiliary information for tasks such as survey poststratification where joint population distributions are intermittently observed.
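As a rough illustration of the poststratification use case, the sketch below shows the standard arithmetic once joint cell proportions for a given period are in hand: cell-level survey means are reweighted by population shares. Every number and name here is made up for the example, and nothing in it comes from the paper's data or from the estsubpop package.

```python
import numpy as np

# Hypothetical estimated population shares for one period (2 x 3 cells).
pop_cells = np.array([[0.30, 0.25, 0.15],
                      [0.10, 0.12, 0.08]])
# Hypothetical survey means of some outcome within each cell.
cell_means = np.array([[0.60, 0.55, 0.40],
                       [0.35, 0.30, 0.20]])
# Hypothetical respondent counts per cell; the sample over-represents row 2.
sample_counts = np.array([[50, 30, 20],
                          [40, 40, 20]])


def poststratify(pop_shares, means):
    """Population-share-weighted average of the cell means."""
    shares = pop_shares / pop_shares.sum()   # guard against rounding drift
    return float((shares * means).sum())


# Naive estimate: weights each cell by its share of the sample.
unweighted = float((sample_counts * cell_means).sum() / sample_counts.sum())

# Poststratified estimate: weights each cell by its share of the population.
poststratified = poststratify(pop_cells, cell_means)

# Equivalent respondent-level weights: population share over sample share.
weights = pop_cells / (sample_counts / sample_counts.sum())
weighted = float((weights * sample_counts * cell_means).sum()
                 / (weights * sample_counts).sum())
```

The last two quantities agree by construction, which is the sense in which joint population distributions translate directly into survey weights. When the joint distribution is unobserved in a given period, an estimate of it would take the place of `pop_cells`.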

This approach provides a flexible, principled way to reconstruct time-varying multivariate categorical distributions from incomplete marginal data, expanding the set of auxiliary variables usable in longitudinal survey weighting and other population-inference tasks.

Political Analysis