Predicting Race from Voter Rolls Cuts Bias in Turnout Estimates
Insights from the Field
ecological inference
voter registration
Bayesian prediction
surname list
turnout
Methodology
Pol. An.
30 R files
2 Text
Dataverse
"Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records" was authored by Kabir Khanna and Kosuke Imai. It was published by Cambridge in Political Analysis in 2016.

🔎 The problem addressed

Ecological inference—inferring turnout and vote choice for racial groups from aggregate election returns and neighborhood racial composition—can generate aggregation bias that distorts race-specific turnout and vote-share estimates. A different strategy is to predict individual-level ethnicity from voter registration records and then aggregate those predictions.

🧾 How ethnicity was predicted

  • Bayes's rule was used to combine the Census Bureau's Surname List with information drawn from geocoded voter registration records.
  • The approach produces probabilistic ethnicity assignments for individual registrants by merging surname-based priors with location-based evidence.
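
The update described in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released software. It assumes that surname and location are conditionally independent given ethnicity, so the posterior is proportional to P(ethnicity | surname) times P(location | ethnicity). The function name `posterior_ethnicity`, the group labels, and every number below are hypothetical.

```python
def posterior_ethnicity(surname_prior, geo_likelihood):
    """Combine a surname-based prior with location evidence via Bayes's rule.

    surname_prior:  dict mapping group -> P(group | surname)
    geo_likelihood: dict mapping group -> P(location | group)
    Assumes (for illustration) that surname and location are conditionally
    independent given the group.
    """
    unnormalized = {
        group: surname_prior[group] * geo_likelihood[group]
        for group in surname_prior
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all groups have zero probability; cannot normalize")
    return {group: value / total for group, value in unnormalized.items()}


# Hypothetical inputs for one registrant (not taken from the paper).
surname_prior = {"white": 0.60, "black": 0.30, "latino": 0.10}
geo_likelihood = {"white": 0.001, "black": 0.004, "latino": 0.001}

posterior = posterior_ethnicity(surname_prior, geo_likelihood)
most_likely = max(posterior, key=posterior.get)
```

In this toy case the surname alone points to one group, but the location evidence shifts most of the probability mass to another, which is the kind of correction that merging the two sources is meant to provide.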

📊 How the method was evaluated

  • Validation used approximately nine million Florida voter registration records where self-reported ethnicity is available.
  • Predicted ethnicities were compared to self-reports to measure true and false positive rates, and predictions were used to estimate turnout by race for comparison with standard ecological inference estimates.
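
The comparison against self-reports can be sketched as follows. This is a rough illustration that assumes each registrant has already been assigned a single predicted group (for example, the one with the highest predicted probability, which is an assumption here). The helper `rates_for_group` and the six toy records are hypothetical.

```python
def rates_for_group(predicted, self_reported, group):
    """Return (true positive rate, false positive rate) for one group.

    TPR: share of registrants who self-report `group` and are predicted as it.
    FPR: share of registrants who do not self-report `group` but are
         predicted as it.
    """
    true_pos = false_pos = positives = negatives = 0
    for pred, truth in zip(predicted, self_reported):
        if truth == group:
            positives += 1
            true_pos += pred == group
        else:
            negatives += 1
            false_pos += pred == group
    tpr = true_pos / positives if positives else float("nan")
    fpr = false_pos / negatives if negatives else float("nan")
    return tpr, fpr


# Hypothetical toy data (not from the Florida voter file).
self_reported = ["black", "black", "white", "latino", "white", "black"]
predicted = ["black", "white", "white", "latino", "black", "black"]

tpr, fpr = rates_for_group(predicted, self_reported, "black")
```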

✅ Key findings

  • False positive rates were reduced to 6% for Black voters and 3% for Latino voters.
  • True positive rates remained above 80% across groups.
  • Turnout-by-race estimates derived from the predictions exhibited substantially lower bias and root mean squared error than standard ecological inference estimates.
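
For readers unfamiliar with the two error measures named above, the sketch below computes bias (mean error) and root mean squared error for a set of turnout estimates against known turnout. The function `bias_and_rmse` and the three toy values are hypothetical and do not reproduce any figure from the paper.

```python
import math


def bias_and_rmse(estimates, truths):
    """Return (bias, RMSE) of estimates relative to the true values."""
    if len(estimates) != len(truths) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [est - truth for est, truth in zip(estimates, truths)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err ** 2 for err in errors) / len(errors))
    return bias, rmse


# Hypothetical turnout rates for one group in three areas.
true_turnout = [0.50, 0.60, 0.40]
estimated_turnout = [0.53, 0.58, 0.44]

bias, rmse = bias_and_rmse(estimated_turnout, true_turnout)
```

Bias captures whether the estimates run systematically high or low, while RMSE also penalizes errors that cancel out on average, so the two are usually reported together.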

🛠️ Practical output

Open-source software is provided to implement the proposed methodology, enabling replication and application to other voter files.

💡 Why it matters

Predicting individual ethnicity from voter registration records offers a practical, data-driven alternative to traditional ecological inference, reducing aggregation bias in race-specific turnout estimates with clear applications for political behavior research and voting-rights litigation.
