
Machine Coded Data? Not Always Better Than Human Coding for Underreporting Bias

underreporting bias, state repression events, Agence France-Presse, Associated Press, machine coding, Methodology, @PSR&M, Dataverse

This research investigates a common problem in textual political science data: underreporting bias. News sources often fail to report state repression events, and similar issues can occur with human coders.

Using the Agence France-Presse and Associated Press news datasets as examples, Cook et al.'s method estimates the extent of unreported repression by comparing multiple sources' coverage.
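
To give a rough sense of how comparing two sources' coverage can reveal unreported events, here is a minimal sketch. It is not Cook et al.'s estimator: it swaps in a simple two-source capture-recapture (Lincoln-Petersen) calculation, which assumes the two sources report events independently of each other. The function names and the counts are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n_a: int, n_b: int, n_both: int) -> float:
    """Estimate the total number of events from two sources' counts.

    n_a    -- events reported by source A (hypothetical count)
    n_b    -- events reported by source B (hypothetical count)
    n_both -- events reported by both sources
    Assumes the two sources report independently (an illustrative assumption).
    """
    if n_both <= 0:
        raise ValueError("overlap must be positive to estimate the total")
    return n_a * n_b / n_both


def unreported_share(n_a: int, n_b: int, n_both: int) -> float:
    """Estimated share of events that neither source reported."""
    total = lincoln_petersen(n_a, n_b, n_both)
    observed = n_a + n_b - n_both  # events seen by at least one source
    return 1 - observed / total


# Hypothetical counts, not taken from the article.
estimated_total = lincoln_petersen(60, 50, 30)
missing = unreported_share(60, 50, 30)
print(f"estimated total events: {estimated_total:.0f}")
print(f"estimated unreported share: {missing:.2f}")
```

The smaller the overlap between the two sources relative to their individual counts, the larger the estimated number of events that neither source reported.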

The authors applied this technique to machine-coded data from the World-Integrated Crisis Early Warning System dataset. Both models (human-coded and machine-coded) were then evaluated against external measures of human rights protections in Africa and Colombia.

The findings indicate that underreporting bias affects both forms of data collection to a similar degree across the contexts examined, including Colombia.

This means researchers must actively account for potentially missing information, whether they are analyzing news reports or algorithmically coded texts.

The Prevalence and Severity of Underreporting Bias in Machine and Human Coded Data was authored by Benjamin Bagozzi, Patrick Brandt, John Freeman, Jennifer Holmes, Alisha Kim, Agustin Palao Mendizabal and Carly Potz-Nielsen. It was published by Cambridge in PSR&M in 2019.
Political Science Research & Methods