
🔎 What Was Compared:
This study compares human- and machine-geocoded records of human rights violations in Colombia against an independent ground-truth source to assess external validity. Agreement rates between each geocoding approach and the ground truth are evaluated for an eight-year focal period, for three consecutive two-year subperiods, and for municipalities that are journalistically remote versus those that are not.
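A minimal sketch of how such agreement rates could be computed, assuming event records keyed by a shared event ID with a municipality-level location. The field names and toy values below are hypothetical, not from the study's actual data.

```python
# Hypothetical sketch: share of events whose geocoded municipality
# matches an independent ground-truth source.

def agreement_rate(records, truth):
    """Fraction of events (present in both sources) with matching locations.

    records: dict mapping event_id -> geocoded municipality
    truth:   dict mapping event_id -> ground-truth municipality
    """
    shared = [e for e in records if e in truth]
    if not shared:
        return float("nan")
    matches = sum(records[e] == truth[e] for e in shared)
    return matches / len(shared)

# Toy example: three events, one geocoding disagreement.
machine = {"e1": "Bogota", "e2": "Cali", "e3": "Medellin"}
ground  = {"e1": "Bogota", "e2": "Cali", "e3": "Cartagena"}

print(agreement_rate(machine, ground))  # 2 of 3 shared events agree
```

The same function could be applied to subsets of events (by subperiod, or by journalistically remote vs. non-remote municipalities) to reproduce the kinds of disaggregated comparisons described above.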
📊 How The Data Were Tested:
🧮 How The Models Compared Predictive Performance:
Spatial probit models were estimated separately on each of the three datasets to compare predictive patterns. These models incorporate Gaussian Markov Random Field (GMRF) error processes, are constructed via a stochastic partial differential equation (SPDE) approach, and are estimated using integrated nested Laplace approximation (INLA). The models test whether the datasets yield comparable spatial predictions of violations and similarly structured prediction errors.
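To illustrate the error structure such models assume, the following simulation sketch generates binary outcomes from a probit-style latent variable whose errors are spatially correlated. This is not the paper's estimation procedure (which uses SPDE/INLA); the coordinates, covariance range, and coefficients are made-up values for illustration only.

```python
# Illustrative simulation of a probit data-generating process with
# spatially correlated errors (a stand-in for the GMRF error process).
import numpy as np

rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(0, 10, size=(n, 2))            # hypothetical unit centroids
dist = np.linalg.norm(coords[:, None] - coords[None], axis=-1)
Sigma = np.exp(-dist / 2.0)                          # exponential spatial covariance
eps = rng.multivariate_normal(np.zeros(n), Sigma)    # spatially correlated errors
x = rng.normal(size=n)                               # one illustrative covariate
latent = 0.5 + 1.0 * x + eps                         # latent propensity
y = (latent > 0).astype(int)                         # observed violation indicator

print(y.shape, y.mean())
```

Because the errors share a spatial covariance, prediction mistakes from a model fit to such data tend to cluster in space, which is exactly the kind of spatially structured error the study examines.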
🔑 Key Findings:
🌍 Why It Matters:
These results caution researchers and practitioners: machine-geocoded event data can be externally valid at the subnational level, but spatially structured prediction errors may affect inference and mapping of conflict risk. Choosing between human and machine geocoding should consider not only agreement with ground truth but also how geocoding method shapes spatial error patterns and subsequent model-based predictions.

*Human Rights Violations in Space: Assessing the External Validity of Machine Geo-coded vs. Human Geo-coded Data* was authored by Logan Stundal, Benjamin Bagozzi, John Freeman, and Jennifer Holmes. It was published by Cambridge University Press in *Political Analysis* in 2022.
