Lost in Translation? New Algorithm Fixes Geolocation Accuracy

Geocoded Data | Machine Learning | News Media | Text Analysis | Methodology | @PSR&M | 1 R file | 17 datasets | Dataverse

This study introduces a novel two-stage supervised machine-learning algorithm designed to improve geolocation accuracy in event data.

### Data & Methods ###

* Extracts contextual information from texts, including n-gram patterns around location words, how often each location is mentioned, and the context of the surrounding sentences.

* Uses training datasets built from news articles from around the world to estimate model parameters.

* Applies the trained model to test data to predict whether a location word refers to the place where the event actually occurred.
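The three steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the specific features, the toy article, and the use of a plain logistic regression as the classifier are all assumptions made for illustration, and the sketch does not attempt to reproduce the paper's two-stage design.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def location_features(text, location, window=1):
    """Context features for one candidate location word (hypothetical feature set)."""
    sentences = split_sentences(text)
    tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", text)]
    loc = location.lower()
    positions = [i for i, t in enumerate(tokens) if t == loc]

    features = Counter()
    # How often the location is mentioned in the article.
    features["mention_count"] = len(positions)
    # Sentence context: does the location appear in the opening sentence?
    features["in_first_sentence"] = int(bool(sentences) and loc in sentences[0].lower())
    # N-gram pattern: the `window` words immediately before each mention.
    for i in positions:
        left = tokens[max(0, i - window):i]
        if left:
            features["left=" + "_".join(left)] += 1
    return dict(features)


def train_logistic(rows, labels, epochs=200, lr=0.5):
    """Fit a logistic regression by stochastic gradient ascent (stand-in classifier)."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in zip(rows, labels):
            z = bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())
            err = y - 1.0 / (1.0 + math.exp(-z))
            bias += lr * err
            for k, v in feats.items():
                weights[k] = weights.get(k, 0.0) + lr * err * v
    return weights, bias


def predict_proba(model, feats):
    """Estimated probability that the location word is the event's actual place."""
    weights, bias = model
    z = bias + sum(weights.get(k, 0.0) * v for k, v in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


# Toy, hand-labelled article invented for this example.
article = (
    "Protesters marched in Cairo on Friday. "
    "Officials in Cairo blamed groups funded from London."
)
cairo = location_features(article, "Cairo")    # labelled 1: where the event happened
london = location_features(article, "London")  # labelled 0: mentioned only in passing
model = train_logistic([cairo, london], [1, 0])
```

Calling `predict_proba(model, location_features(new_text, "SomeCity"))` on an unseen article would then score a candidate location. With only two training rows the weights are meaningless in practice; the point is only to show how context features can separate an event's location from an incidental mention.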

### Key Findings ###

* The algorithm successfully identifies inaccuracies in location mentions by analyzing surrounding text.

* It outperforms existing geocoders, even on news articles it has not seen before.

### Why This Matters for Political Science ###

* Accurate event geolocation is crucial for tracking political phenomena and trends across countries.

* This approach provides a reliable method for enhancing the precision of automated text analysis in political research.


Lost in Space: Geolocation in Event Data was written by Sophie J. Lee, Howard Liu, and Michael D. Ward. It was published by Cambridge University Press in Political Science Research and Methods (PSR&M) in 2019.