🔎 What This Paper Reconsiders
There is an active debate about whether human rights standards have shifted over the past 30 years. Existing evidence rests on indicators produced by human coders who read the texts of human rights reports. This work reframes that debate as a supervised learning problem: if coders apply consistent standards over time, the mapping from textual features to human-coded scores should be time‑constant; if meanings have changed, identical textual features will receive different numerical scores at different times.
🧭 Reframing the Measurement Problem
- Treating coding as a prediction task makes the implications of stable versus changing standards explicit.
- Under a time-constant standard, models trained on older reports should generalize to newer reports (and vice versa).
- If standards evolve, the same textual cues will be labeled differently at different times, producing systematic changes in model performance across time.
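The contrast between the two data-generation processes can be made concrete with a toy sketch. This is a hypothetical illustration, not the paper's model: `violation_cues` stands in for textual features, and the drifting threshold is an invented example of a standard tightening over time.

```python
# Hypothetical sketch of the two data-generating processes.
# Under a constant standard, the coded score depends only on the text;
# under a drifting standard, the same text maps to different scores by era.

def score_constant(features):
    # Toy rule: the count of violation-related cues determines the score.
    return min(2, features["violation_cues"] // 3)

def score_drifting(features, year):
    # Invented drift: later coders apply a stricter cue threshold,
    # so identical text earns a harsher (higher) score over time.
    threshold = 3 if year < 2000 else 2
    return min(2, features["violation_cues"] // threshold)

text = {"violation_cues": 4}    # the "same report" observed in two eras
same_old = score_constant(text)
same_new = score_constant(text)          # identical under a stable standard
drift_old = score_drifting(text, 1985)
drift_new = score_drifting(text, 2010)   # differs when the standard moves
```

A model trained to reproduce `score_constant` generalizes across eras; one trained on labels from `score_drifting` cannot, which is exactly the asymmetry the prediction framing exploits.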
🛠️ How Algorithms Were Tested Over Time
- A wide variety of supervised learning algorithms were trained to map report text to human-coded scores.
- Models were trained on older subsets of observations and on newer subsets to compare out-of-sample accuracy patterns across time.
- The analysis recognizes that the mapping from natural language to numerical scores is complex; rather than assuming a simple rule, the approach uses aggregate patterns of predictive accuracy to distinguish the two data-generation processes.
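The testing logic above can be sketched in miniature. This is a stylized simulation under stated assumptions, not the paper's actual pipeline: reports are reduced to a single cue count, coding standards to a threshold, and the "model" to the best-fitting threshold classifier. The point is only to show the accuracy signature that distinguishes stable from changing standards.

```python
import random

random.seed(0)

def make_era(n, strict):
    """Simulate n reports: x = count of violation cues (0-10);
    y = human-coded score under a standard with cue threshold `strict`."""
    xs = [sum(random.random() < 0.5 for _ in range(10)) for _ in range(n)]
    ys = [int(x >= strict) for x in xs]
    return xs, ys

def accuracy(t, xs, ys):
    """Share of labels reproduced by the threshold-t classifier."""
    return sum(int(x >= t) == y for x, y in zip(xs, ys)) / len(xs)

def fit_threshold(xs, ys):
    """'Train' a model: pick the threshold that best fits the labels."""
    return max(range(12), key=lambda t: accuracy(t, xs, ys))

# Identical text distribution in both eras, but the (hypothetical)
# standard tightens: the same cue count earns a harsher label later.
x_old, y_old = make_era(2000, strict=7)  # older, more lenient era
x_new, y_new = make_era(2000, strict=5)  # newer, stricter era

t_old = fit_threshold(x_old, y_old)

# Within-era accuracy stays high; cross-era accuracy drops. This
# asymmetry is the aggregate signature of a changing standard.
within = accuracy(t_old, x_old, y_old)
cross = accuracy(t_old, x_new, y_new)
```

Had both eras been generated with the same `strict` value, `within` and `cross` would be comparable; the gap between them is what a time-constant standard rules out.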
📈 Key Results
- Models trained on older observations and models trained on newer observations yield systematically different out-of-sample accuracy patterns across time.
- These divergent accuracy patterns are consistent with the expectation that human rights standards, as reflected in coding decisions, have changed over time.
❗ Why This Matters
- The supervised learning perspective provides a novel, testable lens on debates about measurement change in human rights research.
- Findings imply caution when comparing human-coded indicators across decades and suggest that shifts in coder interpretation can be detected through changes in algorithmic performance.