
🔎 What This Guide Does
This guide walks through the consequential decisions required before producing automated measures from news text, combining theoretical discussion with empirical tests. A running example—measuring the tone of New York Times coverage of the economy—illustrates how seemingly routine choices reshape the resulting data and the inferences researchers draw from them.
🧾 Running Example: Measuring NYT Economic Coverage
- Uses New York Times articles about the economy as the empirical case to demonstrate practical implications.
- Examines how different corpus construction and coding choices affect measures of tone.
🧭 How Choices Were Tested and Compared
- Both theoretical arguments and empirical comparisons are used to assess impacts of methodological decisions.
- Key dimensions evaluated include corpus selection, unit of analysis for coding, allocation of coding effort, and classification method (supervised algorithms versus dictionaries).
📌 Key Findings
- Two reasonable approaches to corpus selection can produce radically different corpora, changing downstream measures and conclusions.
- Keyword searches are recommended over predefined subject categories provided by news archives, because archive categories can yield inconsistent or misleading corpora.
- Coding article segments (larger text chunks) provides clear benefits compared to sentence-level coding.
- Given a fixed total number of codings, it is better to code more unique documents than to assign more coders per document.
- Supervised machine learning classifiers outperform dictionary-based approaches on multiple criteria.
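The fixed-budget finding above can be illustrated with a simple variance decomposition. Assuming the target is the corpus-level mean tone, that true tone varies across documents, and that coder errors are independent (the variance values and budget below are hypothetical, not taken from the guide), the variance of the estimated mean under n documents with k coders each is σ²_doc/n + σ²_coder/(n·k). With a fixed budget B = n·k, the coder-noise term is constant, so spreading codings over more documents always reduces error:

```python
# Sketch of the coding-budget tradeoff, assuming a fixed budget of
# B = n_docs * coders_per_doc codings and independent coder noise.
# Variance components are hypothetical, purely for illustration.
SIGMA2_DOC = 1.0     # variance of true tone across documents
SIGMA2_CODER = 0.5   # variance of an individual coder's error

def var_of_mean(n_docs, coders_per_doc):
    """Variance of the estimated corpus-mean tone:
    document-sampling noise plus averaged coder noise."""
    return SIGMA2_DOC / n_docs + SIGMA2_CODER / (n_docs * coders_per_doc)

# Same budget of 1000 codings, two allocations:
spread = var_of_mean(1000, 1)   # 1000 docs, each coded once
stacked = var_of_mean(200, 5)   # 200 docs, five coders each

print(f"1000 docs x 1 coder:  {spread:.5f}")
print(f" 200 docs x 5 coders: {stacked:.5f}")
# The coder-noise term SIGMA2_CODER / (n_docs * coders_per_doc) equals
# SIGMA2_CODER / budget in both allocations, so only the
# document-sampling term SIGMA2_DOC / n_docs differs.
```

Under these assumptions, single-coding more documents dominates multi-coding fewer ones whenever the goal is a corpus-level estimate; extra coders per document only shrink the term that the fixed budget already caps.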
💡 Practical Recommendations and Takeaway
- Prioritize careful corpus construction (favor keyword-based retrieval), choose segment-level units for coding when appropriate, allocate coding resources to increase document coverage, and favor supervised learning with human validation.
- Thoughtfulness and human validation remain essential; automated classification is easy to run but can mislead when methodological choices go unexamined.
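The contrast between the two classification approaches can be sketched in miniature. The toy headlines, word lists, and labels below are hypothetical stand-ins, not material from the guide: a dictionary scorer applies fixed positive/negative word lists, while a supervised classifier (here a minimal multinomial Naive Bayes with add-one smoothing) learns tone from human-coded examples instead:

```python
from collections import Counter, defaultdict
import math

# Hypothetical human-coded training examples (invented for illustration).
TRAIN = [
    ("jobs report shows strong hiring and wage growth", "positive"),
    ("unemployment falls as factories expand output", "positive"),
    ("markets rally on upbeat consumer spending data", "positive"),
    ("layoffs mount as recession fears deepen", "negative"),
    ("inflation surges and growth slows sharply", "negative"),
    ("housing market slumps amid rising foreclosures", "negative"),
]

# Dictionary approach: count matches against fixed word lists.
POS_WORDS = {"strong", "growth", "rally", "expand", "upbeat"}
NEG_WORDS = {"layoffs", "recession", "slumps", "surges", "fears"}

def dictionary_tone(text):
    words = text.lower().split()
    score = sum(w in POS_WORDS for w in words) - sum(w in NEG_WORDS for w in words)
    return "positive" if score >= 0 else "negative"

# Supervised approach: multinomial Naive Bayes trained on the coded examples.
def train_nb(examples):
    counts = defaultdict(Counter)   # label -> word frequencies
    priors = Counter()              # label -> document counts
    vocab = set()
    for text, label in examples:
        priors[label] += 1
        for w in text.lower().split():
            counts[label][w] += 1
            vocab.add(w)
    return counts, priors, vocab

def classify_nb(model, text):
    counts, priors, vocab = model
    n_docs = sum(priors.values())
    best, best_lp = None, -math.inf
    for label in priors:
        total = sum(counts[label].values())
        lp = math.log(priors[label] / n_docs)
        for w in text.lower().split():
            # Add-one smoothing so unseen words do not zero out a class.
            lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train_nb(TRAIN)
print(dictionary_tone("layoffs mount amid recession fears"))
print(classify_nb(model, "strong hiring and wage growth"))
```

The design difference the guide emphasizes is visible even at this scale: the dictionary's word lists are fixed in advance and never adapt to the corpus, whereas the supervised model's evidence comes entirely from human-coded documents, which is also what makes human validation of its output natural.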