Insights from the Field

Train Once, Classify Across: Party Platforms Predict Speech Topics


cross-domain
topic classification
supervised learning
party platforms
parliamentary speeches
Methodology
Pol. An.
2 archives
Dataverse
"Cross-Domain Topic Classification for Political Texts" was authored by Moritz Osnabrügge, Elliott Ash and Massimo Morelli. It was published by Cambridge University Press in Political Analysis in 2023.

Supervised cross-domain topic classification trains a model on a labeled source corpus and applies it to an unlabeled target corpus from a different domain. Because it reuses existing labeled data, the approach takes less effort than collecting new within-domain training data, and it yields topics that are more clearly defined and better targeted to the research question than those from unsupervised methods.

🔍 The Approach

  • An algorithm is trained to classify topics in a labeled source corpus and then extrapolates those topic labels to documents in an unlabeled target corpus from another domain.

📚 Data Used: Party Platforms to Parliamentary Speeches

  • Source corpus: labeled party platforms.
  • Target corpus: unlabeled parliamentary speeches.
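
The train-on-source, predict-on-target workflow can be sketched in a few lines. This is a minimal illustration only: it uses a small bag-of-words naive Bayes classifier written from scratch, which is not necessarily the model the authors used, and the platform sentences, speeches, and topic names are invented toy data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Fit a multinomial naive Bayes model on the labeled source corpus."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, doc):
    """Return the topic with the highest posterior log-probability."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for token in tokenize(doc):
            # Words never seen in the source domain carry no information here.
            if token in model["vocab"]:
                # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Source corpus: labeled party-platform sentences (invented toy examples).
platform_sentences = [
    "we will lower taxes and balance the budget",
    "cut income taxes to grow the economy",
    "invest in hospitals and nurses",
    "expand public health care for patients",
]
platform_topics = ["economy", "economy", "health", "health"]

# Target corpus: unlabeled parliamentary speeches (invented toy examples).
speeches = [
    "the minister must explain why hospitals lack nurses",
    "this budget raises taxes on working families",
]

model = train_naive_bayes(platform_sentences, platform_topics)
predicted_topics = [predict(model, speech) for speech in speeches]
print(predicted_topics)
```

The point of the sketch is that the model never sees a labeled speech: every topic assignment in the target corpus rests on vocabulary learned from the platforms, so words that appear only in speeches are ignored.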

🧪 How Performance Was Evaluated

  • Standard within-domain error metrics were reported.
  • Cross-domain performance was validated further: a subset of target-corpus documents was labeled by hand and compared against the classifier's assignments.
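
The comparison of hand labels with classifier assignments might look like the following sketch. The function name `agreement_by_topic` and the label lists are hypothetical, chosen for illustration; the metrics the authors actually report may differ.

```python
from collections import Counter


def agreement_by_topic(hand_labels, predicted_labels):
    """Return overall accuracy and, for each hand-coded topic, the share
    of its documents that the classifier assigned to the same topic."""
    if not hand_labels or len(hand_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    totals = Counter(hand_labels)
    correct = Counter(h for h, p in zip(hand_labels, predicted_labels) if h == p)
    overall = sum(correct.values()) / len(hand_labels)
    per_topic = {topic: correct[topic] / n for topic, n in totals.items()}
    return overall, per_topic


# Invented toy labels for a hand-coded subset of the target corpus.
hand_coded = ["economy", "economy", "health", "health", "education", "education"]
classifier = ["economy", "economy", "health", "economy", "health", "education"]

overall_accuracy, topic_accuracy = agreement_by_topic(hand_coded, classifier)
print(round(overall_accuracy, 3), topic_accuracy)
```

Reporting the per-topic shares alongside the overall figure matters because a single accuracy number can hide topics that transfer poorly across domains.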

📈 Key Findings

  • The classifier can accurately assign topics in parliamentary speeches.
  • Accuracy varies substantially by topic, indicating some topics transfer better across domains than others.
  • Using existing labeled data makes this method substantially more efficient than training new within-domain supervised models.
  • Compared with unsupervised topic models, the supervised cross-domain method can be more precisely targeted to a research question and yields topics that are easier to validate and interpret.

⚙️ Tools and Applications

  • Diagnostic tools are proposed to evaluate when cross-domain classification will perform well and to identify problematic topics.
  • Two case studies illustrate substantive use: how electoral rules and the gender of parliamentarians influence the choice of speech topics.
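
This summary does not describe the proposed diagnostics in detail. As one generic way to spot problematic topics, and not necessarily a check the authors use, the disagreements between hand labels and classifier assignments can be tabulated by topic pair. The function name `most_confused_pairs` and the data below are hypothetical.

```python
from collections import Counter


def most_confused_pairs(hand_labels, predicted_labels, top_n=3):
    """Count (hand-coded topic, predicted topic) pairs that disagree and
    return the most frequent ones."""
    errors = Counter(
        (h, p) for h, p in zip(hand_labels, predicted_labels) if h != p
    )
    return errors.most_common(top_n)


# Invented toy labels for a hand-coded validation subset.
hand_coded = ["economy", "health", "health", "health", "education", "education"]
classifier = ["economy", "economy", "economy", "health", "health", "education"]

confused = most_confused_pairs(hand_coded, classifier)
print(confused)
```

A pair that dominates this list suggests two topics whose vocabulary overlaps in the target domain, which is a candidate for merging categories or for closer manual review.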

💡 Why It Matters

  • Enables reuse of labeled resources to extend topic measurement across domains, saving time and improving interpretability.
  • Provides a practical workflow and diagnostics for researchers studying political texts across different institutional contexts.