Supervised cross-domain topic classification trains a model on a labeled source corpus and applies it to an unlabeled target corpus from a different domain. This approach leverages existing labeled data to reduce effort compared with collecting new within-domain training data, while offering clearer, research-targeted topics than unsupervised methods.
🔍 The Approach
- A supervised classifier is trained to assign topics in a labeled source corpus and then used to extrapolate those topic labels to documents in an unlabeled target corpus from another domain.
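As a rough illustration of this train-then-transfer step, the minimal sketch below fits a simple text classifier on a toy labeled "platform" corpus and applies it to toy "speech" texts. The scikit-learn pipeline, the column names, and the example documents are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal cross-domain sketch (toy data; not the paper's actual pipeline).
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Labeled source corpus: party-platform sentences with topic codes (hypothetical).
platforms = pd.DataFrame({
    "text": ["Expand public health insurance.", "Fund more rural hospitals.",
             "Cut taxes for small businesses.", "Balance the federal budget."],
    "topic": ["health", "health", "economy", "economy"],
})
# Unlabeled target corpus: parliamentary speech excerpts (hypothetical).
speeches = pd.DataFrame({
    "text": ["Hospital waiting lists keep growing.",
             "This budget raises taxes on working families."],
})

# Train on the labeled source domain...
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(platforms["text"], platforms["topic"])

# ...then extrapolate topic labels to the unlabeled target domain.
speeches["predicted_topic"] = clf.predict(speeches["text"])
print(speeches)
```

Any probabilistic text classifier could stand in for the logistic regression here; the essential move is fitting on the labeled domain and predicting on the unlabeled one.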
📚 Data Used: Party Platforms to Parliamentary Speeches
- Source corpus: labeled party platforms.
- Target corpus: unlabeled parliamentary speeches.
🧪 How Performance Was Evaluated
- Standard within-domain error metrics were reported.
- Cross-domain performance was additionally validated by manually labeling a subset of the target-corpus documents and comparing those hand labels against the classifier's assignments.
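A hedged sketch of both checks, reusing the toy setup from the first snippet; the `hand_label` column standing in for the manually coded target subset is hypothetical, as are the metrics and data.

```python
# Illustrative evaluation sketch (toy data; shown only to make the two
# validation steps concrete).
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.metrics import accuracy_score

platforms = pd.DataFrame({
    "text": ["Expand public health insurance.", "Fund more rural hospitals.",
             "Cut taxes for small businesses.", "Balance the federal budget."],
    "topic": ["health", "health", "economy", "economy"],
})
# Hypothetical manually coded subset of the target corpus.
speeches_labeled = pd.DataFrame({
    "text": ["Hospital waiting lists keep growing.",
             "This budget raises taxes on working families."],
    "hand_label": ["health", "economy"],
})

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# (1) Standard within-domain error: cross-validated accuracy on the source corpus.
cv_acc = cross_val_score(clf, platforms["text"], platforms["topic"], cv=2)
print("within-domain CV accuracy:", cv_acc.mean())

# (2) Cross-domain validation: fit on all source data and compare the
#     classifier's assignments against the hand-labeled target subset.
clf.fit(platforms["text"], platforms["topic"])
preds = clf.predict(speeches_labeled["text"])
print("cross-domain accuracy vs. hand labels:",
      accuracy_score(speeches_labeled["hand_label"], preds))
```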
📈 Key Findings
- The classifier can accurately assign topics to parliamentary speeches.
- Accuracy varies substantially by topic, indicating that some topics transfer across domains better than others (see the per-topic sketch after this list).
- Reusing existing labeled data makes this method substantially more efficient than building a new within-domain supervised model, which would require labeling fresh training data.
- Compared with unsupervised topic models, the supervised cross-domain method can be more precisely targeted to a research question and yields topics that are easier to validate and interpret.
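A small sketch of the kind of per-topic breakdown behind the second finding, using hypothetical hand labels and predictions; in practice these vectors would come from the manually coded target subset described above.

```python
# Per-topic agreement between hand labels and classifier assignments
# (hypothetical vectors for illustration).
import pandas as pd

hand = pd.Series(["health", "health", "economy", "defense", "defense", "defense"])
pred = pd.Series(["health", "economy", "economy", "defense", "health", "defense"])

# Share of documents where the classifier matches the hand label, by topic;
# low-agreement topics are the ones that transfer poorly across domains.
per_topic_agreement = (hand == pred).groupby(hand).mean()
print(per_topic_agreement.sort_values())
```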
⚙️ Tools and Applications
- Diagnostic tools are proposed to evaluate when cross-domain classification will perform well and to identify problematic topics (a generic stand-in check is sketched after this list).
- Two case studies illustrate substantive use: how electoral rules and the gender of parliamentarians influence the choice of speech topics.
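To make both bullets concrete, here is a self-contained sketch with toy data: a generic confidence-gap check stands in for the proposed diagnostics (which this summary does not detail), and a simple topic-share comparison stands in for the case studies. The `gender` and `electoral_rule` columns and all example texts are hypothetical.

```python
# Illustrative sketch for this section (toy data; not the paper's actual
# diagnostics or case-study results). Column names are hypothetical.
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Labeled source corpus (party platforms) and target corpus (speeches).
platforms = pd.DataFrame({
    "text": ["Expand public health insurance.", "Fund more rural hospitals.",
             "Cut taxes for small businesses.", "Balance the federal budget."],
    "topic": ["health", "health", "economy", "economy"],
})
speeches = pd.DataFrame({
    "text": ["Hospital waiting lists keep growing.",
             "This budget raises taxes on working families.",
             "Our clinics need more nurses.",
             "Small businesses are struggling under these taxes."],
    "gender": ["female", "male", "female", "male"],        # hypothetical metadata
    "electoral_rule": ["PR", "SMD", "PR", "SMD"],          # hypothetical metadata
})

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(platforms["text"], platforms["topic"])
speeches["predicted_topic"] = clf.predict(speeches["text"])

# Generic diagnostic idea: a large drop in classifier confidence on the target
# corpus relative to the source corpus can flag poor transfer, and low
# confidence concentrated in particular predicted topics can flag problematic
# topics.
source_conf = clf.predict_proba(platforms["text"]).max(axis=1)
target_conf = clf.predict_proba(speeches["text"]).max(axis=1)
print("mean top-class probability, source:", source_conf.mean())
print("mean top-class probability, target:", target_conf.mean())

# Case-study-style application: compare predicted topic shares across groups
# of speakers, e.g. by gender or by electoral rule.
print(pd.crosstab(speeches["gender"], speeches["predicted_topic"], normalize="index"))
print(pd.crosstab(speeches["electoral_rule"], speeches["predicted_topic"], normalize="index"))
```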
💡 Why It Matters
- Enables reuse of labeled resources to extend topic measurement across domains, saving time and improving interpretability.
- Provides a practical workflow and diagnostics for researchers studying political texts across different institutional contexts.