How CNNs Read Handwritten Vote Tallies for Social Science Research

CNNs

Image classification

Handwriting

Vote tallies

Machine learning

"Learning to See: Convolutional Neural Networks for the Analysis of Social Science Data" was authored by Francisco Cantú and Michelle Torres. It was published by Cambridge University Press in Political Analysis in 2022.

🔎 What this paper introduces

Convolutional neural networks (CNNs) are presented as a practical tool for classifying visual information in social science research. The method offers a way to speed up the tedious task of labeling images and extracting structured information from visual documents, making new data sources usable for analysis and policy work.

🛠️ How the technique was applied

The paper demonstrates the method by coding handwritten information from vote tallies: images of tally sheets are processed, and a CNN-based pipeline identifies and classifies the handwritten entries.
It describes how CNNs function and the steps required to move from raw images to machine-readable data: image preprocessing, model training, and post-processing to extract structured records.
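
To make the "functioning" of a CNN more concrete, the following is a minimal sketch of the three operations a convolutional layer typically chains together: convolution, a ReLU nonlinearity, and max pooling. It is written in plain Python for illustration only. The toy image, the hand-picked kernel, and the function names are assumptions for this example and do not come from the paper; a real pipeline would use a deep learning library and learn its kernels from labeled data.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    return the map of weighted sums, as a convolutional layer does."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Downsample by keeping the largest value in each size x size block."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Toy 5x5 "image": a vertical pen stroke in the middle column (assumed data).
image = [[0, 0, 1, 0, 0] for _ in range(5)]

# Hand-picked kernel that responds to a blank-to-ink transition from left
# to right. In a trained CNN these weights are learned, not chosen by hand.
kernel = [
    [-1, 1],
    [-1, 1],
]

features = max_pool(relu(conv2d(image, kernel)))
print(features)
```

The pooled feature map is nonzero only where the left edge of the stroke was detected, which is the kind of local pattern that later layers combine to recognize whole digits.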

📈 Key findings and impact

CNNs can substantially reduce manual labor involved in image classification and data extraction from visual records.
The approach unlocks previously underused visual data (e.g., hand-filled forms, tally sheets) for scholars and policy practitioners.

⚠️ Practical challenges and limitations

High variability in handwriting and image quality complicates classification.
Labeling training data is time-consuming and can be costly.
Models may struggle to generalize across contexts or exhibit biases tied to the training set.
Computational resources and expertise are required for reliable implementation.

✅ Advice for researchers and practitioners

Invest in careful labeling, quality control, and preprocessing to improve accuracy.
Use data augmentation and transfer learning to mitigate small-sample problems.
Validate models across diverse samples and report limitations transparently.
Balance automation gains against residual error and the need for human review in critical applications.
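
Two of these recommendations, data augmentation and human review of uncertain cases, can be sketched in a few lines of plain Python. This is a hedged illustration rather than the authors' implementation: the function names, the one-pixel shift, and the 0.9 confidence threshold are all assumptions chosen for the example, and raw softmax probabilities are only a rough proxy for how reliable a prediction is.

```python
import math


def shift_right(image, pixels=1, fill=0):
    """Augmentation: translate the image to the right by `pixels`,
    padding the left edge with `fill`, to mimic off-center handwriting."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def triage(scores, threshold=0.9):
    """Return (predicted_label, route). Predictions whose top probability
    falls below `threshold` (an assumed cutoff) go to a human coder."""
    probs = softmax(scores)
    label = max(range(len(probs)), key=probs.__getitem__)
    route = "auto" if probs[label] >= threshold else "human_review"
    return label, route


# A confident prediction is accepted; an ambiguous one is flagged.
print(triage([10.0, 0.0, 0.0]))
print(triage([1.0, 0.9, 0.0]))
```

Raising the threshold sends more images to human coders and lowers residual error; lowering it does the opposite, so the cutoff is best chosen against a held-out labeled sample.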

📌 Why it matters

The paper shows that CNNs make visual sources, such as handwritten vote tallies, more accessible for empirical inquiry. This expands the range of data available to political scientists and policy analysts, and the paper also outlines realistic constraints and remedies for applied use.