🔎 What Was Asked and Why
This research investigates when speech on social media is judged toxic enough to warrant platform intervention. Platforms set rules and depend on user reports, yet little is known about what everyday users actually consider unacceptable and which moderation actions they prefer.
🧾 How the Evidence Was Collected
- Two pre-registered randomized experiments tested causal effects on moderation preferences.
- Study 1: N = 5,130; Study 2: N = 3,734.
- The experiments manipulated three conceptually distinct forms of toxic speech (incivility, intolerance, and violent threats) and then measured users' preferences for content moderation (see the sketch after this list).
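
To make the between-subjects design concrete, here is a minimal illustrative sketch in Python of how such an experiment could be simulated and analyzed. The condition names follow the paper's three toxic-speech categories, but the `civil` control condition, the 0-1 outcome scale, and all effect sizes are assumptions chosen only for illustration, not estimates from the studies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical illustration: each respondent is randomly assigned one message
# that is either civil (control) or one of three toxic variants, then reports
# support for content moderation on a 0-1 scale. All numbers below are assumed.
conditions = ["civil", "incivility", "intolerance", "violent_threat"]
baseline = 0.25
assumed_effects = {"civil": 0.00, "incivility": 0.05,
                   "intolerance": 0.15, "violent_threat": 0.30}

n = 5130                                     # sample size of Study 1
assigned = rng.choice(conditions, size=n)    # random assignment to conditions
noise = rng.normal(0, 0.2, size=n)
support = np.clip(
    baseline + np.array([assumed_effects[c] for c in assigned]) + noise,
    0, 1,
)                                            # stated moderation preference

# Average treatment effect of each toxic variant vs. the civil control,
# estimated as a simple difference in means (randomization makes this unbiased).
control_mean = support[assigned == "civil"].mean()
for c in conditions[1:]:
    ate = support[assigned == c].mean() - control_mean
    print(f"{c:15s} ATE vs. civil: {ate:+.3f}")
```

In an actual analysis one would typically add covariates and report confidence intervals, but the difference-in-means above captures the core logic of estimating causal effects from random assignment.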
📊 Key Findings
- Severity of toxicity matters: more severe forms of toxic speech elicit stronger preferences for moderation.
- Target of the attack matters: who is attacked affects moderation preferences.
- Despite these effects, overall demand for content moderation of toxic speech is limited—many users do not endorse strong platform intervention even when exposed to toxic messages.
- The three-category framework (incivility, intolerance, violent threats) helps explain variation in how users respond to different kinds of toxic content.
🔍 Why It Matters
These results have direct implications for platforms, policymakers, and democratic discourse. Understanding that users’ appetite for moderation is constrained—even when presented with toxic content—helps explain the challenges platforms face when relying on user reporting and highlights the political and policy trade-offs involved in designing moderation systems.