Do Judges and Juries See Cases Differently? Rethinking the Severity Gap

A well-documented finding in legal research is that judges tend to render more severe verdicts than juries do when deciding the same cases. The table below compiles the results of three studies comparing the decisions of judges and juries in the same cases (based on Kalven & Zeisel; Eisenberg et al.; Hannaford-Agor et al.).
Table. Comparison of Judge and Jury Decisions
Judge’s Decision | Jury: Not Guilty | Jury: Guilty | Jury: Deadlock | Totals |
Not Guilty | 43.7% | 3.9% | 19.9% | 17.1% |
Guilty | 56.3% | 96.1% | 80.1% | 82.9% |
Counts | 1,278 | 2,640 | 267 | 4,185 |
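The percentages in the table are column percentages: each one gives the judge's verdict as a share of the cases in which the jury reached a given outcome. As a quick arithmetic check, the Totals column should then be the count-weighted average of the three jury columns. The sketch below reproduces it from the figures in the table; the variable and function names are illustrative and not taken from the underlying studies.

```python
# Column percentages (judge's verdict, given the jury's decision) and
# case counts, copied from the table above. Names are illustrative.
counts = {"not_guilty": 1278, "guilty": 2640, "deadlock": 267}
judge_acquits_pct = {"not_guilty": 43.7, "guilty": 3.9, "deadlock": 19.9}
judge_convicts_pct = {"not_guilty": 56.3, "guilty": 96.1, "deadlock": 80.1}


def weighted_total(pct_by_column, case_counts):
    """Count-weighted average of the column percentages."""
    total_cases = sum(case_counts.values())
    weighted = sum(pct_by_column[k] * case_counts[k] for k in case_counts)
    return weighted / total_cases


total_acquit = weighted_total(judge_acquits_pct, counts)
total_convict = weighted_total(judge_convicts_pct, counts)

print(f"Cases in table:         {sum(counts.values())}")
print(f"Judge acquits overall:  {total_acquit:.1f}%")
print(f"Judge convicts overall: {total_convict:.1f}%")
```

The two overall rates round to the 17.1% and 82.9% reported in the Totals column, which confirms that the table conditions on the jury's decision rather than on the judge's.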
But while the pattern itself is relatively clear, the explanation for it remains contested.
One line of reasoning attributes the difference to the personal characteristics and training of judges. Unlike jurors—who typically have no legal experience and are instructed to focus narrowly on the facts of the case—judges are legal professionals. As such, they are believed to make decisions based on detached legal reasoning and to resist emotional appeals. Studies have suggested that judges are less likely than jurors to be influenced by extralegal factors such as the likeability of the defendant, the presence of children in the courtroom, or the emotional testimony of witnesses (e.g., Eisenberg et al.). Jurors, by contrast, are sometimes characterized as more prone to sympathy, compassion, and humanistic reasoning in deciding guilt or punishment.
However, this comparison between judges and juries may rest on a basic category error: judges are individuals, while juries are groups, so the comparison is apples to oranges. Decades of research in social psychology show that groups behave differently from individuals, especially under conditions of uncertainty or when facing emotionally charged decisions.
This raises an underexplored question: Do judges make decisions as if they are a “jury of one”? Or put differently, could the observed leniency of juries be less about the character of jurors and more about the structure of group decision-making?
To investigate this, I conducted a preliminary analysis comparing judge-made decisions to simulated “jury of one” verdicts—that is, imagining a trial where a single juror renders the verdict independently, without the benefits or pressures of group deliberation. The aim was to isolate how much of the severity shift might be explained by the procedural difference (individual vs. group decision) rather than any intrinsic difference in preferences or compassion.
In this analysis, P(g) represents the probability that a randomly selected fact finder would find the defendant guilty, and P(G) is the probability that the jury as a whole returns a guilty verdict. A jury of one simply imposes its sole member’s preferred verdict, so P(G | n=1) = P(g). When there are 12 jurors, the outcome is the result of a consensus-building deliberation process, and P(G | n=12) is a function of P(g), graphed in the Figure below.
Figure. Comparing Verdict Probabilities with 1- and 12-Member Juries

For most of the range of possible P(g) values, a jury of one is more likely to convict than a jury of twelve. This “more severe” region is the product of institutional design, not differences in temperament. At high values of P(g), the jury of one is slightly more lenient.
My findings suggest that procedure matters. Some portion of the severity gap between judges and juries can be accounted for by the dynamics of group deliberation. It is hard to specify how much of the severity shift is explained by institutional design without making assumptions about the distribution of P(g) in criminal cases. However, the procedural difference does not fully explain the shift. Here’s why: there is no value of P(g) in the Figure above where the severity shift is large enough to produce the differences observed in the Table above.
At P(g) = .35, where the largest severity shift is observed, the probability of a jury of one convicting when a jury of twelve acquits is .31. (There is a .04 probability of both convicting, a .57 probability of both acquitting, a .31 probability of one convicting while twelve acquit, and a .08 probability of twelve convicting while one acquits.) In the observed data, judges convict in .56 of the cases where juries acquit. Thus, even under the conditions most favorable to the “jury of one” hypothesis, the observed severity shift cannot be entirely explained by institutional differences.
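The four probabilities quoted above can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions that are inferred from the reported numbers rather than stated in the analysis: that P(G | n=12) is about .12 at P(g) = .35 (the sum of the two cells in which twelve convict, .04 + .08), and that the solo verdict and the twelve-member verdict are treated as independent. Because the table's 56.3% is a share of jury acquittals rather than a share of all cases, the sketch also prints the matching conditional share. All variable names are illustrative.

```python
# Inputs at the point of largest severity shift.
p_g = 0.35     # P(g): from the text
p_G12 = 0.12   # P(G | n=12): inferred from the reported .04 + .08 (assumption)

# Assumption: the jury-of-one verdict and the jury-of-twelve verdict are
# independent, so each joint probability is a simple product.
joint = {
    "both convict": p_g * p_G12,
    "both acquit": (1 - p_g) * (1 - p_G12),
    "one convicts, twelve acquit": p_g * (1 - p_G12),
    "twelve convict, one acquits": (1 - p_g) * p_G12,
}

for outcome, prob in joint.items():
    print(f"{outcome:<30} {prob:.2f}")

# Share of twelve-member acquittals in which the solo juror convicts,
# the quantity comparable to the table's column percentage.
twelve_acquits = joint["both acquit"] + joint["one convicts, twelve acquit"]
solo_convicts_given_twelve_acquit = joint["one convicts, twelve acquit"] / twelve_acquits
print(f"P(one convicts | twelve acquit) = {solo_convicts_given_twelve_acquit:.2f}")
```

Under these assumptions the four products round to .04, .57, .31, and .08, matching the figures in the text, and the conditional share comes to about .35. Both the joint and the conditional values fall well short of the .56 observed for judges.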
Judges still tend to deliver more severe verdicts than even solo jurors acting without group influence. This indicates that both design and disposition likely play a role: judicial training and temperament, on one hand, and group-based moderation and norm-shaping among jurors, on the other. Further research is needed to unpack these preliminary results. For now, the lesson is clear: verdict severity is a function not just of who decides, but also of how they decide.
References
- Eisenberg, T., Hannaford-Agor, P., Hans, V. P., Mott, N., Munsterman, G. T., & Wells, M. T. (2005). Judge-Jury Agreement in Criminal Cases: A Partial Replication of Kalven and Zeisel’s The American Jury. Journal of Empirical Legal Studies, 2(1), 171–206.
- Hannaford-Agor, P., et al. (2002). Are Hung Juries a Problem? National Center for State Courts.
- Kalven, H., & Zeisel, H. (1966). The American Jury. University of Chicago Press.