{"id":216,"date":"2025-07-09T23:30:31","date_gmt":"2025-07-09T23:30:31","guid":{"rendered":"https:\/\/www.poliscidata.com\/blog\/?p=216"},"modified":"2025-07-09T23:33:11","modified_gmt":"2025-07-09T23:33:11","slug":"modeling-the-nfl-betting-market","status":"publish","type":"post","link":"https:\/\/www.poliscidata.com\/blog\/modeling-the-nfl-betting-market\/","title":{"rendered":"Modeling the NFL Betting Market"},"content":{"rendered":"\n<p>Looking ahead to the 2025\/26 NFL season, I analyzed DraftKings betting lines to understand how point spreads and over\/under totals reflect expectations about team performance. Using data from each week&#8217;s published lines, I estimated statistical models to recover implied team strength and scoring profiles directly from the market itself.<\/p>\n\n\n\n<p>To create the dataset for this analysis, I compiled weekly NFL betting lines for the 2025\u201326 season. For each game, I extracted the point spread, over\/under total, and team identities from the JSON data behind the sportsbook\u2019s odds tables. I processed this data in R, retaining one record per team per game and pairing home and away teams to calculate spreads and totals. 
By standardizing team names and using a single, consistent source (DraftKings lines), I assembled a clean, structured dataset that allowed me to estimate team ratings and scoring tendencies directly from the market\u2019s expectations.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Modeling Point Spreads<\/strong><\/h4>\n\n\n\n<p>We began by modeling the point spread between two teams as: <\/p>\n\n\n\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"block\">\n  <mrow>\n    <mi>Point Spread<\/mi>\n    <mo>=<\/mo>\n    <msub><mi>R<\/mi><mi>home<\/mi><\/msub>\n    <mo>&#x2212;<\/mo> <!-- minus -->\n    <msub><mi>R<\/mi><mi>road<\/mi><\/msub>\n    <mo>+<\/mo>\n    <mi>HFA<\/mi>\n    <mo>+<\/mo>\n    <mi>&#x03B5;<\/mi> <!-- epsilon -->\n  <\/mrow>\n<\/math>\n\n\n\n<p>Where<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Point Spread is the number of points the home team is favored by (negative if the home team is the underdog),<\/li>\n\n\n\n<li>R<sub>home<\/sub> and R<sub>road<\/sub> are latent ratings for each team,<\/li>\n\n\n\n<li>HFA is the home-field advantage (estimated to be around 1.5 points),<\/li>\n\n\n\n<li>\u03b5 is an error term.<\/li>\n<\/ul>\n\n\n\n<p>By regressing observed spreads on team indicators (while imposing a constraint that the average team rating is 0), we recovered each team&#8217;s market-implied rating. These ratings can be interpreted as the number of points by which a team is expected to be favored over an average opponent on a neutral field. In any game, the point spread in favor of the home team is equal to the home team&#8217;s rating, minus the road team&#8217;s rating, plus 1.5 points. 
The model offers a near-perfect explanation for point spreads (R<sup>2<\/sup> = 0.98).<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Modeling Over\/Under Totals<\/strong><\/h4>\n\n\n\n<p>Next, we modeled the over\/under line (expected total points scored in a game) using: <\/p>\n\n\n\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"block\">\n  <mrow>\n    \n    <msub><mi>OU<\/mi><mrow><mi>i<\/mi><mi>j<\/mi><\/mrow><\/msub>\n    <mo>=<\/mo>\n    <msub><mi>T<\/mi><mi>i<\/mi><\/msub>\n    <mo>+<\/mo>\n    <msub><mi>T<\/mi><mi>j<\/mi><\/msub>\n    <mo>+<\/mo>\n    <mi>&#x03BC;<\/mi> <!-- mu -->\n    <mo>+<\/mo>\n    <mi>&#x03B5;<\/mi> <!-- epsilon -->\n  <\/mrow>\n<\/math>\n\n\n\n<p>Where<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OU<sub>ij<\/sub> is the over\/under line in a game with teams i and j,<\/li>\n\n\n\n<li>T<sub>i<\/sub> and T<sub>j<\/sub> are team-specific effects on the expected point total,<\/li>\n\n\n\n<li>\u03bc is the league-wide average expected points (estimated to be 45.7),<\/li>\n\n\n\n<li>\u03b5 is a residual error term.<\/li>\n<\/ul>\n\n\n\n<p>This symmetric model assumes each team contributes additively to the total expected points, with no home-field term needed, and it fits the market lines remarkably well (R\u00b2 &gt; 0.99). <\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Visualizing Team-Level Effects<\/strong><\/h4>\n\n\n\n<p>We visualized these team effects using a one-dimensional plot with team logos scaled and positioned according to their estimated ratings or scoring impact. 
These visuals make it easy to grasp, at a glance, which teams the market considers strongest and which tend to produce high- or low-scoring games.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"591\" src=\"https:\/\/www.poliscidata.com\/blog\/wp-content\/uploads\/2025\/07\/NFL-Team-Forecast-1024x591.png\" alt=\"\" class=\"wp-image-219\" srcset=\"https:\/\/www.poliscidata.com\/blog\/wp-content\/uploads\/2025\/07\/NFL-Team-Forecast-1024x591.png 1024w, https:\/\/www.poliscidata.com\/blog\/wp-content\/uploads\/2025\/07\/NFL-Team-Forecast-300x173.png 300w, https:\/\/www.poliscidata.com\/blog\/wp-content\/uploads\/2025\/07\/NFL-Team-Forecast-768x443.png 768w, https:\/\/www.poliscidata.com\/blog\/wp-content\/uploads\/2025\/07\/NFL-Team-Forecast-1536x886.png 1536w, https:\/\/www.poliscidata.com\/blog\/wp-content\/uploads\/2025\/07\/NFL-Team-Forecast.png 1950w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>To illustrate, consider a game between the Detroit Lions (R = 3.7, T = 2.2) and the Chicago Bears (R = 0.2, T = 0.5). If Detroit is the home team, the Lions will be favored by 3.7 &#8722; 0.2 + 1.5 = 5 points. If Chicago is the home team, the Bears&#8217; spread is 0.2 &#8722; 3.7 + 1.5 = &#8722;2, making them 2-point underdogs. At either stadium, the over\/under line is 2.2 + 0.5 + 45.7 \u2248 48.5 points. In this manner, one may obtain point spreads and over\/under lines for the entire NFL season.<\/p>\n\n\n\n<p>The most striking feature of the betting line models uncovered in this analysis is their simplicity. The point spread and over\/under values, months before the season begins, are well described by simple additive models based on the two teams involved. None of the factors that commentators and fans talk about (winter weather, rest after bye weeks, night versus day games, travel distances, division rivalries, etc.) appear to play any appreciable role in the betting lines. 
<\/p>\n\n\n\n<p>As the season unfolds and the markets obtain more information about teams, I would expect the basic models to remain the same, with adjustments to the team-specific factors. For example, if Detroit loses a key player from its offense, its R should be adjusted downward, and possibly its T as well, since the team may be expected to score fewer points.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Parity and Prediction<\/strong><\/h4>\n\n\n\n<p>A key insight involves translating point spreads into win probabilities. If a team is favored to win by 5 points, what is the probability that it will win? The win probability is a function of the point spread, taking into account the underlying variability of NFL margins of victory. <\/p>\n\n\n\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"block\">\n  <mrow>\n    <mi>P<\/mi>\n    <mo>(<\/mo>\n    <mi>Win<\/mi>\n    <mo>)<\/mo>\n    <mo>=<\/mo>\n    <mi>&#x03A6;<\/mi> <!-- capital Phi -->\n    <mo>&#x2061;<\/mo> <!-- function application -->\n    <mo>(<\/mo>\n    <mfrac>\n      <mi>Point Spread<\/mi>\n      <mi>&#x03C3;<\/mi> <!-- sigma -->\n    <\/mfrac>\n    <mo>)<\/mo>\n  <\/mrow>\n<\/math>\n\n\n\n<p>Where<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>P(Win) is the probability of the home team winning,<\/li>\n\n\n\n<li>Point Spread is the number of points the home team is favored by (negative if the home team is the underdog),<\/li>\n\n\n\n<li>\u03c3 is the observed standard deviation of margins of victory (\u2248 13.45),<\/li>\n\n\n\n<li>\u03a6 is the cumulative distribution function of the standard normal distribution.<\/li>\n<\/ul>\n\n\n\n<p>In practice, the relationship between spread and win percentage is nearly linear for point spreads between \u221214 and +14. Referring to the figure, one can see that few NFL games have extreme point spreads. 
When the Jets visit the Bills in Week 18, the Bills are favored to win by 5.2 &#8722; (&#8722;4.7) + 1.5 \u2248 11.5 points, an extreme point spread by NFL standards. A widely used linear approximation that follows from the formula above is: <\/p>\n\n\n\n<math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"block\">\n  <mrow>\n    <mi>Win<\/mi>\n    <mo>%<\/mo>\n    <mo>&#x2248;<\/mo> <!-- approximately equal -->\n    <mn>50<\/mn>\n    <mo>+<\/mo>\n    <mn>3<\/mn>\n    <mo>&#x00D7;<\/mo> <!-- multiplication sign -->\n    <mi>Spread<\/mi>\n  <\/mrow>\n<\/math>\n\n\n\n<p>This implies that a team favored by 1 point has roughly a 53% chance to win, a heuristic that aligns closely with historical outcomes. When the Bills are favored by 11.5 points, possibly the most extreme case, the approximation puts their win probability at about 84.5%. In a more typical situation, with the Bears as 2-point home underdogs, the win probability is about 50 + 3 &#215; (&#8722;2) = 44%. Team R and T values represent tendencies in a game characterized by random forces. <\/p>\n\n\n\n<p>It is also interesting to note that there is more variability in R values than in T values, and a weak positive correlation between the two. The teams expected to have &#8220;high-scoring games&#8221; have T values around 2, and the team expected to have the lowest-scoring games, Cleveland, has T = &#8722;3. You can see some difference between teams expected to succeed through strong defense (e.g. Philadelphia and Kansas City) and those expected to succeed with more scoring (e.g. Baltimore and Buffalo), but the expected difference is less than a field goal. <\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>Further Reading<\/strong><\/h4>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Stern, H. (1991). <em>On the Probability of Winning a Football Game<\/em>. The American Statistician, 45(2), 116\u2013123. <a>JSTOR Link<\/a><\/li>\n\n\n\n<li>Levitt, S. D. (2004). 
<em>Why Are Gambling Markets Organised So Differently from Financial Markets?<\/em> The Economic Journal, 114(495), 223\u2013246. <a>Wiley Link<\/a><\/li>\n\n\n\n<li>Harville, D. A. (1980). <em>Predictions for National Football League Games via Linear-Model Methodology<\/em>. Journal of the American Statistical Association, 75(371), 516\u2013524.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>By reverse-engineering betting lines, we can see how sportsbooks distill expectations into numbers \u2014 and how much information is already priced into the market. These results suggest that a large portion of what oddsmakers do can be uncovered with simple, elegant statistical tools.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Looking ahead to the 2025\/26 NFL season, I analyzed DraftKings betting lines to understand how point spreads and over\/under totals reflect expectations about team performance. Using data from each week&#8217;s [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":223,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-216","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/posts\/216","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/comments?post=216"}],"version-history":[{"count":6,"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/posts\/216\/revisions"}],"predecessor-version":
[{"id":225,"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/posts\/216\/revisions\/225"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/media\/223"}],"wp:attachment":[{"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/media?parent=216"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/categories?post=216"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.poliscidata.com\/blog\/wp-json\/wp\/v2\/tags?post=216"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}