Sabermetrics, football, and possessions as a constraint

Possessions are like innings

I know this is ostensibly a fantasy football newsletter, but I have some real football takes now and then, and this seems like the best place to publish them. Maybe the only thing about social media I actually enjoy is the live commentary during football games. That encompasses a lot of elements, perhaps most notably how every decision coaches and teams make can be analyzed in real time.

The commentary about analytical models can at times be almost exhausting in its predictability, and there are many who are tired of hearing whether teams should be punting, kicking field goals, running on first-and-10, etc. It would probably be fair to suggest things have jumped the shark a bit, and after every vaguely controversial decision there are now those who have taken to disagreeing with the models, a natural next step in the evolution of this widely popular commentary.

I’ve certainly been on both sides here — I think it was in Stealing Signals this year that I wrote about how I thought Mike Vrabel’s decision to give up a first down with an intentional encroachment because of the clock and his timeouts wasn’t anything special, largely because I’d seen the move dozens of times in flag football leagues when I was younger. If a bunch of weekend warriors can easily come to the conclusion that instead of trying to stop a second-and-short three times, you should reset the downs because you only have enough time to get one stop, I’m not going to crown an NFL head coach for that pretty basic math.

Anyway, this post is essentially about the messaging of these discussions. It’s not particularly novel and may have been written about elsewhere. We’ve had a couple of key moments in these playoffs that have been easy for Twitter to universally pan, but if coaches continue to make the “wrong” decisions, there’s obviously a subset of individuals who struggle with how to think things through (including, apparently, those head coaches).

Wild Card weekend brought us Vrabel furthering my opinion that he’s not a secret genius of situational football, as he punted on a fourth-and-2 from the Baltimore 40-yard line with 10:06 remaining in a game he was trailing 17-13. The Ravens would drive down and kick a field goal, and Tennessee would only get the ball one more time. They were unable to even get back to the Baltimore 40 — that punt gave away what became their last best scoring chance in a one-possession game.

Another bad punt came from Mike Tomlin and the Steelers that same weekend. Down 28-0 after the first quarter and 35-10 by halftime, the Steelers had a huge deficit to overcome, but they did have some time to play with, owing to how quickly they dug their early-game hole. That seemed to lull them into a false sense of security, though.

Pittsburgh clawed to 35-23 by the end of the third quarter, and had earned the ball back facing a fourth-and-6 from their own 41-yard line. This one was perhaps a little less cut-and-dried because they were inside their own territory and had a full quarter to play, but ultimately they still trailed by 12 points to a team they were having trouble stopping (and the Browns were definitely not sitting back; they were still attacking). The Steelers punted on the first play of the fourth quarter for a touchback, then Cleveland went the length of the field in six plays for a touchdown that essentially put the game out of reach with 12:32 remaining.

Both of these decisions were widely criticized in real time, including by yours truly. And it would be true to point out that the opposition getting a score on the ensuing possession made them look like even worse decisions in retrospect, where a quick stop may have afforded a more favorable hindsight analysis. But even if they had gotten those stops, both of these coaches made poor decisions largely because of the number of remaining possessions they could expect their teams to get.

A baseball analogy because sabermetrics

The turn to baseball in a football discussion is typically where a post goes downhill. But there’s something interesting in the sabermetrics revolution, particularly in one of the key (now two-decade-old) “market inefficiencies” that the Oakland A’s identified and that Moneyball wrote about.

The early success of those A’s was built in large part on the prevention of outs. Instead of looking at batting average, the A’s cared a lot about walk rates, identifying that on-base percentage was a far better measure of a batter’s success or failure. There’s probably some quote I could pull that talks about how there are nine innings and 27 outs in a game, and that’s a very specific constraint within which you have to score your runs. As I recall, the A’s were also heavily against sacrifice bunting and thought trying to steal bases was frequently not worth the risk of being thrown out. So much of it was about maximizing the scoring potential of each of the 27 outs they were given throughout a game, and not voluntarily settling for anything.

There is a pretty direct analogy in football to innings and outs: possessions and downs. In baseball, a sacrifice bunt trades away one out in an inning, but as I recall the math, in the right situation it can increase the chance of scoring exactly one run in that inning. The issue the A’s took, as I recall Moneyball telling it (I read this book probably 15 years ago), is that increasing the chance of exactly one run wasn’t worthwhile, because sacrificing an out decreased the chance of a big inning with, say, three or more runs. And since you never know how many runs you might need in a game (your opposition can always score the next time they are up to bat), playing for a single run is dubious. I think there’s some acknowledgement that in specific late-game situations when things are tied, you would want to maximize your potential to score just one more run. But the idea is that in nearly all other situations, it’s a poor decision.

Anyway, let’s talk possessions. I initially heard about thinking of football games this way a few years ago in the betting context — I’m no professional bettor, but it’s my understanding many look at expected possessions in a game when they are weighing game totals and whether to bet overs or unders. In that type of bet, it doesn’t matter who wins, it’s just a calculation of the potential success rate of each drive and then how many expected drives each team will get.

I’ve referenced this some over the past few years, but I think it’s still a misunderstood number. Per Pro Football Reference, across the 2020 season, the average NFL team got 10.9 drives per game. That number was down a bit because offenses were more successful, and longer drives mean fewer of them overall. In 2019, it was 11.2 per game.

I’m not sure what number of possessions I would have guessed teams got in a football game before starting to look at things this way, because football games are very rarely discussed in these terms. I do remember thinking the number seemed low, and in the time since, I’ve come to believe possessions are the biggest constraint NFL teams face.

While the number of possessions a team might get is not as rigid as nine innings or 27 outs in baseball, football teams can be pretty sure they will only get around 11 chances to score points in every game, setting aside potential defensive or special teams touchdowns that we know to be somewhat fluky and not something to count on. Leaguewide, teams score on roughly 35%-40% of offensive possessions, counting both touchdowns and field goals. The Packers led the league scoring on 49.7% of offensive possessions and the Jets were last at 26.3%.

So we can expect NFL teams will score on somewhere between a quarter and a half of all possessions. Of course, those numbers will vary in one-game scenarios. The league-average team scored 24.8 points per game in 2020, with the Packers leading at 31.8 and the Jets at the bottom at 15.2. For a league-average team, flipping one or two possessions in any given game from scoring to not scoring (or vice versa) can mean the difference between breaking 30 points or failing to hit 20.

The point I’m trying to drive home here is that every possession is extremely important. If there are roughly 11 per game, you’re throwing away about 9% of your total scoring chances in a game every time you don’t score. And because most key decisions take place in the later stages of the game, when more about the score and situation is known, it might even be better to look at this as roughly 5.5 possessions per half, or two to three per quarter. Teams do tend to run more possessions late in games, because clock stoppages (including timeouts) are more likely, but any decision to voluntarily end one of your team’s own possessions in the fourth quarter has to be made considering the potential remaining possessions your team will get. And a key note when you, say, punt in the fourth quarter, is that the very next possession will be your opponent’s, meaning you are essentially hoping to have as many possessions as the opposition, but won’t have more. Depending where you are in the fourth quarter, you have to make a decision to punt with this mindset: “We’re going to get at least one more shot, maybe two.”
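For anyone who wants to poke at that arithmetic, here is a minimal Python sketch. The function name is made up for illustration, the 11-drive default is just the rough per-game figure cited above, and it assumes drives are spread evenly across the game, which we know isn’t quite true late in games.

```python
def possession_share(drives_per_game: float = 11.0) -> dict:
    """Rough per-possession arithmetic.

    Assumes drives are spread evenly across the game, which is a
    simplification (teams tend to run more possessions late).
    """
    return {
        # Fraction of a team's scoring chances that one drive represents
        "share_per_drive": 1.0 / drives_per_game,
        "drives_per_half": drives_per_game / 2,
        "drives_per_quarter": drives_per_game / 4,
    }


shares = possession_share()
print(f"One drive is about {shares['share_per_drive']:.1%} of a team's scoring chances")
print(f"Roughly {shares['drives_per_half']:.1f} drives per half, "
      f"{shares['drives_per_quarter']:.2f} per quarter")
```

Swapping in 10.9 or 11.2 (the 2020 and 2019 averages) barely moves any of these numbers, which is kind of the point: the constraint is tight no matter how you slice it.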

How bad were those Wild Card decisions?

If you’ve been following the discussions about these decisions, you’ve undoubtedly heard lines like “a punt is a turnover.” Thinking of possessions as the major constraint fits basically all of these types of comments — we know turnovers to be bad because they cut off a potential scoring opportunity, and while turnovers also frequently give the opposition good field position, most of these analytics discussions are essentially suggesting the difference in field position between a turnover and a punt is the overvalued element and the loss of a possession is the more important point.

The same line of thinking applies to teams settling for field goals — the sacrifice bunt of football — and even the decisions to run too frequently on early downs, which are a major negative to the goal of maximizing your potential to gain a first down. A bad situation-neutral early-down pass rate is akin to a team ignoring the value of on-base percentage — it’s a decision that increases the likelihood you’ll give up a down (or out) with an unsuccessful play (or at bat). It’s more difficult to score a run with nobody on base and one out, just as it’s more difficult to advance the football from second-and-9.

Again, I’m writing this mostly because it is, at least to me, a really easy way to explain why the decisions are bad in real time. So much of the whole “analytics as boogeyman” trope in football right now comes down to how these ideas are communicated.

In the case of the Titans, it’s fairly easy to discuss. Tennessee had about 10 minutes remaining, roughly one-sixth of the game, when they made the decision to punt in Baltimore territory. That means they almost certainly had no more than two additional possessions, and they were also handing the ball to an opponent that suddenly had the potential to get the basketball equivalent of a two-for-one down the stretch (how many other sports I’m going to reference here is anyone’s guess).

That idea of a two-for-one brings in another major analytics discussion that I think sometimes goes underdiscussed because possessions aren’t viewed as the major constraint. Frequently, you’ll see sharp modelers focus on how teams can maximize their potential for the last possession of the first half and (obviously) the game. If we understand that possessions are the major constraint, it’s easy to understand the potential to run more of them than your opponent as a major win. Playing the end of half or game scenarios correctly to have the ball last is kind of like getting an extra inning to hit that your opponent doesn’t get. But of course that doesn’t really apply if you’re not actually trying to score at the end of the half — running out the clock is just wasting that extra scoring opportunity. Maximizing every edge you can get in terms of quantity and quality of possession is the key.

As we found out, the Titans got just one more possession after their decision to punt, which should have been considered a decent possibility, especially when you start layering in context like Baltimore being one of the top running teams in the NFL and Tennessee’s defense not being strong. And because the Titans couldn’t be sure Baltimore wouldn’t score again — they very well could have driven down and scored a touchdown to push it to an 11-point game — they needed to maximize the possession they had. If they were only going to get one more possession, and that possession had something close to a league average 35%-40% success rate, and they couldn’t be sure of preventing their opponent from scoring on their opportunities, giving it a go on fourth-and-2 at their opposition’s 40-yard line was an absolute no-brainer. The math of the remaining potential possessions is just way too thin otherwise.

The Steelers situation happened earlier, and really their possession math started as early as the 28-0 deficit at the end of the first quarter or the 35-10 deficit at halftime. If you’re down four scores, that doesn’t mean you need at least four remaining possessions; it means you need to outscore your opponent by at least four possessions in the game’s remaining opportunities. Sure, you can play dumb and hope that you will prevent your opposition from scoring a single point for the rest of the game, but that’s a tall ask. Even if you think a team that has scored a ton of points on you to that point will have a lower-than-league-average chance of scoring on all remaining possessions, it’s unlikely they wouldn’t score at least once.

I’ll put some numbers to that. If you think you can hold them to a 25% chance of scoring on each remaining possession (basically making them the Jets), they would still score at least once across four possessions 68% of the time. Again, that’s assuming their offense is very bad, and it’s also only saying there are four possessions left, which would also require you to score on all four of your possessions, and not just field goals but touchdowns each time, to bring the game back to level. You succeeding in scoring a touchdown on four straight drives is a much less likely outcome, and the chance of getting both outcomes together is minuscule. Obviously when you’re down a ton, your odds are small regardless, but the idea is you try to maximize them.
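The 68% figure is just one minus the chance of four straight stops. Here is a rough sketch of it in Python, treating every possession as independent with a fixed scoring rate (a simplification). The function names are my own, and the 40% touchdown rate at the end is purely a hypothetical, generous input, not a real team’s number.

```python
def p_at_least_one_score(score_rate: float, possessions: int) -> float:
    """Chance an offense scores on at least one of several independent drives."""
    return 1.0 - (1.0 - score_rate) ** possessions


def p_score_every_time(td_rate: float, possessions: int) -> float:
    """Chance an offense scores a touchdown on every one of its drives."""
    return td_rate ** possessions


# The opponent scores on 25% of drives and gets four of them
opp_scores = p_at_least_one_score(0.25, 4)
print(f"Opponent scores at least once: {opp_scores:.0%}")  # about 68%

# Hypothetical: the trailing team scores a TD on 40% of drives (assumed rate)
four_tds = p_score_every_time(0.40, 4)
both = four_tds * (1.0 - opp_scores)
print(f"Four straight TDs: {four_tds:.1%}; four TDs plus a shutout: {both:.1%}")
```

Even with that generous touchdown rate, getting both halves of the parlay comes out under 1%, which is what “minuscule” looks like in practice.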

The Steelers did do a good job of speeding up the game by throwing a ton, starting early in the game. Passing increases potential possessions because incompletions stop the clock and bigger plays are more likely, which can lead to shorter possessions. The Browns also stayed aggressive, unwilling to just sit on their lead (smartly, in my opinion), and by the end of this game Pittsburgh had run 14 possessions to Cleveland’s 13. That in itself was a win for the Steelers.

But despite a very high success rate down the stretch for Pittsburgh’s offense, they still lost by 11 points. Here’s their drive summary from the PFR box score.

And for reference, here’s Cleveland’s.

At the point the Steelers punted, they had held Cleveland to three straight punts to start the second half and had scored two touchdowns of their own. The math from that point on was very precarious — Pittsburgh seemed to believe the Browns wouldn’t score again the rest of the game, which was a poor bet, but also that they would score two more touchdowns themselves on what was likely to be, at most, three more possessions from the time of their punt.

Ultimately, the Steelers did score two more touchdowns from that point, part of which could have been a result of Cleveland extending their lead and playing softer defense, particularly as it relates to the eight-play, 77-yard touchdown drive in the final three minutes when the game was back to an 18-point margin. But that brings in another point — if you’re expecting your offense to play at an extremely high rate regardless, then you should be more willing to trust them to convert a fourth down.

Giving away a possession limited the remaining outcomes to where Pittsburgh needed a higher rate of stops and a higher rate of their own scores on the game’s remaining opportunities. Thinking in terms of possessions as the major constraint makes the fourth-down “go” decision a lot clearer in this situation as well. Thinking in terms of time remaining can skew things — regardless of how the clock shook out, Pittsburgh had almost no shot to get four additional drives, and while they technically only needed two, it’s a poor plan to hope for touchdowns on each of your possessions and zero points on each of your opponents’. There weren’t enough expected possessions left at that point of the game to justify voluntarily giving one up.
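To see how much one fewer possession hurts, here is a small Python sketch using a simple binomial model. Everything in it is an assumption for illustration: drives are treated as independent, the 50% touchdown rate and 25% opponent scoring rate are hypothetical inputs, and the function names are mine. It also ignores that going for it on fourth down doesn’t guarantee you keep the drive; it only shows how sensitive the comeback math is to the number of drives.

```python
from math import comb


def p_at_least_k(k: int, n: int, rate: float) -> float:
    """P(at least k successes in n independent drives), binomial model."""
    return sum(
        comb(n, i) * rate**i * (1 - rate) ** (n - i)
        for i in range(k, n + 1)
    )


def comeback_chance(tds_needed: int, own_drives: int, td_rate: float,
                    opp_drives: int, opp_score_rate: float) -> float:
    """Chance of getting the needed TDs while the opponent is shut out."""
    enough_tds = p_at_least_k(tds_needed, own_drives, td_rate)
    shutout = (1 - opp_score_rate) ** opp_drives
    return enough_tds * shutout


# Hypothetical inputs: need two TDs, opponent gets two more drives
with_three = comeback_chance(2, own_drives=3, td_rate=0.5,
                             opp_drives=2, opp_score_rate=0.25)
with_two = comeback_chance(2, own_drives=2, td_rate=0.5,
                           opp_drives=2, opp_score_rate=0.25)
print(f"Three drives left: {with_three:.1%}")
print(f"Two drives left:   {with_two:.1%}")
```

Under these made-up rates, dropping from three drives to two cuts the comeback chance in half, because with only two drives you need a touchdown on every single one.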

As I said earlier, I’m sure this line of thinking has been written about elsewhere. There are so many smart football people tackling these discussions, and I’m referring back to Moneyball, which is very old and has been written about to death, so I can’t imagine this is a new line of thinking. Apologies to anyone whose work has already covered this ground.

But I do think this is a very important consideration for the discussion. It translates beyond the punt decisions I’ve used as examples here, to questions like whether trailing teams should be increasing their number of potential possessions by throwing a ton (like the Steelers did), or whether kicking a field goal or going for a touchdown from in close is the best decision. It’s an easy shorthand, and I think if more of the discussion centered on potential remaining possessions, most of the decisions the analytics community supports or criticizes would be more easily understood by casual fans.