The NFL combine workouts start Friday in Indianapolis, and football fans across the nation will dissect every result. We can gather that being bigger, stronger, faster, quicker, and having jumping ability are desirable traits. But what do they mean for NFL success?
The concept that faster players are better is easy to grasp, but how do we evaluate the totality of drills against each other? Is a fast and less agile running back better than the opposite? Does the broad jump matter more for taller players? And how do all the drills work together to forecast NFL success?
I had been having trouble answering those questions for myself, and adding multiple measurables to linear or logistic regressions can muddle the results due to multicollinearity. I decided to build a different type of model, a regression tree, to look only at NFL combine results and categorize results by NFL success.
In this post, we’re going to look at running backs, and I’m defining success for a prospect as having at least one top-12, or RB1, season (PPR scoring) in his first three years.
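To make that definition concrete, here is a minimal sketch of the success label. The function name and the input format (a list of a back's season-ending PPR ranks at his position, in career order) are assumptions made for illustration, not the actual data layout behind the tree.

```python
def had_rb1_season(ppr_finishes, cutoff=12, seasons=3):
    """Return True if any of the first `seasons` finishes is in the top `cutoff`.

    `ppr_finishes` is a hypothetical list of positional PPR ranks, one per
    season in career order. None marks a season with no ranking.
    """
    early = ppr_finishes[:seasons]
    return any(rank is not None and rank <= cutoff for rank in early)


# A back who finished 30th, 8th and 15th counts as a success.
early_hit = had_rb1_season([30, 8, 15])
# A top-12 finish in year four is too late under this definition.
late_hit = had_rb1_season([20, 14, 13, 2])
```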
Here’s how to read the regression tree nodes. The “yval”, or predicted value, in this case is the likelihood of success (from 0 to 1). The darker the node, the higher the yval.
I fed the model the following NFL combine measurements for running back prospects from 2000 to 2013: height, weight, 40-yard dash, short shuttle, three cone, vertical, broad, and bench. The regression tree does the work of figuring out which variables are most important, and how we can classify different athletic profiles by their chance of NFL success.
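If you’re curious how a tree decides which variable matters most, a common approach is to try every possible cut on every measurable and keep the one that leaves the two resulting groups with the smallest total squared error. The sketch below shows that idea for a single split. The rows are made up for illustration and are not real combine data, and I’m assuming the squared-error criterion here rather than documenting the exact software settings behind the tree in this post.

```python
# Toy illustration only: these rows are invented, not real combine data.
# Each row is (forty_time, three_cone, success), where success is 1 or 0.
ROWS = [
    (4.38, 6.90, 1),
    (4.42, 7.05, 1),
    (4.45, 6.95, 0),
    (4.47, 7.10, 1),
    (4.52, 6.75, 1),
    (4.55, 7.00, 0),
    (4.58, 7.15, 0),
    (4.61, 7.20, 0),
    (4.65, 7.05, 0),
    (4.70, 7.30, 0),
]
FEATURES = {"forty": 0, "three_cone": 1}  # column index of each measurable


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, features):
    """Return (feature, threshold, total_sse) for the lowest-error split."""
    best = None
    for name, col in features.items():
        xs = sorted({row[col] for row in rows})
        # Candidate thresholds sit halfway between adjacent observed values.
        for lo, hi in zip(xs, xs[1:]):
            threshold = (lo + hi) / 2
            left = [r[-1] for r in rows if r[col] < threshold]
            right = [r[-1] for r in rows if r[col] >= threshold]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (name, threshold, total)
    return best


feature, threshold, error = best_split(ROWS, FEATURES)
print(feature, round(threshold, 3), round(error, 3))
```

A full tree repeats the same search inside each resulting group until the groups get too small or the splits stop helping.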
Right off the top we see that 40-yard dash is the first and most important measurable. Those with 40-yard dash times under 4.5 have a 0.27 success rate; those at or above are only at 0.07. But there is hope for slower running backs, as long as their three cone time is below 6.8.
The success rate for slower backs with sub-6.8 three cone times is equal to that of the sub-4.5 backs as a whole. But only 11 of the 227 observations with a 4.5 or slower 40-yard dash had a sub-6.8 three cone time. Doug Martin and Le’Veon Bell are two running backs who would fall into this node, boosted by extraordinary quickness for their size.
Next time you see an analysis that player X can be a productive NFL running back because they’re at least as athletic as Arian Foster, take it with a healthy dose of salt. While it’s possible for slower, less agile backs to become NFL studs, those with Foster’s athletic profile (the node furthest left) only project to find early NFL success roughly six percent of the time.
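The six percent figure can be sanity-checked from the numbers quoted earlier. This is a back-of-the-envelope sketch that assumes the leftmost node is the group of slower backs without a sub-6.8 three cone. The inputs are the rounded values from this post, so the result is approximate.

```python
# Figures quoted in the post (rounded, so the result is approximate).
slow_total = 227     # backs with a 4.5 or slower 40-yard dash
slow_rate = 0.07     # success rate for that whole group
quick_count = 11     # slower backs with a sub-6.8 three cone
quick_rate = 0.27    # their success rate, equal to the sub-4.5 group

# Expected successes in each group, then what is left for everyone else.
slow_successes = slow_total * slow_rate
quick_successes = quick_count * quick_rate
rest_count = slow_total - quick_count
rest_rate = (slow_successes - quick_successes) / rest_count

print(f"{rest_count} backs, implied success rate {rest_rate:.3f}")
# prints "216 backs, implied success rate 0.060"
```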
The first split on the sub-4.5 forty side is for height at 70 inches, or 5 feet 10 inches tall. It’s interesting that taller backs would be that much more successful (0.43 to 0.094), and it’s also noteworthy that weight was not one of the nodes. Remember, draft position is not part of the analysis, so some of the drills could be influencing draft position more than they are actually useful on the field.
Next is broad jump, with running backs over 124 inches (plus sub-4.5 40s and taller than 70 inches) having the best success score, 0.78. Successful running backs like Adrian Peterson, DeMarco Murray, Joseph Addai, Darren McFadden, Latavius Murray and Marshawn Lynch would qualify for this node.
For those with broad jumps less than 124 inches, we return to the 40-yard dash to differentiate. Those with an even faster time below 4.4 have a 0.57 success score, versus 0.21 for those between 4.4 and 4.49. Jamaal Charles would fall into this category of taller, ultra-fast running backs.
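Putting the splits together, the whole tree can be written as a handful of nested checks. This is a sketch built from the node values quoted in this post, not the fitted model itself. How a back lands when he sits exactly on a cutoff (70 inches tall, a 124-inch broad jump) is my assumption, and the function and argument names are hypothetical.

```python
def success_score(forty, height, broad, three_cone):
    """Look up the node value for a running back's combine numbers.

    forty and three_cone are in seconds, height and broad in inches.
    Values are the ones quoted in the text; ties on a cutoff are resolved
    by assumption (70 inches counts as tall, 124 inches as the lower group).
    """
    if forty >= 4.5:
        # Slower backs: only the three cone separates them.
        # 0.06 is the "roughly six percent" quoted for the leftmost node.
        return 0.27 if three_cone < 6.8 else 0.06
    if height < 70:
        return 0.094
    if broad > 124:
        return 0.78
    # Tall, sub-4.5 backs with shorter broad jumps: the 40 decides again.
    return 0.57 if forty < 4.4 else 0.21


# Hypothetical prospect: 4.40 forty, 71 inches, 127-inch broad, 7.00 three cone.
example = success_score(forty=4.40, height=71, broad=127, three_cone=7.00)
```

Writing it this way makes it easy to see that weight, vertical, short shuttle and bench never enter the decision, and that the three cone only matters for the slower backs.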
These regression trees are not meant to be gospel; instead, they give you an idea of how the different combine drills work together. Also notable is that you can identify groups of running backs with success rates of 0.50 to 0.70 through combine measurables alone, without incorporating production or draft position. I’ll follow up soon with a tree for wide receivers, which takes a very different path to NFL success.