
Forever Young: Age is a Significant Predictor of NFL Draft Position


Do NFL GMs read RotoViz?

Here at RotoViz, we tend to think that age is an important factor to consider when predicting a prospect’s potential as a professional (somanylinks), and other writers seem to share that view. It doesn’t appear that there are any football people who outright claim that “age doesn’t matter!” though there are a few who seem ambivalent about the importance of age versus other measurable or intangible characteristics. But in general, most NFL analysts agree that age might mean something.

But do NFL teams actually think like analysts, and factor age into their draft decisions? It isn’t always clear how NFL teams value youth in the draft. Publicly available scouting reports invariably list prospects’ birthdays, but unless a prospect is especially young or old, his age is rarely discussed as a major talking point in most pre-draft media coverage.

Most of us won’t get the chance to be in an NFL team’s draft war room, so we may never know for sure how much GMs and coaches consider a prospect’s age when deciding whether or not to draft him.

A little regression never hurt anybody

Fortunately, we have a tremendous amount of data that can give us an idea of whether age plays a part in draft decision making. And we have a fine tool, multiple regression modeling, to help us make sense of that data.¹

Take a look at this figure, plotting average draft position by a prospect’s age for offensive skill positions, for players who were active for at least one game from 2000 to 2013.² The age is calculated down to the day, but for the sake of clarity I’m just plotting full years on the x-axis. Each dot represents a player, color-coded by position. The dashed line is a quadratic regression line, and the gray surrounding the line is the confidence of fit (narrower gray area means a better fit).

[Figure: draft position by draft age]

The dots may look noisy to our feeble human eyes, but to statistical modeling software, the linear relationship between age and draft position is strong and significant: on average, younger players get drafted before older players. The line is curved because I told it to be curved,³ which helped to uncover a significant quadratic relationship between draft age and position: as the age of a prospect increases, the effect of age on draft position gets smaller.
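For readers who want to try a curved fit like this themselves, here is a minimal Python sketch of a quadratic regression on log-transformed age. It is not the code behind the figure: the data are synthetic, the function name `fit_quadratic` is made up for illustration, and centering log-age before squaring it is an assumption the post does not state. The outcome is also left on its raw scale here to keep the example short.

```python
import numpy as np

def fit_quadratic(age_years, draft_pick):
    """Least-squares fit of pick ~ b0 + b1*x + b2*x^2, where x = centered ln(age)."""
    x = np.log(np.asarray(age_years, dtype=float))
    x = x - x.mean()  # centering is an assumption, not stated in the article
    y = np.asarray(draft_pick, dtype=float)
    X = np.column_stack([np.ones_like(x), x, x ** 2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())
    return coef, r2

# Synthetic data for illustration only (NOT the article's dataset).
rng = np.random.default_rng(0)
ages = rng.uniform(20.5, 25.5, size=500)
x_true = np.log(ages) - np.log(ages).mean()
picks = 120 + 300 * x_true - 3000 * x_true ** 2 + rng.normal(0, 20, size=500)

coef, r2 = fit_quadratic(ages, picks)
print("intercept, linear, quadratic:", coef)
print("R^2:", r2)
```

A positive linear term paired with a negative quadratic term is the pattern described above: older prospects go later, but the penalty flattens out as age increases.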

If you’re the kind of person who likes to read regression reports, this is your lucky day. To orient you to my formatting choices: above each table is a list of the model’s predictors and the model’s R², which can be interpreted as a measure of how good a job our predictors are doing;⁴ the estimate column tells you the direction of the effect (in this case, negative estimates mean earlier draft picks);⁵ the unique error reduced and the F⁶ columns tell you how much each predictor affects your model’s performance (bigger values, bigger effects); and the p-value is the probability that we’d get an F at least this large if there weren’t really an effect in the population.

Model 1: Draft position by Age, Age²; R² = .089

Effect   Estimate   Unique Error Reduced   F         p
Age      8.052      8.85%                  126.255   <.0001
Age²     -28.412    0.75%                  10.637    0.0011
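The sketch below shows one plausible way to compute columns like these. It assumes "unique error reduced" means the drop in explained variance when a single predictor is removed from the full model, expressed as a share of total variance; the post does not spell out the definition. The function names and the data are invented for illustration.

```python
import numpy as np

def sse(X, y):
    """Sum of squared residuals from an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def unique_effects(X, y, names):
    """Drop each non-intercept column in turn, refit, and compare to the full model."""
    n, k = X.shape
    sst = float(((y - y.mean()) ** 2).sum())
    sse_full = sse(X, y)
    df_resid = n - k
    report = {"R2": 1.0 - sse_full / sst}
    for j, name in enumerate(names, start=1):  # column 0 is the intercept
        gain = sse(np.delete(X, j, axis=1), y) - sse_full
        report[name] = {
            "unique_error_reduced": gain / sst,   # assumed definition
            "F": gain / (sse_full / df_resid),    # 1-df model-comparison F
        }
    return report

# Synthetic data for illustration only.
rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

report = unique_effects(X, y, ["x1", "x2"])
print(report)
```

Under this reading, F equals the unique error reduced divided by (1 - R²), times the residual degrees of freedom. As a rough check against Model 1, and assuming the 1300 denominator degrees of freedom in footnote 7 are the model's residual degrees of freedom: 0.0885 / (1 - 0.089) × 1300 is about 126.3, close to the reported F of 126.255 for Age.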

We’re going to need a bigger model…

Of course, a model predicting draft position by age doesn’t account for all of the other factors that NFL teams care about when picking players – an NFL GM who based his drafts purely on prospect age probably wouldn’t be a GM for very long. As I argued recently, we should use all the relevant information available to us when forecasting players. Plus, the model of draft position by age is underfit⁷ – the predictors don’t follow the data very well, primarily because the data are that amorphous blob of dots from the beginning of the post.

So let’s see if the effect of draft age holds up against some other predictors, which will also help to control for some of the weirdness in the shape of the data. We’ll do this by adding small sets of predictors to our model (that already includes age), and see how the overall accuracy of the model, and the proportion of error reduced by age, changes with each addition.
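Here is a hedged sketch of that step-by-step procedure: start with the age terms, add a block of predictors (including an interaction built as the product of two columns), and track R² after each step. The data are synthetic, the helper names are hypothetical, and centering the log-transformed variables is an assumption rather than something stated in the post.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

def center_log(values):
    """Natural-log transform, then subtract the mean (centering is an assumption)."""
    x = np.log(np.asarray(values, dtype=float))
    return x - x.mean()

# Synthetic prospects for illustration only.
rng = np.random.default_rng(1)
n = 400
age = center_log(rng.uniform(20.5, 25.5, n))
height = center_log(rng.normal(73, 2.5, n))    # inches
weight = center_log(rng.normal(215, 20, n))    # pounds
pick = 120 + 300 * age - 400 * height + 50 * weight + rng.normal(0, 40, n)

# Each step adds a small set of columns to the model that came before it.
steps = {
    "Age, Age^2": [age, age ** 2],
    "+ Height, Weight, Height*Weight": [height, weight, height * weight],
}

columns = [np.ones(n)]  # intercept
r2_by_step = {}
for label, new_columns in steps.items():
    columns.extend(new_columns)
    r2_by_step[label] = r_squared(np.column_stack(columns), pick)

for label, value in r2_by_step.items():
    print(f"{label}: R^2 = {value:.3f}")
```

Because each model contains the previous one, R² can only stay the same or go up from step to step. The interesting part is how much it goes up, and what happens to age's share along the way.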

Height and weight are a pretty easy place to start, since we have that information on all of the players included in this dataset. I’m also including a height*weight interaction, in case NFL teams are especially partial to young players that are tall and heavy, or something along those lines. Here’s your table:

Model 2: Draft position by Age, Age², Height, Weight, Height*Weight; R² = .114

Effect          Estimate   Unique Error Reduced   F         p
Age             8.444      9.59%                  140.296   <.0001
Age²            -27.881    0.71%                  10.456    0.0013
Height          -6.017     2.33%                  34.112    <.0001
Weight          1.111      0.76%                  11.194    0.0008
Height*Weight   13.114     0.22%                  3.260     0.0712

Age is holding strong! And based on the F-value, age does a much better job of predicting draft position than height or weight. Taller guys get drafted earlier than their shorter peers, and heavier players are drafted later than light ones. The direction of the interaction suggests that NFL teams don’t favor skill players who are especially tall for their weight or especially heavy for their height.

Next, let’s add another common metric, 40yd dash time. I’m also going to test the interaction of 40 times with height and weight, as it could be that age doesn’t matter as much for guys who are much bigger and faster than their peers.

Model 3: Draft position by Age, Age², Height, Weight, Height*Weight, 40yd dash, 40*Height, 40*Weight, 40*Height*Weight; R² = .160

Effect             Estimate   Unique Error Reduced   F        p
Age                6.297      5.69%                  58.293   <.0001
Age²               -24.001    0.66%                  6.746    0.0096
Height             -9.096     5.08%                  52.042   <.0001
Weight             -0.393     0.07%                  0.707    0.4006
Height*Weight      20.807     0.41%                  4.238    0.0398
40yd Dash          4.889      1.48%                  15.175   0.0001
40*Height          -174.717   2.47%                  25.312   <.0001
40*Weight          45.976     1.75%                  17.989   <.0001
40*Height*Weight   864.250    1.20%                  12.279   0.0005

Now age has some competition, but it still rules the roost in this model. Once we control for 40yd dash, height becomes a much bigger deal – as prospects get taller, they get picked earlier. The effect of weight has gone away entirely in this model, and the height*weight interaction is still in the same direction as the previous model. 40yd dash time is a significant predictor, but it isn’t as strong as its interactions with height and weight: as players get taller, the relationship between 40 time and draft position gets smaller (teams value height even if it doesn’t come with speed), and as players get heavier, the relationship between 40 time and draft position gets bigger (if you’re heavy, you better be fast).
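To make the interaction reading concrete, the sketch below plugs the Model 3 estimates into the usual formula for a conditional slope: the effect of 40 time at a given height and weight is the 40yd dash estimate plus each interaction estimate times the corresponding height and weight values. The input values are hypothetical, and the sketch assumes the predictors are log-transformed and centered so that 0 stands for an average player. The post does not say whether the variables were centered, so treat the outputs as directional only.

```python
# Model 3 estimates, copied from the regression table above.
B_40 = 4.889
B_40_HEIGHT = -174.717
B_40_WEIGHT = 45.976
B_40_HEIGHT_WEIGHT = 864.250

def slope_of_40_time(height, weight):
    """Conditional slope of draft position on 40 time at the given
    height and weight (both assumed to be in centered log units)."""
    return (B_40
            + B_40_HEIGHT * height
            + B_40_WEIGHT * weight
            + B_40_HEIGHT_WEIGHT * height * weight)

# Hypothetical players a little below or above average on one dimension.
shorter = slope_of_40_time(-0.02, 0.0)
taller = slope_of_40_time(0.02, 0.0)
lighter = slope_of_40_time(0.0, -0.05)
heavier = slope_of_40_time(0.0, 0.05)

print("40-time slope, shorter vs. taller:", shorter, taller)
print("40-time slope, lighter vs. heavier:", lighter, heavier)
```

The slope shrinks for the taller player and grows for the heavier one, which is the pattern described in the paragraph above.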

Enough of this continuous nonsense

Up until now, all of the predictors have been continuous variables: this last model adds two categorical (i.e. not a number) predictors to our model. The first is whether a prospect came from a college in a conference with an automatic BCS bid. This is a rough estimate of quality of college instruction – presumably, BCS schools can afford to pay for better coaches, who are not only better at scouting and recruiting talented high school players, but are better at teaching their players how to become better football players.

The second predictor is a player’s position. Different position groups carry different value, which is why it’s a lot more common to see WRs and QBs drafted earlier than RBs. Based on my limited knowledge of the draft process, I expect that these additional predictors will dwarf the others in terms of how much error they reduce, but let’s find out:

Model 4: Draft position by Age, Age², Height, Weight, Height*Weight, 40yd dash, 40*Height, 40*Weight, 40*Height*Weight, AutoBCS, Position; R² = .272

Effect                     Estimate   Unique Error Reduced   F        p
Age                        5.866      4.78%                  56.259   <.0001
Age²                       -23.852    0.64%                  7.576    0.006
Height                     -4.305     0.56%                  6.539    0.0107
Weight                     -2.781     1.46%                  17.194   <.0001
Height*Weight              -14.654    0.14%                  1.692    0.1937
40yd Dash                  10.906     5.58%                  65.614   <.0001
40*Height                  -41.866    0.12%                  1.382    0.2401
40*Weight                  9.829      0.07%                  0.793    0.3736
40*Height*Weight           232.648    0.08%                  0.946    0.3311
AutoBCS                    -0.181     2.04%                  24.021   <.0001
Position (QB vs. others)   -1.256     8.38%                  98.565   <.0001

Finally, age has been knocked from its perch as the strongest predictor in our model. Now, the title goes to position, reducing 8.4% of the error, by virtue of QBs being drafted significantly earlier than other positions. The next best predictor is 40yd dash time, with a 5.6% reduction in error. The age variables combined to reduce about 5.4% of the error, still good for third place. Not bad, age!
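Categorical predictors like AutoBCS and Position have to be turned into numbers before they can go into a regression. The sketch below uses simple 0/1 indicator coding as one possible approach. The post does not say which coding scheme was used, and the size of an estimate depends on that choice, so this is an illustration rather than a reconstruction. The function name and sample rows are made up.

```python
import numpy as np

def code_categoricals(auto_bcs_flags, positions):
    """Return two 0/1 columns: automatic-BCS conference, and QB vs. everyone else."""
    auto_bcs = np.array([1.0 if flag else 0.0 for flag in auto_bcs_flags])
    is_qb = np.array([1.0 if pos == "QB" else 0.0 for pos in positions])
    return np.column_stack([auto_bcs, is_qb])

# Three hypothetical prospects.
demo = code_categoricals([True, False, True], ["QB", "RB", "WR"])
print(demo)
```

These columns can be stacked next to the continuous predictors in the design matrix. With 0/1 coding, the estimate for each one is the expected shift in the outcome for the group coded 1, holding the other predictors fixed.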

This plot is a summary of the models’ performance along with the error reduced by age and other predictors. As the models got better, age accounted for smaller chunks of the overall improvement, but it still played a bigger role than height, weight, and even whether the player attended a BCS school.

[Figure: errors reduced across models]

On and on and on…

At this point, it seems clear that age plays a part in where a prospect is drafted, but this is only part of the story. These models looked at a small subset of possible predictors. There are countless additional factors that should be considered, most importantly college performance, and I’ll touch on them in the future as my database expands. The brain trust at RotoViz maintains a big database of college WR statistics, so that might be a good place to start.

Beyond that, there are many other questions that need to be answered about the effect of age on performance in the NFL. These analyses suggest that age has a significant effect on when a prospect is drafted. But do prospects who enter the league at a younger age have better outcomes? Are there differences across position groups? Can some coaches get more out of young players or veterans than other coaches?

We’ve got the data to answer all of these questions, and plenty more. Check back soon.

  1. For all of the statistical analyses, continuous variables were log-transformed (actually, natural log-transformed), to ensure that they all roughly followed a normal distribution.
  2. The vast majority of this data came from, and the rest came from
  3. By this I mean I selected the curvilinear line from my software’s “line of fit” options. You can do this with Excel too.
  4. An R² = 1 means that your model perfectly explains your data, and an R² = 0 means that your model doesn’t explain your data at all.
  5. Except for the interaction terms, in which case the estimate means that as you go from low to high values on one of the terms in the interaction, the relationship between the other term and your dependent variable gets larger (positive interaction) or smaller (negative interaction).
  6. Yes, this is the same F as you see in omnibus ANOVA tests; no, I’m not using it wrong; some people prefer to use a t-statistic here, but for single-predictor tests like these the two are equivalent (F = t²).
  7. F(816,1300)=1.230, p=.0060, but you’re a terrible nerd if you were curious about this. /fistbump
