In the short time that the CFB QB stat filter app has been up, I’ve done a little mining of the data to look for predictive relationships. To offer some context, I should say that I think predicting QB success is extremely difficult, and I don’t know that things are going to get a lot better any time soon. The primary problem you encounter when trying to predict QB success is that teammate effects at both the college and pro level obscure the picture. We also have small samples to test against when it comes to QBs: whereas NFL teams will often have 3 or 4 WRs on the field, there’s just 1 QB on the field per team at any time.
But I have been able to find one stat that has some relationship to NFL Yards/Attempt: college Interception Rate on 2nd/3rd/4th down with 8 or more yards to go, facing a top 35th percentile defense. This isn’t an all-explanatory relationship by any means, but it had some predictive value, so I thought I would post it. A good question to ask is: what is this information good for? I would probably just look at the list and try to figure out what it says about a guy that he was good or bad on this measure. For instance, are the guys at the top of the list only there because they’re ultra conservative? Are any of the guys at the bottom of the list there because they’re careless or inaccurate, or perhaps have a weak arm when they have to make a throw?
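As a minimal sketch of how a filter like this might be applied to play-level data, the Python below keeps only pass attempts on 2nd/3rd/4th down with 8 or more yards to go against a top-35% defense, then computes interceptions per attempt. The `PassPlay` record, its field names, and the `defense_percentile` scale (0 to 100, higher meaning a better defense) are all assumptions made for illustration; the app's actual data layout and the way defenses are ranked are not described here.

```python
from dataclasses import dataclass


@dataclass
class PassPlay:
    """One pass attempt. All field names are hypothetical."""
    down: int
    yards_to_go: int
    defense_percentile: float  # assumed scale: 0-100, higher = better defense
    intercepted: bool


def in_filter(play, min_distance=8, top_share=35.0):
    """True for 2nd/3rd/4th down, 8+ yards to go, against a top-35% defense."""
    return (
        play.down in (2, 3, 4)
        and play.yards_to_go >= min_distance
        and play.defense_percentile >= 100.0 - top_share
    )


def filtered_int_rate(plays):
    """Interceptions per qualifying attempt, or None if nothing qualifies."""
    qualifying = [p for p in plays if in_filter(p)]
    if not qualifying:
        return None
    return sum(p.intercepted for p in qualifying) / len(qualifying)


# Made-up plays, purely to exercise the filter (not real data).
sample_plays = [
    PassPlay(3, 10, 90.0, False),  # qualifies
    PassPlay(2, 9, 70.0, True),    # qualifies, intercepted
    PassPlay(1, 10, 95.0, True),   # excluded: 1st down
    PassPlay(3, 4, 80.0, False),   # excluded: fewer than 8 yards to go
    PassPlay(4, 12, 50.0, True),   # excluded: defense outside the top 35%
    PassPlay(2, 8, 65.0, False),   # qualifies (boundary values are included)
]

rate = filtered_int_rate(sample_plays)
print(f"{rate:.1%}")  # prints 33.3%
```

Returning `None` when no attempts qualify, rather than 0, keeps a QB with no qualifying throws from looking like one who never threw an interception.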
INT rate in this case has an r-squared of .26 with NFL yards/attempt, meaning it explains about 26% of the variance in NFL Y/A. That might seem low, but I actually think it’s pretty high. First, NFL draft position doesn’t even explain very much of Y/A. Second, this is just one college stat. We can go back to the database to get more information and look for additional relationships.
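For readers who want to check a figure like this themselves, here is a small sketch of r-squared for a single predictor, computed as the square of the Pearson correlation. It assumes the reported number comes from a simple one-variable comparison, which the post does not state explicitly; the function names and the input lists are hypothetical and the numbers are not real QB data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return sxy / math.sqrt(sxx * syy)


def r_squared(xs, ys):
    """Share of variance in ys explained by a one-variable linear fit on xs."""
    return pearson_r(xs, ys) ** 2


# Toy inputs chosen so the arithmetic is easy to follow (not real data).
toy_x = [1, 2, 3, 4]
toy_y = [1, 3, 2, 4]
print(round(r_squared(toy_x, toy_y), 2))  # prints 0.64
```

By the same arithmetic, an r-squared of .26 corresponds to a correlation with a magnitude of about 0.51, since the square root of .26 is roughly 0.51.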
Int Rate facing 2nd/3rd/4th Down and 8 or More to Go, vs. Top 35th Percentile Defenses
|Zac Dysert||Miami (Ohio)||61||1.6%|
|Geno Smith||West Virginia||240||1.7%|
|Mike Glennon||North Carolina State||301||3.0%|
|EJ Manuel||Florida State||230||3.9%|
Using the filter described above also changes the correlation matrix that I posted earlier. Several of the correlations improved dramatically. The new correlation matrix looks like this:
I think it is worth noting that some of E.J. Manuel’s value comes from his ability to run, so I don’t think looking at passing stats alone will reveal all of his value.