# Using Machine Learning to Create Daily Fantasy Football Projections for the Wild Card Round

RotoDoc holds a Ph.D. in mathematics and tackles sports data to give you a competitive edge in your fantasy games.

The regular season is over. For many, that means a serious case of fantasy football withdrawal. But for certain die-hard types like me, there are still three more weekends of daily fantasy left before we turn our attention toward the 2015 season. With a full season of data at my disposal, I took the time to create a data-driven model that projects fantasy production to aid me in creating my lineups. Projections are often overlooked by the novice player, but they are extremely important in daily fantasy because normalizing expected scoring to a player’s salary helps us find cheap value plays or lets us differentiate a single player from a group of similarly priced players.

## Random Forests

The model I created uses a machine learning algorithm called a random forest that is trained on the regular season data and then used to project this weekend’s wild card games. Once we have those projections, we can use them in conjunction with our favorite daily fantasy site’s player salaries to find the top value plays for this weekend’s games.
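For readers who have not met the algorithm before, here is a minimal pure-Python sketch of how a random forest regressor works: each tree is grown on a bootstrap resample of the training rows, each split considers only a random subset of the features, and the forest's projection is the average of the trees. This is not the code behind the projections in this article. The feature columns (six-game average, season average, Vegas team total), the training rows, and every parameter value are hypothetical, and in practice one would more likely reach for a library implementation such as scikit-learn's `RandomForestRegressor`.

```python
import random
from statistics import fmean

def fit_tree(X, y, rng, max_depth=3, min_leaf=2, depth=0):
    """Grow one regression tree: a float leaf or a (feature, threshold, left, right) node."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return fmean(y)
    n_features = len(X[0])
    # Only a random subset of features is considered at each split.
    candidates = rng.sample(range(n_features), max(1, n_features // 2))
    best = None  # (sum of squared errors, feature index, threshold)
    for f in candidates:
        for thr in sorted({row[f] for row in X})[:-1]:
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = fmean(left), fmean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr)
    if best is None:
        return fmean(y)
    _, f, thr = best
    li = [i for i, row in enumerate(X) if row[f] <= thr]
    ri = [i for i, row in enumerate(X) if row[f] > thr]
    return (
        f,
        thr,
        fit_tree([X[i] for i in li], [y[i] for i in li], rng, max_depth, min_leaf, depth + 1),
        fit_tree([X[i] for i in ri], [y[i] for i in ri], rng, max_depth, min_leaf, depth + 1),
    )

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's mean target."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node

def fit_forest(X, y, n_trees=50, seed=0):
    """Train each tree on a bootstrap resample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(fit_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """The forest's projection is the average of the individual trees."""
    return fmean([predict_tree(tree, row) for tree in forest])

# Hypothetical training rows: [six-game average, season average, Vegas team total]
X_train = [
    [18.2, 17.5, 27.0], [12.1, 11.0, 24.5], [6.3, 7.9, 21.0], [21.4, 19.8, 28.5],
    [9.7, 10.2, 23.0], [4.1, 5.5, 20.5], [15.6, 14.1, 26.0], [7.8, 8.8, 22.5],
    [11.3, 12.4, 25.0], [3.2, 4.0, 19.5], [16.9, 15.7, 27.5], [5.5, 6.1, 21.5],
]
y_train = [19.1, 10.4, 5.2, 24.3, 11.8, 3.0, 13.7, 9.1, 12.5, 2.4, 18.0, 4.9]

forest = fit_forest(X_train, y_train)
projection = predict_forest(forest, [14.0, 13.0, 25.5])  # a hypothetical player
```

Averaging many de-correlated trees is what keeps the forest from memorizing any single player's odd week, which is the main reason the approach suits noisy data like weekly fantasy scoring.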

The model takes into account all sorts of factors, the most important being recent fantasy production. To find the optimal number of games for the recent history data point, I took the average fantasy production over a player’s last X games and plotted it against actual fantasy points. The table below shows that the average production over the player’s previous six games was the best predictor of actual fantasy production, as measured by the R-squared value of a simple linear regression.

| | Last 1 | Last 2 | Last 3 | Last 4 | Last 5 | Last 6 | Last 7 | Last 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-squared | 0.190 | 0.250 | 0.277 | 0.293 | 0.302 | 0.309 | 0.307 | 0.298 |

This result is actually quite intuitive. The most recent game or two is usually not enough to accurately determine future production, but as we climb back farther into a player’s history the predictive power increases. However, when we go back too far, older results become less predictive, presumably for reasons such as personnel and scheme changes.
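The window comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the game logs are made-up numbers, the function names are my own for this example, and the R-squared is computed as the squared Pearson correlation, which equals the R-squared of a simple linear regression with one predictor.

```python
from statistics import fmean

def trailing_average(points, window):
    """Pair each game's score with the mean of the `window` games before it."""
    return [
        (fmean(points[i - window:i]), points[i])
        for i in range(window, len(points))
    ]

def r_squared(pairs):
    """R-squared of a one-predictor linear regression (squared correlation)."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def r_squared_by_window(game_logs, windows=range(1, 9)):
    """Pool every player's (trailing average, actual) pairs for each window size."""
    results = {}
    for w in windows:
        pairs = [p for pts in game_logs.values() for p in trailing_average(pts, w)]
        results[w] = r_squared(pairs)
    return results

# Hypothetical weekly fantasy points for three made-up players
game_logs = {
    "Player A": [12.1, 8.4, 15.0, 9.7, 11.2, 14.8, 10.3, 13.5, 7.9, 12.6, 16.2, 9.9],
    "Player B": [4.2, 6.1, 3.8, 7.5, 5.0, 2.9, 6.6, 4.4, 5.8, 3.5, 6.9, 4.0],
    "Player C": [21.3, 17.8, 25.1, 19.4, 23.0, 16.5, 22.7, 20.1, 18.9, 24.4, 19.0, 21.8],
}

scores = r_squared_by_window(game_logs)
best_window = max(scores, key=scores.get)
```

With real data, `scores` would be the equivalent of the R-squared row in the table, and `best_window` would be the number of games to average.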

However, I don’t throw out all that other useful data. I also use data from more recent weeks to reflect any possible changes in usage, and I use the average production over all games for the whole season as an input factor in the model. Combined with some other metrics, including match-up quality and Vegas lines, these inputs give us a highly predictive model that we can use for fantasy purposes.

This model isn’t an end-all, be-all projection; rather, it’s another data point for you to use when creating your rosters. I still rely on the various GLSP projection apps as another data point, especially because they’re great at filtering out irrelevant games, something I’m working toward improving with my model. However, I am pleased with its fit to the season-long data it was trained on. Note: the distribution of fantasy points is highly skewed. I use a Box-Cox transformation to normalize the data, as displayed in the graph below. I then transform it back to the correct point system for the final projections. For the whole 2014 season, the model explains around 60 percent of the variation in fantasy points for all players, despite things like injuries and unexpected depth chart replacements undoubtedly throwing off some of the data points. It’s not as predictive on new data, explaining ~30 percent of the variation, but it’s a work in progress.
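To make the transform-then-invert step concrete, here is a small sketch of a Box-Cox transformation and its inverse. The power `LAMBDA` and the offset `SHIFT` are assumptions chosen purely for illustration: Box-Cox requires strictly positive inputs, so some offset is needed when a player scores zero or fewer points, and the power is normally estimated from the training data (for example with `scipy.stats.boxcox`). The values used for the actual projections are not given in this article.

```python
import math

SHIFT = 1.0   # hypothetical offset so every shifted value is strictly positive
LAMBDA = 0.5  # hypothetical power; normally estimated from the training data

def boxcox(x, lam=LAMBDA, shift=SHIFT):
    """Box-Cox transform: log(z) when lam == 0, otherwise (z**lam - 1) / lam."""
    z = x + shift
    if z <= 0:
        raise ValueError("Box-Cox needs strictly positive input after shifting")
    return math.log(z) if lam == 0 else (z ** lam - 1) / lam

def inv_boxcox(y, lam=LAMBDA, shift=SHIFT):
    """Undo the transform so projections are back on the fantasy-point scale."""
    z = math.exp(y) if lam == 0 else (lam * y + 1) ** (1 / lam)
    return z - shift

# Made-up, right-skewed fantasy point totals
raw_points = [0.0, 2.5, 7.1, 12.8, 27.8]
transformed = [boxcox(p) for p in raw_points]    # the scale a model would train on
recovered = [inv_boxcox(t) for t in transformed]  # back to fantasy points
```

The model is fit on the transformed scale, where the long right tail is compressed, and each projection is passed through the inverse before being reported.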

The model has a few other limitations, all of which I’m working to address for next year to provide even greater predictive value. I need to incorporate injury status updates, depth chart replacements, and a data-driven filter, much like the one the GLSP projection apps use, to better identify relevant games. Additionally, my model is currently tuned for DraftKings scoring, since that’s the site I mainly use. I plan to implement a more flexible version soon to accommodate any scoring system.

## Projection Picks

So how does the model view this weekend’s games? For one, Cole Beasley looks to be a great value pick if you’re playing in a GPP. He’s averaged nearly 12 DraftKings points per contest over his last six games, has an average match-up, is favored at home with a fairly high expected point total of 27.75 points, and costs a mere \$3700. Antonio Brown also appears as a must-play in both GPP and cash game formats against a Baltimore secondary that on average allows WRs to add 4.5 more points to their season average.

Where does the model fall short? I believe Greg Olsen is underrated by the model, being punished for lackluster production the past two weeks as well as a low-scoring team total of 21.75 points. Arizona is absolutely awful against the tight end position, and I expect Cam Newton to lean on Olsen in the passing game. With that said, the model still projects him as the highest scoring TE in the wild card round, just not in points per thousand dollars (P/\$). Since we can find value elsewhere, I’ll probably have significant exposure to him in all formats. It’s also projecting Le’Veon Bell as if he is fully healthy. We know that’s not the case, and as we get some clarity on his status, I’ll apply an adjustment to his projection.

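| Name | Position | Team | Opp | DK Salary | Proj. Pts | P/\$ |
| --- | --- | --- | --- | --- | --- | --- |
| Antonio Brown | WR | PIT | BAL | 8900 | 27.84 | 3.128 |
| Le'Veon Bell | RB | PIT | BAL | 9000 | 23.68 | 2.631 |
| Andrew Luck | QB | IND | CIN | 8800 | 21.45 | 2.438 |
| DeMarco Murray | RB | DAL | DET | 8800 | 20.86 | 2.370 |
| Dez Bryant | WR | DAL | DET | 8500 | 20.84 | 2.452 |
| Calvin Johnson | WR | DET | DAL | 8800 | 20.37 | 2.315 |
| Matthew Stafford | QB | DET | DAL | 7200 | 20.08 | 2.788 |
| Tony Romo | QB | DAL | DET | 7500 | 19.09 | 2.546 |
| A.J. Green | WR | CIN | IND | 8000 | 18.15 | 2.269 |
| Andy Dalton | QB | CIN | IND | 6100 | 18.09 | 2.965 |
| T.Y. Hilton | WR | IND | CIN | 7800 | 18.04 | 2.313 |
| Ben Roethlisberger | QB | PIT | BAL | 7300 | 17.94 | 2.458 |
| Joe Flacco | QB | BAL | PIT | 6700 | 17.47 | 2.607 |
| Cam Newton | QB | CAR | ARI | 7600 | 15.95 | 2.099 |
| Jeremy Hill | RB | CIN | IND | 6200 | 15.83 | 2.553 |
| Greg Olsen | TE | CAR | ARI | 5500 | 13.24 | 2.408 |
| Michael Floyd | WR | ARI | CAR | 5200 | 12.91 | 2.483 |
| Torrey Smith | WR | BAL | PIT | 5400 | 12.83 | 2.376 |
| Cole Beasley | WR | DAL | DET | 3700 | 12.79 | 3.457 |
| Joique Bell | RB | DET | DAL | 5500 | 12.67 | 2.303 |
| Jason Witten | TE | DAL | DET | 4500 | 12.33 | 2.740 |
| Jonathan Stewart | RB | CAR | ARI | 5300 | 12.29 | 2.319 |
| Kelvin Benjamin | WR | CAR | ARI | 5900 | 12.20 | 2.068 |
| Giovani Bernard | RB | CIN | IND | 5600 | 11.80 | 2.108 |
| Justin Forsett | RB | BAL | PIT | 6100 | 11.10 | 1.819 |
| Coby Fleener | TE | IND | CIN | 5000 | 11.02 | 2.203 |
| Heath Miller | TE | PIT | BAL | 3900 | 10.56 | 2.708 |
| Jermaine Gresham | TE | CIN | IND | 3700 | 9.68 | 2.615 |
| Golden Tate | WR | DET | DAL | 5500 | 9.60 | 1.745 |
| Martavis Bryant | WR | PIT | BAL | 4100 | 9.28 | 2.263 |
| Ryan Lindley | QB | ARI | CAR | 5200 | 9.02 | 1.735 |
| Kamar Aiken | WR | BAL | PIT | 3000 | 8.31 | 2.771 |
| Donte Moncrief | WR | IND | CIN | 3600 | 8.00 | 2.223 |
| Reggie Bush | RB | DET | DAL | 4800 | 7.08 | 1.474 |
| Markus Wheaton | WR | PIT | BAL | 3100 | 7.07 | 2.282 |
| Philly Brown | WR | CAR | ARI | 3000 | 7.06 | 2.353 |
| Theo Riddick | RB | DET | DAL | 3200 | 6.90 | 2.157 |
| Jerricho Cotchery | WR | CAR | ARI | 3400 | 6.90 | 2.029 |
| John Brown | WR | ARI | CAR | 4100 | 6.73 | 1.641 |
| Terrance Williams | WR | DAL | DET | 3400 | 6.69 | 1.968 |
| Marlon Brown | WR | BAL | PIT | 3000 | 6.52 | 2.173 |
| Owen Daniels | TE | BAL | PIT | 3300 | 6.45 | 1.954 |
| Jeremy Ross | WR | DET | DAL | 3000 | 6.44 | 2.146 |
| Zurlon Tipton | RB | IND | CIN | 3100 | 6.26 | 2.021 |
| Hakeem Nicks | WR | IND | CIN | 3300 | 6.01 | 1.822 |
| Mohamed Sanu | WR | CIN | IND | 4200 | 5.81 | 1.383 |
| Kerwynn Williams | RB | ARI | CAR | 3700 | 5.78 | 1.561 |
| Larry Fitzgerald | WR | ARI | CAR | 4700 | 5.33 | 1.135 |
| Ryan Hewitt | TE | CIN | IND | 3000 | 5.26 | 1.753 |
| Joseph Randle | RB | DAL | DET | 3000 | 5.17 | 1.724 |
| Ed Dickson | TE | CAR | ARI | 3000 | 5.17 | 1.724 |
| Reggie Wayne | WR | IND | CIN | 4300 | 4.97 | 1.155 |
| Jaron Brown | WR | ARI | CAR | 3100 | 4.93 | 1.591 |
| Stepfan Taylor | RB | ARI | CAR | 3000 | 4.88 | 1.628 |
| Eric Ebron | TE | DET | DAL | 3000 | 4.58 | 1.526 |
| Trent Richardson | RB | IND | CIN | 3400 | 4.54 | 1.336 |
| Darren Fells | TE | ARI | CAR | 3000 | 4.40 | 1.467 |
| Crockett Gillmore | TE | BAL | PIT | 3000 | 4.19 | 1.397 |
| Dwayne Allen | TE | IND | CIN | 3800 | 4.02 | 1.059 |
| Michael Palmer | TE | PIT | BAL | 3000 | 3.97 | 1.323 |
| John Carlson | TE | ARI | CAR | 3000 | 3.97 | 1.322 |
| Justin Brown | WR | PIT | BAL | 3000 | 3.93 | 1.310 |
| Darrius Heyward-Bey | WR | PIT | BAL | 3000 | 3.93 | 1.310 |
| Brandon Tate | WR | CIN | IND | 3000 | 3.92 | 1.308 |
| Derek Anderson | QB | CAR | ARI | 5000 | 3.87 | 0.773 |
| Fozzy Whittaker | RB | CAR | ARI | 3000 | 3.79 | 1.262 |
| Kevin Brock | TE | CIN | IND | 3000 | 3.55 | 1.184 |
| Jacoby Jones | WR | BAL | PIT | 3000 | 3.48 | 1.161 |
| James Wright | WR | CIN | IND | 3000 | 3.44 | 1.147 |
| Logan Thomas | QB | ARI | CAR | 5000 | 3.39 | 0.678 |
| Lance Moore | WR | PIT | BAL | 3000 | 3.26 | 1.088 |
| Brandon Weeden | QB | DAL | DET | 5000 | 3.08 | 0.616 |
| Rob Housler | TE | ARI | CAR | 3000 | 3.06 | 1.019 |
| Greg Little | WR | CIN | IND | 3000 | 3.03 | 1.011 |
| Phillip Supernaw | TE | BAL | PIT | 3000 | 3.02 | 1.006 |
| Michael Campanaro | WR | BAL | PIT | 3000 | 3.01 | 1.003 |
| Dane Sanzenbacher | WR | CIN | IND | 3000 | 3.00 | 1.001 |
| Joe Webb | QB | CAR | ARI | 5000 | 2.99 | 0.597 |
| Matt Spaeth | TE | PIT | BAL | 3000 | 2.87 | 0.955 |
| Brandon Williams | TE | CAR | ARI | 3000 | 2.75 | 0.918 |
| Jed Collins | RB | DET | DAL | 3000 | 2.68 | 0.892 |
| Ryan Broyles | WR | DET | DAL | 3000 | 2.65 | 0.882 |
| Corey Fuller | WR | DET | DAL | 3000 | 2.53 | 0.843 |
| Brandon Pettigrew | TE | DET | DAL | 3000 | 2.53 | 0.842 |
| Marion Grice | RB | ARI | CAR | 3000 | 2.52 | 0.839 |
| Jack Doyle | TE | IND | CIN | 3000 | 2.51 | 0.835 |
| Fitzgerald Toussaint | RB | BAL | PIT | 3000 | 2.37 | 0.789 |
| Bernard Pierce | RB | BAL | PIT | 3000 | 2.37 | 0.788 |
| Brenton Bersin | WR | CAR | ARI | 3000 | 2.32 | 0.772 |
| Josh Cribbs | WR | IND | CIN | 3000 | 2.25 | 0.751 |
| Devin Street | WR | DAL | DET | 3000 | 2.13 | 0.710 |
| Jason Campbell | QB | CIN | IND | 5000 | 2.08 | 0.417 |
| James Hanna | TE | DAL | DET | 3000 | 1.97 | 0.657 |
| Gavin Escobar | TE | DAL | DET | 3000 | 1.91 | 0.636 |
| Dwayne Harris | WR | DAL | DET | 3000 | 1.84 | 0.613 |
| Robert Hughes | RB | ARI | CAR | 3000 | 1.76 | 0.587 |
| George Winn | RB | DET | DAL | 3000 | 1.72 | 0.574 |
| Lance Dunbar | RB | DAL | DET | 3000 | 1.70 | 0.566 |
| Cedric Peerman | RB | CIN | IND | 3000 | 1.60 | 0.533 |
| Kyle Juszczyk | RB | BAL | PIT | 3000 | 1.44 | 0.479 |
| Mike Tolbert | RB | CAR | ARI | 3000 | 1.17 | 0.391 |
| Will Johnson | RB | PIT | BAL | 3000 | 1.11 | 0.369 |
| Tyler Clutts | RB | DAL | DET | 3000 | 1.01 | 0.337 |
| Dri Archer | RB | PIT | BAL | 4500 | 0.94 | 0.209 |
| DeAngelo Williams | RB | CAR | ARI | 4000 | 0.86 | 0.215 |
| Josh Harris | RB | PIT | BAL | 3500 | 0.72 | 0.205 |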
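
The P/\$ column is simply projected points divided by salary in thousands of dollars. A minimal sketch of that calculation follows, using three rows copied from the table; the function and variable names are my own for this example. Because the table's projections are rounded to two decimals, recomputing P/\$ from the displayed values can differ from the published figure in the last digit for some players.

```python
def points_per_thousand(projected_points, salary):
    """Projected fantasy points per $1,000 of salary (the P/$ column)."""
    return projected_points / (salary / 1000)

# (name, DraftKings salary, projected points) copied from the table
players = [
    ("Antonio Brown", 8900, 27.84),
    ("Le'Veon Bell", 9000, 23.68),
    ("Cole Beasley", 3700, 12.79),
]

# Rank by value rather than by raw projection
ranked = sorted(
    players,
    key=lambda p: points_per_thousand(p[2], p[1]),
    reverse=True,
)

for name, salary, pts in ranked:
    print(f"{name:15s} ${salary}  {pts:5.2f} pts  {points_per_thousand(pts, salary):.3f} P/$")
```

Sorting this way is what moves a cheap player like Beasley ahead of higher-projected but more expensive options.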