Last night I read an awesome article about the wine tasting world that debunked the idea that there is really such a thing as a refined wine palate. Don’t get me wrong, wine tasting can still be fun, although a good amount of the fun I derive from the activity is loudly announcing that I think the juice tastes like church wine. This works with basically any wine, and it’s a hit with whoever served you the wine, assuming they have a sense of humor, which (in hindsight) I now realize they rarely do. If you go with the church-wine standby line and then mix in a few references to Boone’s Farm, the combination of the self-satisfaction of having come up with such witty material, along with the wine buzz that sneaks up on you, will make any wine tasting enjoyable.
Where were we? Oh right, football and wine tasting. This excellent article on the topic of wine tasting led to a series of eureka moments for me on the difference between perception and reality as it relates to experts. Let’s dive in:
A recent New Yorker piece describes a followup to Brochet’s 2001 study, wherein he served wine experts a run-of-the-mill Bordeaux in two different bottles:
One bottle bore the label of a fancy grand cru, the other of an ordinary vin de table. Although they were being served the exact same wine, the experts gave the bottles nearly opposite descriptions. The grand cru was summarized as being “agreeable,” “woody,” “complex,” “balanced,” and “rounded,” while the most popular adjectives for the vin de table included “weak,” “short,” “light,” “flat,” and “faulty.”
Holy shit. I’ve been saying for some time that I think football experts have a difficult time distinguishing between prospects based on the uniforms they wear. The example I’ve used in the past is that Greg Little was compared to Hakeem Nicks coming out of college even though they didn’t resemble each other at all. I’ve also pointed out that Stevie Johnson and Reggie Wayne were similar coming out of college, except that Wayne had played at Miami and Johnson played at Kentucky, which probably affected their draft grades.
More from the article:
In 1996, research published in the Journal of Experimental Psychology concluded that wine experts cannot reliably identify more than three or four of a wine’s flavor components. Most wine critics routinely report tasting six or more. The wine review excerpted in the top image for this post, for example (which is a real review, by the way – somebody actually wrote those words about a bottle of wine, in earnest) lists the following components in the wine’s “principle flavor” profile: “red roses, lavender, geranium, dried hibiscus flowers, cranberry raisins, currant jelly, mango with skins [Ed. note: jesus wine-swilling christ – mango with skins?], red plums, cobbler, cinnamon, star anise, blackberry bramble, whole black peppercorn,” and more than a dozen other flavors that I refuse to continue listing lest my head implode.
What the article is saying is that experts multiply their actual understanding when describing the subject matter. They can identify three or four things, but then they somehow turn those three or four things into twenty. Does that sound familiar? How about when evaluating a prospect whose principal upside could be described as “had a pretty good junior season,” and then exploding that into his ability to keep his eyes downfield, his ability to throw the ball away, pinpoint accuracy in the 5-15 yard range, his use of pump fakes and shoulder fakes to manipulate defenders, a consistent willingness to overthrow receivers rather than underthrow them (I’m pulling these from an actual evaluation, in case you haven’t already guessed), the good tempo with which he operates the offense, etc.? Care to guess who that was written about? You know it’s going to be bad, right? I think “had a pretty good junior season” would have been a decent way of summarizing Blaine Gabbert’s chances for NFL success, but I guess it’s also possible to keep digging until you find all manner of other attributes. Kind of like finding hints of dried hibiscus flowers.
I’m not saying that none of those attributes could exist in a football player, or even that the eval written on Gabbert was inaccurate. My argument would be that 75-80% of our understanding of a football player can be boiled down to what you might call principal components, but that experts often have a flair for multiplying those principal components endlessly. As Occam opined, “entities must not be multiplied beyond necessity.” This idea that things can be boiled down to their principal components is one reason I don’t worry about our statistics-based approach, which is very light on game watching and very heavy on finding statistics that can predict some portion of the future accurately. Those stats are our principal components. They might be missing things like whether a running back has good vision behind the line of scrimmage, but we’re just assuming that whatever the player’s attributes are, they’ll show up in the box score. When I say assume, that’s the assumption as it relates to theory. As it relates to our models, we try to be transparent about whether or not we’ve backtested the stats we’re talking about.
To explain the idea of principal components in another way, think about all of the things that were said about Cam Newton coming out of college (especially related to his character), and how they were probably all accurate. But it didn’t matter for Newton’s chances of success in the NFL. He has the same Adjusted Yards/Attempt as Matt Ryan over the past two seasons, helps his team running the football, and is in general one of the most underrated players in the league. All of the bad stuff said about him was probably true and also irrelevant. His odds of success were related to a very small number of his attributes (principal components), and the rest of the information was noise.
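For readers unfamiliar with the stat mentioned above, Adjusted Yards/Attempt is passing yards per attempt with a bonus for touchdowns and a heavy penalty for interceptions (the 20/45 weights below follow the Pro-Football-Reference definition). This is a minimal sketch with made-up stat lines, not anyone’s real numbers:

```python
def adjusted_yards_per_attempt(pass_yards, pass_tds, interceptions, attempts):
    """AY/A: yards per attempt, crediting TDs and heavily penalizing INTs."""
    return (pass_yards + 20 * pass_tds - 45 * interceptions) / attempts

# Hypothetical stat lines: identical raw yardage, very different AY/A once
# touchdowns and interceptions are weighted in.
qb_a = adjusted_yards_per_attempt(4000, 30, 10, 550)
qb_b = adjusted_yards_per_attempt(4000, 15, 20, 550)
print(round(qb_a, 2))  # 7.55
print(round(qb_b, 2))  # 6.18
```

The point of a stat like this is exactly the principal-components argument: one number that captures most of what matters about a passer, rather than twenty scouting adjectives.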
Let’s go back to the wine article:
Statistician and wine-lover Robert Hodgson recently analyzed a series of wine competitions in California, after “wondering how wines, such as his own, [could] win a gold medal at one competition, and ‘end up in the pooper’ at others.” In one study, Hodgson presented blindfolded wine experts with the same wine three times in succession. Incredibly, the judges’ ratings typically varied by ±4 points on a standard ratings scale running from 80 to 100. Via the Wall Street Journal:
A wine rated 91 on one tasting would often be rated an 87 or 95 on the next. Some of the judges did much worse, and only about one in 10 regularly rated the same wine within a range of ±2 points.
Mr. Hodgson also found that the judges whose ratings were most consistent in any given year landed in the middle of the pack in other years, suggesting that their consistent performance that year had simply been due to chance.
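Hodgson’s result about the “most consistent” judge falling back to the middle of the pack is exactly what you’d expect if consistency were luck. Here is a toy simulation under assumed numbers (every judge has the same noise level; the 90-point true score and 3-point noise standard deviation are my own illustrative choices, not Hodgson’s data):

```python
# Toy model of Hodgson's repeated-pour experiment: each rating of the SAME
# wine is a fixed "true" score plus independent judge noise.
import random

random.seed(42)  # deterministic for reproducibility

def judge_spread(noise_sd=3.0, pours=3):
    """Max minus min of a judge's repeated ratings of one identical wine."""
    ratings = [90 + random.gauss(0, noise_sd) for _ in range(pours)]
    return max(ratings) - min(ratings)

# 20 judges, all with the SAME underlying noise level -- any difference in
# observed consistency is pure chance.
spreads_year1 = [judge_spread() for _ in range(20)]
spreads_year2 = [judge_spread() for _ in range(20)]

# The most consistent judge in year 1...
best_year1 = min(range(20), key=lambda j: spreads_year1[j])
# ...and where that same judge ranks in year 2 (0 = most consistent).
rank_year2 = sorted(range(20), key=lambda j: spreads_year2[j]).index(best_year1)
print(best_year1, rank_year2)
```

Because every simulated judge is identical by construction, last year’s winner has no edge this year, which is the pattern Hodgson reported.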
Several articles have come out recently on the amount of skill associated with an NFL GM position, and they’ve basically said that what we perceive as skill might just be random variance. One criticism I would have of those studies is that they don’t address the potential that rival teams are co-opting the strategies of successful teams and thus flattening any advantages that a team with a draft edge might have. But I would be naturally inclined to believe that things are more random than we often assume. Wine critics can’t reproduce the same ratings on the same wine. I wonder whether football evaluators would be any good at reproducing consistency in their rankings. I kind of suspect that they wouldn’t. One argument I might offer to advance this theory is that football evaluators, for all of their skepticism of stats, often just follow the stats. A player who has a breakout college season (statistically, usually) flies up draft boards, even though it’s the watching of football games that is supposed to discover whether players have the skill sets that will make them good pros. So what happens to that evaluation between a lackluster junior season and a breakout senior season? Either the player developed new skill sets, or they were there all along and simply went undiscovered. To look at the problem in reverse, isn’t Matt Barkley at least as good a QB now as he was when he was regarded as the top prospect a year ago? He has another season under his belt, so he’s more experienced. But then why did the evaluations change? Did he lose abilities? Were they robbed of him like kryptonite robs Superman of his powers? Remember, you’re supposed to be able to know whether a player is good by watching games; any number of evaluators watched Barkley last year and decided he was good, then watched him this year and decided he wasn’t.
We can’t reproduce the wine experiment with football, but I think I can make the case that the idea that a player has some inherent football ability (objective in nature) that can be verified with film study is problematic at best.
The wine world’s Bill James was an economics professor named Orley Ashenfelter, who showed that he could better project wine prices using a regression than wine experts could by tasting young wines. Ashenfelter was met with outright hostility in the wine world from those who were offended that he would try to make objective something they thought was subjective and required a level of expert knowledge. Sound familiar?
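The mechanics of Ashenfelter’s approach are just ordinary least squares. His actual model regressed (log) auction price on weather and age variables; the sketch below uses a single made-up predictor (growing-season temperature) and synthetic prices, purely to show the shape of the method:

```python
# Simple least-squares fit, the core of a regression-over-tasting approach.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Synthetic "vintages": warmer growing seasons -> higher log price.
temps = [15.0, 16.0, 16.5, 17.0, 17.5, 18.0]
log_prices = [2.0, 2.4, 2.6, 2.8, 3.0, 3.2]
slope, intercept = fit_line(temps, log_prices)
print(round(slope, 2), round(intercept, 2))  # 0.4 -4.0
```

Once the coefficients are fit on past vintages, projecting a new vintage is just plugging its weather into the line, with no tasting, and no dried hibiscus flowers, required.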