This isn’t anything new, unfortunately. We constantly see people making audacious claims and backing them up with statistical analyses that are founded on fairly pedestrian statistical biases. Many of these biases are nuanced (think heteroscedasticity, not selection bias), which is precisely why research requires so much thought and care. A fundamental part of research is checking and rechecking your assumptions to ensure that they’re all well-founded.
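To make one of these “pedestrian” biases concrete, here is a toy simulation (my own illustration, not from any paper discussed here) of selection bias in a sports setting: if we only record players who cleared some performance cutoff, the average skill of the recorded sample overstates the population average, even though every individual data point is accurate. The variable names and the cutoff rule are hypothetical.

```python
import random

random.seed(42)

# A population of players, each with a "true skill" drawn from the
# same distribution (mean 0).
population = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# Hypothetical observation rule: a player enters our dataset only if a
# noisy measurement of their skill clears a cutoff (e.g., they made a
# highlight reel). This is where the selection happens.
cutoff = 0.5
observed = [s for s in population if s + random.gauss(0.0, 0.5) > cutoff]

pop_mean = sum(population) / len(population)
obs_mean = sum(observed) / len(observed)

print(f"population mean: {pop_mean:.2f}")  # near zero
print(f"observed mean:   {obs_mean:.2f}")  # substantially above zero
```

Nothing about the recorded numbers is wrong; the bias comes entirely from which rows made it into the dataset, which is exactly the kind of assumption that needs checking before any downstream analysis.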
Why is this such a problem in sports research? It’s not unlike the problem faced in climate science (see this paper for more): the output is completely tangible and, in terms of the raw data, easy to understand. Everyone can see someone dunk a basketball, though they do not necessarily see the pick set to make it possible. Everyone in NYC knows that this has been a mild winter, but that says nothing about the average temperature across the planet. Because the average reader can quickly grasp these basic data points, they believe that they are free to analyze them and their more advanced “relatives.” But reading too much into even the most basic statistics can be damaging, which is why academics struggle to explore a topic with deep thought and a complete understanding of all the previous literature.
What’s worse is that the problem isn’t getting better. For example, the fans’ choice for best research paper (what does that even mean!?) at the upcoming Sloan Sports Analytics Conference (SSAC) will be chosen by SportsNation. I’ll save my SSAC rant for next week, but if we keep thinking along the lines of an idiocracy, ESPN might move to replace our peer-review system with a poll of whose paper’s impact is the most awesome-est.