I’m midway through Leonard Mlodinow’s book “The Drunkard’s Walk”, which I highly recommend, and I stumbled across an interesting passage about sports that made me think about all the data that we graduate students depend on so heavily.
The book details the prominent role of randomness in our lives, from the ‘luck’ of winning the lottery to the ‘skill’ required to ace an exam. The passage here describes how the laws of probability dictate that inferior sports teams will often win championships:
“…in a 7-game series there is a sizable chance that the inferior team will be crowned champion. For instance, if one team is good enough to warrant beating another in 55 percent of its games, the weaker team will nevertheless win a 7-game series about 4 times out of 10. And if the superior team could be expected to beat its opponent, on average, 2 out of each 3 times they meet, the inferior team will still win a 7-game series about once every 5 matchups. There is really no way for sports leagues to change this. In the lopsided 2/3-probability case, for example, you’d have to play a series consisting of at minimum the best of 23 games to determine the winner with what is called statistical significance, meaning the weaker team would be crowned champion 5 percent or less of the time (see chapter 5). And in the case of one team’s having only a 55-45 edge, the shortest statistically significant “world series” would be the best of 269 games, a tedious endeavor indeed! So sports playoff series can be fun and exciting, but being crowned “world champion” is not a very reliable indication that a team is actually the best one.”
Wow, if sports, which seem so definitive in their outcomes, are this ambiguous, what are we to think about research? This effect of randomness makes any researcher nervous about drawing definitive conclusions. It’s part of the business of science though, and that’s why we carefully design our experiments and use statistics to analyze our data. Statistics are our way of trying to tell when observed differences are real and not just the product of random variation.
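For anyone who wants to check the numbers in the passage, here is a minimal sketch in Python. It assumes every game is independent and that the stronger team has the same win probability in each one, which is my reading of the book's setup rather than something the passage spells out. The function names are my own, made up for illustration. Under those assumptions the weaker team wins a 7-game series with probability of about 0.39 at a 55-45 edge and about 0.17 at a 2/3 edge, in line with the "about 4 times out of 10" and "about once every 5" in the quote.

```python
from math import comb


def weaker_team_series_win(p_strong, n_games):
    """Chance the weaker team wins a best-of-n series (n odd).

    Assumes independent games with a fixed per-game probability p_strong
    that the stronger team wins. Playing all n games gives the same answer
    as stopping once a team clinches, so a binomial tail is enough.
    """
    q = 1.0 - p_strong              # weaker team's per-game win probability
    needed = n_games // 2 + 1       # wins required to take the series
    return sum(
        comb(n_games, k) * q**k * p_strong ** (n_games - k)
        for k in range(needed, n_games + 1)
    )


def shortest_significant_series(p_strong, alpha=0.05):
    """Shortest odd series length where the weaker team wins at most alpha
    of the time. Hypothetical helper; the 5 percent cutoff follows the quote."""
    if p_strong <= 0.5:
        raise ValueError("p_strong must be greater than 0.5")
    n = 1
    while weaker_team_series_win(p_strong, n) > alpha:
        n += 2                      # series lengths stay odd: 1, 3, 5, ...
    return n


if __name__ == "__main__":
    for p in (0.55, 2 / 3):
        print(f"p = {p:.3f}")
        print("  weaker team wins best-of-7:", round(weaker_team_series_win(p, 7), 4))
        print("  shortest 'significant' series:", shortest_significant_series(p))
```

Running it also prints the series length needed for each edge, which can be compared with the 23 and 269 games the book gives.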
Random side note: the link to the Amazon listing of that book above reveals that a new paperback copy of the book qualifying for free shipping is cheaper than the electronic Kindle edition. So many issues to discuss about this in a future post… What are the advantages of having an e-book over a traditional one? What does this say about how we value trees? How are the monetary and environmental costs of transporting/delivering the book covered? Who gets all the extra money from the electronic version, which would seem to have much lower publishing overhead? Will e-textbooks ever catch on?