One of the most important lessons of economics is that correlation does not imply causation. Just because we see a pattern in the data, or observe one event following another, does not mean that one event CAUSED the other. To establish (albeit partially) a causal relationship, you need to observe the two variables while controlling for the many other factors that can affect both. This is why econometrics is used in the social sciences, where it is rarely possible to prove a hypothesis cleanly.
Keeping this in mind, let's have a look at this week's graph of the week, which depicts the alleged relationship between wine consumption and student achievement. The graph was originally produced by a Cambridge graduate and was used by the Economist:
Source: The Economist
The graph seems to point to an interesting positive correlation between how much a Cambridge or Oxford college spends on wine and the percentage of its students gaining a first-class degree. This could lead us to conclude that students perform better in colleges where they drink more.
That would be a false conclusion.
There could be any number of drivers behind this relationship. Drawing such a conclusion from a simple correlation means falling for what econometricians call omitted variable bias. Omitted variable bias arises when the estimated relationship between the two variables we observe (the dependent variable, Y, and the explanatory variable, X) is distorted by unobserved variables that were left out of the analysis and that affect both. For example, one can find that smaller class sizes lead to better grades. But perhaps performance is driven by other socio-economic characteristics. Schools with smaller classes may be located in wealthier neighborhoods or may have better teachers. And while controlling for teaching quality is hard, controlling for neighborhood income is rather easy.
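The class-size example can be made concrete with a small simulation. This is only a sketch on invented data: the variable names, the coefficients (a true class-size effect of -0.2 and an income effect of 4.0) and the helper `ols` are all assumptions made for illustration, not estimates from any real study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process; every number here is invented.
income = rng.normal(0.0, 1.0, n)                             # neighborhood income, standardized
class_size = 25.0 - 3.0 * income + rng.normal(0.0, 2.0, n)   # wealthier areas have smaller classes
grades = 70.0 - 0.2 * class_size + 4.0 * income + rng.normal(0.0, 3.0, n)

def ols(y, *regressors):
    """Least-squares coefficients; element 0 is the intercept."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

short = ols(grades, class_size)           # income omitted
long = ols(grades, class_size, income)    # income controlled for
delta = ols(income, class_size)[1]        # how the omitted variable moves with class size

print("class-size effect, income omitted: ", round(short[1], 2))
print("class-size effect, income included:", round(long[1], 2))
# The gap between the two is exactly (income effect) x (delta).
print("bias from omitting income:         ", round(long[2] * delta, 2))
```

With income left out, the class-size coefficient absorbs part of the income effect and looks far stronger than the -0.2 built into the data. Once income is included, the estimate moves back close to the true value.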
Consider again the above correlation. First of all, such a simple graph doesn't take into account the number of students at a particular college (i.e. the size of the college), nor does it check whether the relationship holds when per-capita rather than total expenditure on wine is compared with performance, as many commentators noticed under the Economist's original post. It also fails to control for the wealth of a college or for gender (assume that male-dominated colleges drink more than female-dominated ones). I would guess that gender plays a big role, as does college size, of course. So a good empirical analysis of this potentially interesting issue would include all those controls and run several robustness checks on how both the main explanatory variable and the dependent variable are measured (try total alcohol consumption, for example) before concluding that there is any relationship between wine consumption and academic performance.
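To illustrate what such an analysis could look like, here is a minimal sketch on simulated colleges. Everything in it is hypothetical: the data are made up, wine is given no effect on results by construction, college wealth drives both wine spending and performance, and the number of colleges is deliberately unrealistic so that the estimates are stable. Gender is left out only to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
n_colleges = 5000  # deliberately unrealistic, so the estimates are stable

# Invented data: wealth raises both wine spending and results; wine itself does nothing.
wealth = rng.normal(0.0, 1.0, n_colleges)
students = rng.integers(200, 800, n_colleges).astype(float)
wine_per_student = 100.0 + 30.0 * wealth + rng.normal(0.0, 20.0, n_colleges)
wine_total = wine_per_student * students
firsts_pct = 20.0 + 3.0 * wealth + rng.normal(0.0, 2.0, n_colleges)

def standardize(x):
    """Rescale to mean 0 and standard deviation 1 so slopes are comparable."""
    return (x - x.mean()) / x.std()

def wine_slope(wine, *controls):
    """Coefficient on (standardized) wine spending in a regression of firsts_pct."""
    X = np.column_stack([np.ones(n_colleges), standardize(wine), *controls])
    coef, *_ = np.linalg.lstsq(X, firsts_pct, rcond=None)
    return coef[1]

naive_total = wine_slope(wine_total)
naive_per_student = wine_slope(wine_per_student)
controlled_total = wine_slope(wine_total, wealth, students)
controlled_per_student = wine_slope(wine_per_student, wealth, students)

print("total spend, no controls:       ", round(naive_total, 2))
print("per-student spend, no controls: ", round(naive_per_student, 2))
print("total spend, with controls:     ", round(controlled_total, 2))
print("per-student spend, with controls:", round(controlled_per_student, 2))
```

Both uncontrolled regressions show a clearly positive "effect" of wine, while both controlled ones put it near zero, whichever way spending is measured. Real data would of course not come with a known answer, which is exactly why the controls and robustness checks matter.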
Being an empiricist is an exhausting job. It's not just crunching numbers.