Wednesday, 22 November 2017

Making causal inferences in economics: Do better grades lead to higher salaries?

In a previous post I discussed the changing nature of the economics profession and the importance of achieving the experimental ideal in social science research. I briefly discussed the logic, and some of the methodological approaches, useful in achieving randomization, or at least as-if randomization, so that our treatment and control groups are as similar as possible for comparison. In this post I'll use an example that I like to teach to students to illustrate how we can make causal inferences using natural experiment research designs. A quick reminder: natural experiments are not experiments per se. They only give us a good way to exploit observational data to emulate an experimental setting.

Let's use a very basic example and look at the relationship between student grades and earnings, a topic that is usually heatedly discussed among students: do better grades result in higher salaries?

Consider the following correlation between grades and earnings shown in the figure below. It uses US data on wages of men and women by high school grade point average (GPA). The higher the GPA, the larger the average salary for both men and women, although with a huge wage gap between them at every GPA level (notice how an average woman with a maximum 4.0 GPA in high school still earns less than an average man with a 2.5 GPA: shocking!). The overall pattern is clear, and it strongly suggests that higher grades cause higher salaries. But do they really? Or is there some other factor that might affect both grades in high school and salaries later in life? Ability, for example. More competent individuals tend to get better grades, and they tend to have higher salaries. In that case it wasn't the grades in high school that caused their salaries to be higher, it was their intrinsic ability. This is what we call omitted variable bias: an issue that arises when you try to explain cause and effect without taking into consideration all the potential factors that could have affected the outcome (mostly because they were unobservable).

Source: Washington Post
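
To make the omitted variable problem concrete, here is a minimal simulation sketch. Everything in it is made up for illustration (the coefficients, the variable names, the small `ols` helper); it is not the Washington Post data. By construction ability drives both grades and wages, while grades have no effect on wages at all.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (all numbers are assumptions):
# ability raises both GPA and wages; GPA itself has NO effect on wages.
ability = rng.normal(size=n)
gpa = 3.0 + 0.5 * ability + rng.normal(scale=0.3, size=n)
log_wage = 10.0 + 0.4 * ability + rng.normal(scale=0.5, size=n)


def ols(y, *regressors):
    """Least-squares coefficients of y on an intercept plus the regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Regress wages on GPA alone: ability is the omitted variable.
naive_slope = ols(log_wage, gpa)[1]

# Regress wages on GPA and ability: only possible here because the
# simulation lets us observe ability, which real data usually does not.
adjusted_slope = ols(log_wage, gpa, ability)[1]

print(f"slope on GPA, ability omitted:  {naive_slope:.3f}")
print(f"slope on GPA, ability included: {adjusted_slope:.3f}")
```

The first slope comes out clearly positive even though grades do nothing in this toy world, and it drops to roughly zero once ability is included. The catch, of course, is that outside a simulation we rarely get to observe ability.
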
The best way to determine whether there is in fact a causal relationship between the two would be to run the following hypothetical experiment (assuming we possess divine powers). I take a group of students and give them a distinction grade, and then I repeat history and give the same group a non-distinction grade (a sort of parallel universe). In each case we observe the final outcome, and we compare the difference in salaries between the first scenario and the second one. What we just did is construct a counterfactual: what would the outcome be if history had played out differently? What would be the outcome if Neo took the blue pill instead of the red one (Matrix reference)? What would my salary be if I did not go to college? Would the US have attacked Iraq if Gore had won the 2000 election? We cannot really answer any of these questions.

However, we can answer the question of whether better grades will get me a higher salary, even though we never observe the same student with both a high grade and a low grade.

What would be a better, realistic and metaphysically possible way to examine this relationship? We can take comparable students. Twins, if we’re lucky. Genetically identical individuals, with the same upbringing, same income, etc. We give one a distinction grade and the other one a non-distinction, and see how they end up in life. Even though this would satisfy our experimental evaluation problem, there is a clear ethical problem here as we cannot interfere with people's lives just for the sake of proving a point. 

What’s next? We can use the existing data on school performance and the later salaries of a school's students. Remember, in order to make a good inference we need to have comparable groups of students: students who are very similar to one another in their ability, except that one group got a distinction and the other just barely did not.

We can do this in two ways. One would be to simply match the students into comparable groups based on all of their previously observed characteristics: gender, parental income, parental education, previous school performance, whatever we can think of and measure. This is called a matching strategy, and it requires us to be able to measure all the characteristics that could affect student performance later in life. If we manage that, the only difference between our two groups will be their grades, and we can compare the outcomes of two very similar and thus comparable groups to see if better grades resulted in higher salaries.

However, there is again the problem of measuring something like innate ability. If you cannot measure it, then even the results from an otherwise perfect matching exercise could still be biased.
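
A small sketch of why matching on observables is not enough. The data are simulated and every number and name is an assumption of mine: parental income is observed (three brackets), ability is not, and a distinction has no effect on earnings by construction. Matching here is the simplest possible kind: compare students with and without a distinction inside each income bracket, then average the gaps.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Hypothetical setup: income bracket is observed, ability is NOT.
income = rng.integers(0, 3, size=n)   # parental income bracket: 0, 1 or 2
ability = rng.normal(size=n)          # unobserved by the researcher
distinction = (0.8 * income + ability + rng.normal(size=n)) > 1.0

# Earnings depend on income and ability only; the distinction adds nothing.
earnings = 20 + 3 * income + 5 * ability + rng.normal(scale=2, size=n)

# Raw comparison of students with and without a distinction.
naive_gap = earnings[distinction].mean() - earnings[~distinction].mean()

# Exact matching on the observed covariate: compare within each bracket,
# then average the gaps, weighting by the number of distinction students.
gaps, weights = [], []
for bracket in np.unique(income):
    in_bracket = income == bracket
    with_dist = earnings[in_bracket & distinction]
    without_dist = earnings[in_bracket & ~distinction]
    gaps.append(with_dist.mean() - without_dist.mean())
    weights.append(len(with_dist))
matched_gap = np.average(gaps, weights=weights)

print(f"naive gap:   {naive_gap:.2f}")
print(f"matched gap: {matched_gap:.2f}  (true effect is 0)")
```

Matching on income shrinks the gap, because it removes the part of the bias that runs through income, but a sizeable gap remains. That leftover is entirely due to ability, which we could not match on.
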

In order to rid ourselves of any unobservable variable that can mess up our estimates, we need to impose randomization on our two groups, that is, ensure that there is random assignment into treated and control units. Why? Because randomization implies statistical independence between who gets treated and everything else about them. In other words, when we randomly pick who will be in the treatment group and who will be in the control group, we make sure that the two groups are, on average, statistically indistinguishable from one another. Any difference in outcomes between the two groups should then be a result of the treatment (in this case better grades).
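
Here is a rough sketch of what randomization buys us, again on simulated data with made-up numbers. I assume a hypothetical treatment effect of 2.0 purely so there is something to recover; ability is unobserved, but a coin flip decides who is treated.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
true_effect = 2.0  # assumed for illustration only

ability = rng.normal(size=n)      # unobserved by the researcher
treated = rng.random(n) < 0.5     # coin-flip assignment, unrelated to ability

# Earnings depend on ability and on the randomly assigned treatment.
earnings = 20 + 5 * ability + true_effect * treated + rng.normal(scale=2, size=n)

# Balance check: the two groups should look alike in ability on average.
ability_gap = ability[treated].mean() - ability[~treated].mean()

# A simple difference in mean earnings estimates the treatment effect.
estimated_effect = earnings[treated].mean() - earnings[~treated].mean()

print(f"ability gap between groups: {ability_gap:.3f}")
print(f"estimated effect: {estimated_effect:.2f} (true: {true_effect})")
```

Because assignment is independent of ability, the groups are balanced on it without us ever measuring it, and the plain difference in means lands close to the effect we built in.
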

But what if we cannot randomly assign students? In that case we use a neat trick to make sure that we get an as-if (as good as) randomized sample. We utilize the threshold for getting a distinction grade (you need 70 to get a distinction in the UK) and we compare students just above and just below this threshold. If you get 70 you get a first in your degree. If you get just marginally below, 69, you get a second. The idea is that students within this very small margin, say between 68 and 72, are not really all that different from one another. In other words, they are practically interchangeable: a person scoring 69 is just as good as someone getting a 70, but he or she was just unlucky.

So how do we get to our conclusion? Consider the following artificially created picture. We observe only a narrow group of students around the threshold, in particular those with grades between 69 and 71. We assume that all students within this group are comparable in their inner ability, which is how we control for all those unobservables we cannot measure. Then we compare the average earnings of those just above the 70 threshold (the treatment group, everyone from 70 to 71) to those just below the threshold (the control group, everyone who got 69). If there is a large enough discontinuous jump at the 70 threshold, where those awarded a distinction have statistically significantly higher earnings than those who just barely failed to make it to the distinction grade, then we can conclude that better grades cause higher salaries. If not, if there is no jump and the relationship remains linear, then we cannot make this inference.

This graph does not show the real relationship between grades and earnings. I generated it artificially to illustrate the point. When such a discontinuity does exist between the control and the treatment group, we can conclude that there is a causal effect of grades on earnings because we are comparing statistically similar individuals within a very narrow interval around the threshold. In this made-up example a person who gets a distinction would have about 30% higher earnings than a person who just barely failed to get one.
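
The threshold comparison can be sketched in a few lines. This is simulated data, not the real grades and earnings: I generate one earnings series with a made-up 30% premium at the 70 mark and one without any premium, and compare mean earnings in the one-point window on either side of the threshold. The function name, the window width and all the numbers are my own assumptions.

```python
import numpy as np


def jump_at_threshold(score, earnings, cutoff=70.0, bandwidth=1.0):
    """Gap in mean earnings just above vs. just below the cutoff,
    together with a Welch t-statistic for that gap."""
    above = earnings[(score >= cutoff) & (score < cutoff + bandwidth)]
    below = earnings[(score < cutoff) & (score >= cutoff - bandwidth)]
    diff = above.mean() - below.mean()
    se = np.sqrt(above.var(ddof=1) / len(above) + below.var(ddof=1) / len(below))
    return diff, diff / se


rng = np.random.default_rng(3)
n = 40_000
score = rng.uniform(50, 90, size=n)     # final marks, hypothetical
baseline = 20 + 0.2 * (score - 50)      # earnings rise smoothly with marks
noise = rng.normal(scale=3, size=n)

# Scenario A: no effect of the distinction, earnings stay on the smooth line.
earnings_flat = baseline + noise
# Scenario B: a made-up 30% earnings premium for scoring 70 or more.
earnings_jump = baseline * (1 + 0.3 * (score >= 70)) + noise

diff_flat, t_flat = jump_at_threshold(score, earnings_flat)
diff_jump, t_jump = jump_at_threshold(score, earnings_jump)

print(f"no-jump scenario:  gap = {diff_flat:.2f}")
print(f"30% jump scenario: gap = {diff_jump:.2f}, t = {t_jump:.1f}")
```

One thing to keep in mind: since earnings rise smoothly with marks, even the no-jump series shows a small positive gap, roughly the slope times the distance between the two windows. That is one reason to keep the window narrow. Fitting a separate line on each side of the threshold is a common refinement, but it goes beyond the simple comparison of averages described here.
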

However, the actual data shows no such jump. The relationship really is linear (as suggested by the first graph). How do we interpret this? We simply say that the same things that make students perform well in school (like ability) make them get higher salaries later in life. There is therefore no direct causal impact of grades on earnings, but it does suggest that the same thing that's driving you to perform well in school will be driving you to perform well later in life. Encouraging, isn't it?

Finally, the point of this exercise was not to infer causality between grades and earnings, but to emphasize how one should think about conducting natural experiments in the social sciences. Thinking about issues in this way requires neither much technical skill nor a particularly profound methodological breakthrough. It simply requires a change in the paradigm of drawing explicit conclusions from correlations and trends, something that economists in particular love to do.
