Difference in differences

About the “What researchers mean by...” series

This research term explanation first appeared in a regular column called “What researchers mean by…” that ran in the Institute for Work & Health’s newsletter At Work for over 10 years (2005-2017). The column covered over 35 common research terms used in the health and social sciences. The complete collection of defined terms is available online or in a guide that can be downloaded from the website.

Published: August 2016

Experimental studies are typically designed so that researchers can learn about the impact of an intervention (a drug, a therapy or a program). They do this by comparing outcomes in the group that received the intervention (the intervention group) with outcomes in the group that did not (the control group).

But what if the people in the two groups differ in important ways to begin with? That’s when researchers use a method of analysis called difference in differences to identify the effect of the intervention.

In controlled settings such as a randomized controlled trial, study participants are randomly assigned to either the intervention group or the control group. Random assignment helps ensure the groups start out broadly similar, so that changes in the intervention group can be more confidently attributed to the intervention. In natural experiments (or observational studies), researchers don’t have this ability to randomly assign participants.

That’s because, in natural experiments, the interventions happen naturally, as the name suggests. For example, a school board policy requiring all students to be vaccinated, a provincial policy of cutting a cheque to every resident so that no one lives below a certain income, or a town council decision to make helmets mandatory for all cyclists would all give rise to natural experiments.

When such policies or programs are offered in one school board, one province or one town but not in others, they give researchers a valuable opportunity to study the impact of the intervention. But in natural experiments such as these, participants may start out with important differences: the people in the school board, province or town subject to the policy or program may already differ in some meaningful way from those with whom they are being compared. To overcome this, researchers don’t compare one group’s outcomes directly to the other’s. Instead, they measure how much each group changes over a period of time with respect to a certain outcome, and then compare the size of those changes between the two groups.
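In generic notation (a standard way of writing the idea, not taken from the column itself), the difference-in-differences estimate subtracts the control group’s before-to-after change from the intervention group’s:

```latex
\mathrm{DiD} =
\left(\bar{Y}^{\,\text{after}}_{\text{intervention}} - \bar{Y}^{\,\text{before}}_{\text{intervention}}\right)
- \left(\bar{Y}^{\,\text{after}}_{\text{control}} - \bar{Y}^{\,\text{before}}_{\text{control}}\right)
```

Here, each term is the average outcome for a group at a point in time. If the control group’s change captures what would have happened anyway, whatever is left over is attributed to the intervention.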

An example of difference in differences

Let’s take the helmet bylaw as an example. If you as a researcher want to look at the effect of that bylaw—introduced by Town A, let’s say—you might hypothesize that it reduces head injuries. As a result, you take a close look at stats from emergency rooms to see whether head injuries from cycling accidents have gone down. For a control group, you look at similar stats in a neighbouring town of the same size—Town B—where a mandatory helmet bylaw does not exist.

But you know there may be prior differences between Town A and Town B. They may differ in road and traffic conditions, or in how willingly people wear helmets when cycling, whether required by law or not. As a result, you don’t simply look at the two towns’ post-intervention stats (the number of head injuries one year after the bylaw took effect, for example) and draw a conclusion based on those two numbers. Rather, you also look at head injury stats prior to the bylaw in both towns. If head injury stats go down by 25 per cent in Town A but only by 15 per cent in Town B, you attribute that 10-percentage-point difference to the effect of the bylaw.
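To make the arithmetic concrete, here is a minimal sketch in Python. The injury counts are invented, chosen only so they reproduce the 25 and 15 per cent declines in the example:

```python
# Hypothetical head-injury counts, before and after the bylaw.
town_a_before = 100  # Town A (bylaw), year before it took effect
town_a_after = 75    # year after: a 25 per cent decline
town_b_before = 100  # Town B (no bylaw), same years
town_b_after = 85    # a 15 per cent decline

# Percentage change in each town over the same period.
change_a = (town_a_after - town_a_before) / town_a_before * 100  # -25.0
change_b = (town_b_after - town_b_before) / town_b_before * 100  # -15.0

# Difference in differences: the extra decline in Town A beyond
# the decline Town B saw without any bylaw.
did = change_a - change_b  # -10.0

print(f"Town A change: {change_a:.0f}%")
print(f"Town B change: {change_b:.0f}%")
print(f"Estimated bylaw effect: {did:.0f} percentage points")
```

Note that the result is expressed in percentage points: the gap between a 25 per cent decline and a 15 per cent decline.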

This approach has some limitations. One is the possibility that you might be seeing regression to the mean. That would be a concern if pre-bylaw injury stats in Town A were extreme or exceptional to begin with. If so, those rates would likely have drifted back down toward a more typical average on their own, meaning some of the decline would have happened even without the bylaw.

Another caveat is that the method assumes injury trends in the two towns would have followed the same path if not for the intervention (researchers call this the parallel trends assumption). Even if you gathered data at multiple points in time to confirm that the trends were similar leading up to the new bylaw, you have to stay alert to the possibility that something else changed one town’s trend during the period of your study.
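One informal way to probe that assumption is to compare each town’s trend over several pre-bylaw years. The sketch below, again with invented numbers, fits a straight line to each town’s yearly injury counts and compares the slopes:

```python
def slope(years, counts):
    """Least-squares slope of counts over years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    den = sum((x - mean_x) ** 2 for x in years)
    return num / den

years = [2010, 2011, 2012, 2013]   # years before the bylaw
town_a = [112, 108, 104, 100]      # hypothetical Town A injury counts
town_b = [64, 60, 56, 52]          # hypothetical Town B injury counts

print(f"Town A pre-bylaw slope: {slope(years, town_a):.1f} injuries/year")
print(f"Town B pre-bylaw slope: {slope(years, town_b):.1f} injuries/year")
# Similar slopes lend some support to the parallel trends assumption;
# clearly diverging slopes would be a warning sign.
```

This is only an eyeball check: even identical pre-bylaw trends can’t guarantee the towns would have stayed parallel after the bylaw, which is exactly the risk described above.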

Source: At Work, Issue 85, Summer 2016: Institute for Work & Health, Toronto