What researchers mean by... internal validity

In one current systematic review at the Institute for Work & Health, we are trying to answer the question: Do workplace interventions prevent injuries in the upper body? Once the reviewers have identified all of the relevant studies on this topic, they will judge the quality of each study.

A key aspect of quality is the internal validity of a study. Internal validity is, in essence, the degree to which a study's findings result from the intervention being studied, rather than from chance or some other factor. You could also say that internal validity is how well the study was set up and executed to prevent systematic errors or bias (see previous column on bias in fall 2007 At Work).

Let's take a fictional example to see how this plays out. Suppose researchers wanted to study the effectiveness of an ergonomic program that included staff training. The program was targeted at garment workers, who often experience wrist pain. In the study, the workers in one factory completed a test of their knowledge of postures to prevent wrist pain. Then an ergonomic program and training were introduced. Six months later, fewer workers reported pain symptoms and, when tested again, their scores had improved.

At face value, this sounds like a promising program. But in reality, something else could have caused these changes. A study with strong internal validity would be set up in a way that ruled out other explanations.

The review team uses a detailed list of questions to ensure the researchers have considered these other causes and minimized bias. Here are some things the reviewers would be looking at:

  • Did the researchers use a control group of workers who didn't participate in the program? A control group provides a way for researchers to see if the program led to the changes, as they can check whether any changes occurred in the control group.
  • What else was happening in the workplace that might explain the results? For instance, suppose a staff ergonomist was hired after the program began. This might account for the improvements and would need to be considered.
  • Was it possible that workers, over time, became more knowledgeable about preventing injuries on their own?
  • Did completing the first knowledge test affect results the second time around?
  • Were the workers given the same test, in the same way, both times?
  • Who dropped out of the study before it ended? Maybe some workers withdrew because their pain symptoms weren't getting better. Any improvement in pain among the workers remaining in the study wouldn't tell the whole story. The researchers need to look at the reasons people dropped out to see if this is an issue.
  • How were workers chosen to participate in the study? The researchers need to report on how they selected the groups, and the differences between groups. If the workers who did the program volunteered, they may be more highly motivated, which could affect the findings.
  • What was the average rate of reported pain before the program? Suppose the factory's management agreed to the program because in the previous year, reports of pain and work absences increased dramatically, far above the yearly average. However, these rates may fluctuate naturally from year to year, so the improvement may simply mean the rate is returning to the average (a pattern known as regression to the mean).
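That last point, regression to the mean, can be made concrete with a small simulation. The sketch below uses made-up numbers: yearly pain-report rates that fluctuate randomly around a stable average, with no intervention at all. If a program is introduced right after the worst year on record, the next year's rate will usually look "better" simply because extreme years tend to be followed by more typical ones.

```python
import random

random.seed(1)
TRUE_AVERAGE = 20.0  # hypothetical average: pain reports per 100 workers

def simulate():
    # Draw 10 years of rates that vary randomly around the true average.
    years = [TRUE_AVERAGE + random.gauss(0, 4) for _ in range(10)]
    # Find the worst year among the first nine (so a "next year" exists).
    peak = max(range(9), key=lambda i: years[i])
    return years[peak], years[peak + 1]

trials = 10_000
drops = sum(1 for _ in range(trials) if (lambda p, n: n < p)(*simulate()))

# In the vast majority of trials the year after the peak is lower,
# even though nothing about the workplace changed.
print(f"Rate fell after the worst year in {100 * drops / trials:.0f}% of trials")
```

This is why reviewers ask about the average rate before the program: a drop from an unusually bad year, on its own, is weak evidence that the program worked.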

Internal validity is also influenced by the way that people naturally interact. For instance, if workers in the control group found out about the program, they might try to do something similar themselves. Or, management may decide that having a control group is creating too many problems among employees, and may allow these workers to access the program or create a new one for them.

All of these scenarios show how difficult it can be to do research in workplaces. They also show how important it is to have a well-designed study when you're trying to find out if a program really works.

Overall, the higher the internal validity, the better the quality of the study, and the more confident we can be that the results are due to the program and not to something else.

Source: At Work, Issue 51, Winter 2008: Institute for Work & Health, Toronto