Association versus causation

Steve Simon


Dear Professor Mean, Everyone says that smoking causes cancer, but we can’t really say that, can we? There is an association between smoking and cancer, but we know that association does not imply causation, don’t we?

Dear Reader, You point out that an association is not always causation. This is indeed correct, but sometimes association IS causation. For example, standing under trees during a storm is associated with getting struck by lightning. If you think that association can never imply causation, I invite you to stand under a tall tree the next time there is a storm.

No, no, come back here. I didn’t mean it. You do pose an interesting question: when does association imply causation?

Short explanation

A good review of when we can make the leap from association to causation is a 1965 article by Sir Austin Bradford Hill, “The Environment and Disease: Association or Causation?” He mentions nine factors:

  1. Strength (is the risk so large that we can easily rule out other factors)

  2. Consistency (have the results been replicated by different researchers and under different conditions)

  3. Specificity (is the exposure associated with a very specific disease as opposed to a wide range of diseases)

  4. Temporality (did the exposure precede the disease)

  5. Biological gradient (are increasing exposures associated with increasing risks of disease)

  6. Plausibility (is there a credible scientific mechanism that can explain the association)

  7. Coherence (is the association consistent with the natural history of the disease)

  8. Experimental evidence (does a physical intervention show results consistent with the association)

  9. Analogy (is there a similar result that we can draw a relationship to)
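
To make the first and fifth factors (strength and biological gradient) a little more concrete, here is a minimal Python sketch of one common way to measure the strength of an association: the relative risk, which is the risk of disease among the exposed divided by the risk among the unexposed. The function name `relative_risk` and all of the counts are assumptions made up for illustration; they are not data from Hill's article or from any real study.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of disease among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical counts, invented for illustration only (not real study data):
# (number who developed the disease, number in the group).
unexposed = (5, 1000)
exposed_groups = {
    "light exposure": (20, 1000),
    "heavy exposure": (60, 1000),
}

for label, (cases, total) in exposed_groups.items():
    rr = relative_risk(cases, total, *unexposed)
    print(f"{label}: relative risk = {rr:.1f}")
```

In this invented example, a relative risk well above 1 would speak to strength, and a relative risk that climbs from the light exposure group to the heavy exposure group is the kind of pattern that biological gradient asks about.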


Let’s look at these nine factors with respect to smoking.

  1. There certainly is a strong association.

  2. The results are consistent across a wide range of researchers and studies.

  3. Smoking is not specific, as it is associated with a wide range of diseases besides cancer.

  4. Smoking precedes the diagnosis of cancer, sometimes by several decades. By the way, some statisticians did try to argue that cancer causes smoking (people who have a genetic predisposition to cancer are the ones who are readily attracted to, and become addicted to, cigarettes), but that argument was easily demolished.

  5. There is a dose response relationship (heavy smokers are at greater risk than light smokers).

  6. I’m not sure if we know enough about cancer to provide a specific mechanism. Perhaps we can invoke some of the ideas we know about how certain chemicals can cause DNA damage.

  7. There is coherence in that lung cancer rates rise with smoking rates, and lung cancer rates are higher in countries where a lot of people smoke.

  8. Experimental interventions do work. Getting people to quit smoking has been shown to greatly reduce their risk for cancer.

  9. I’m not aware of an analogy to smoking. Perhaps the association of cigar smoking and pipe smoking with cancers of the mouth would qualify.

So that’s anywhere from six to eight out of nine, which makes a convincing case for causation. Someone who knows more about the mechanisms of cancer could comment about plausibility and about analogous relationships.
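
The "six to eight" count can be checked with a small Python sketch. The verdicts below are transcribed from the list above; the encoding (`True` for met, `False` for not met, `None` for unsure) and the helper name `tally` are assumptions chosen only for illustration.

```python
# Verdicts for smoking, transcribed from the numbered list above.
# True = criterion met, False = not met, None = unsure.
verdicts = {
    "strength": True,
    "consistency": True,
    "specificity": False,
    "temporality": True,
    "biological gradient": True,
    "plausibility": None,
    "coherence": True,
    "experimental evidence": True,
    "analogy": None,
}


def tally(verdicts):
    """Return (criteria clearly met, criteria met if every unsure one counts)."""
    met = sum(1 for v in verdicts.values() if v is True)
    unsure = sum(1 for v in verdicts.values() if v is None)
    return met, met + unsure


low, high = tally(verdicts)
print(f"{low} to {high} out of {len(verdicts)}")  # prints "6 to 8 out of 9"
```

The lower figure counts only the criteria that are clearly met; the upper figure adds plausibility and analogy, the two marked as unsure.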


To summarize, there are nine conditions that you should examine when deciding whether a statistical association implies causation.

  1. strength,
  2. consistency,
  3. specificity,
  4. temporality,
  5. biological gradient,
  6. plausibility,
  7. coherence,
  8. experimental evidence, and
  9. analogy.

None of these criteria are perfect, but they give a useful guideline. As Sir Austin Bradford Hill himself notes:

“All scientific work is incomplete - whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. This does not confer upon us a freedom to ignore the knowledge we already have, or to postpone the action that it appears to demand at a given time. Who knows, asked Robert Browning, but that the world may end to-night? True, but on available evidence most of us make ready to commute on the 8.30 next day.”

Further Reading

You can find an earlier version of this page on my original website.