
It’s Halloween, and this image is being shared around again:

A jack-O'-lantern with "Inferring Causation from Correlation" carved into it, joking that inferring causation from correlation is scary

I think the jack-O’-lantern is genuinely hilarious. Inferring causation from correlation is certainly a spooky thing to be avoided in science, making for a funny pumpkin, but the rule doesn’t hold absolutely. I wrote a post on it last year, talking about statistical inference. The gist of it is that while we aren’t justified in inferring causation just from a statistical correlation, our scientific knowledge and models do require us to look at correlations and draw inferences from them. In one sense, we have to infer causation from correlations, but there’s a lot more work involved than merely seeing how two variables interrelate on a plot.

In the past year, I came across a good example of why the phrase “don’t infer causation from correlation” is reductive and not absolutely true, and how the phrase can be weaponized.

One of my favorite podcasts is Be Reasonable, which is hosted by Michael Marshall (Marsh). In every episode, Marsh interviews some wacky pseudoscientist or conspiracy theorist with bizarre beliefs, ranging from antivaxxers to flat earthers to paranormal investigators. The interviews aren’t debates, as Marsh’s pushback usually comes as light questioning and prodding. Listeners often describe the episodes as hilarious, baffling, frustrating, and illuminating, owing to how profoundly mistaken his guests are. It’s the audio equivalent of a train wreck you can’t look away from.

In May of this year, Marsh interviewed Michael Fullerton, a 9/11 truther. Fullerton is a particularly interesting truther because he doesn’t merely dispute that the airplanes that crashed into the World Trade Center caused the buildings to collapse; he explicitly states that there is no evidence at all that the airplanes that hit the towers caused them to collapse. He rejects wholesale the idea that anyone could explain the collapse of a skyscraper on the basis that an airplane damaged its structural integrity enough for it to fall. A brief excerpt from the episode:

Just because something happens before something else happens, that doesn’t mean that they’re related. And if you’re going to say that they’re related, that’s fine, but you need to provide evidence that they’re related. You can’t just say, “well, they might be related, so let’s just believe it”.

Of course, we do have evidence that a plane could cause a building to collapse. We know, for example, that imparting a large amount of kinetic energy upon an object causes the object to be damaged. We also know that structures that have been damaged often fail, and the failure of a tall structure means that the structure will transition from a state of high gravitational potential energy to low gravitational potential energy, and fall.

In the experiments that support these notions, we have very controlled systems. We can measure that the more kinetic energy we impart upon an object, the more it becomes damaged (think of a car accident). This is a correlation, and from it we can infer that ramming objects into other objects causes damage to both of the objects. We are justified in inferring causation from this correlation because the phenomena are pretty simple and straightforward consequences of Newtonian physics, and any alternative explanations for why two cars have become damaged after colliding become unsatisfactory.

Of course, the inference that the planes caused the towers to collapse is a Bayesian inference, such that I have some degree of certainty that the planes caused the towers to fall. If I wanted to create a frequentist correlation between plane crashes and collapsing World Trade Center towers, I would set up a statistically large number of identical buildings and crash planes into them and see what happens. This would be a fairly difficult experiment to get sufficient grant money for, but it would be fairly obvious that some number of buildings would collapse after being damaged by the plane. I would be able to correlate the time of impact with the time of collapse for multiple events. Ultimately, an inference of causation from correlation in this case would be perfectly reasonable.
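Since nobody is going to fund that experiment, here is a toy simulation of it instead. It is only a sketch: the function name `simulate_crash_experiment` and both collapse probabilities are made-up assumptions for illustration, not engineering estimates. It compares a set of identical buildings that get hit by a plane with a control group that is left alone.

```python
import random


def simulate_crash_experiment(n_buildings, p_collapse_if_hit,
                              p_collapse_if_untouched, seed=0):
    """Return (collapse rate among hit buildings, collapse rate among controls).

    Both probabilities are hypothetical inputs; nothing here models
    real structural engineering.
    """
    rng = random.Random(seed)  # fixed seed so the toy run is repeatable
    # Each building independently collapses with the given probability.
    hit = sum(rng.random() < p_collapse_if_hit for _ in range(n_buildings))
    control = sum(rng.random() < p_collapse_if_untouched
                  for _ in range(n_buildings))
    return hit / n_buildings, control / n_buildings


# Hypothetical numbers, chosen only to illustrate the comparison.
hit_rate, control_rate = simulate_crash_experiment(1000, 0.6, 0.001)
print(f"collapse rate when hit:    {hit_rate:.3f}")
print(f"collapse rate in controls: {control_rate:.3f}")
```

The important part is the control group. Because the only difference between the two groups is the plane, a large gap between the two rates is a correlation that we can reasonably read as causation.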

In everyday life, inferring causation from correlation is what we do every day, and for the most part it works. My dryness at the end of a rainy day is a function of which clothes I chose to wear, and if I wear the same waterproof jacket on multiple rainy days it’s reasonable to infer that the jacket is what prevented me from getting wet. How fast an egg cooks correlates with how hot the stove is. Sometimes we get this inference wrong by thinking that pushing the elevator button multiple times helps or that wearing our lucky socks will cause our football team to win, but ultimately it tends to work in a pragmatic sense. In the lab, however, when we are looking at more complex systems with multiple unknown potential causes, premature inference from correlation should be avoided.

The problem here is that Fullerton is weaponizing the common wisdom to not infer causation from correlation, or rather appealing to the post hoc ergo propter hoc fallacy, which is similar but more appropriate for how we should analyze a single event. By appealing to this, he seems to imply that even though one event happened right after the other, and we have a very good physical model of what towers might do when they are damaged, we can’t say that planes caused towers to collapse. Of course, it seems to me that by this line of thinking we can’t really talk about alternative explanations for the event either. After all, by the same reasoning, just because some C4 charges exploded in a building and then the building fell doesn’t mean that the C4 caused the tower to collapse.

I kind of find Fullerton’s argument amusing, and helpful in a sense, because it is a reductio ad absurdum of the rule against this type of inference. We can’t take the statement to its extreme, because then we get to absurd conclusions, like saying that there’s no evidence that a plane can collapse a building by hitting it. A more realistic 9/11 truther would have to admit that there is evidence; they simply think that their model is a better description of reality.

It’s clear that this phrase can, in the wrong hands, be weaponized. There is a correlation between smoking and lung cancer, and we wouldn’t accept “not inferring causation from correlation” from a tobacco company. There is a correlation between the amount of CO2 in a gas sample and the amount of infrared heat absorbed, and from that we can rightly infer that our planet will heat up due to more carbon emissions. Admittedly, in these cases there’s far more than a single correlation graph supporting that smoking causes lung cancer and that human emissions cause global warming. However, in the papers that look at these phenomena you will find the researchers drawing inferences from correlations, corroborations from independent researchers, and multiple additional experiments that show no correlation with potential confounding factors.
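To make the confounder check concrete, here is a minimal sketch with simulated data. Everything in it is an assumption for illustration: the helper names, the effect sizes, and the binary hidden factor `z` are invented, and this is not how any of the smoking or climate studies were actually analyzed. It generates two variables that share a hidden common cause, then asks whether their correlation survives once that cause is held fixed.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n, causal_effect, seed=0):
    """Hypothetical data: z is a hidden common cause of both x and y.

    causal_effect is how strongly x itself pushes y; 0 means none at all.
    """
    rng = random.Random(seed)
    z = [rng.choice([0, 1]) for _ in range(n)]
    x = [3 * zi + rng.gauss(0, 1) for zi in z]
    y = [causal_effect * xi + 3 * zi + rng.gauss(0, 1)
         for xi, zi in zip(x, z)]
    return z, x, y


def correlation_within(z, x, y, level):
    """Correlation of x and y among samples where the confounder equals level."""
    xs = [xi for zi, xi in zip(z, x) if zi == level]
    ys = [yi for zi, yi in zip(z, y) if zi == level]
    return pearson(xs, ys)


for effect in (0.0, 1.0):
    z, x, y = simulate(5000, causal_effect=effect)
    print(f"causal_effect={effect}: overall r={pearson(x, y):.2f}, "
          f"r with z held at 0={correlation_within(z, x, y, 0):.2f}")
```

With `causal_effect=0.0`, the overall correlation is strong but should largely vanish once `z` is held fixed, which is the rooster-and-sunrise situation. With `causal_effect=1.0`, the correlation persists inside each group, which is the kind of pattern that makes a causal reading more credible.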

In the cases of smoking, global warming, and vaccines, there are folks with vested interests in your disbelief of valid conclusions. By trying to decouple very plausible causal links between two phenomena, they abuse a very useful tool. Ideally, the tool prevents us from finding patterns that don’t exist. In bad hands, it sows distrust and causes us to see the world less accurately, not more.

Avoiding jumping to hasty conclusions is a good thing, and we should always be aware that just because two things correlate doesn’t mean one thing causes another. The rooster’s crowing obviously does not cause the Sun to rise. But we aren’t bound to pure agnosticism when a correlation exists. If a correlation exists and it comes alongside a reasonable explanation, and perhaps some other non-correlations exist that dispute alternative explanations, then “don’t infer causation from correlation” ceases to be useful and may in fact become a barrier to better understanding. In many, many cases, the correlation is there for a reason, and we shouldn’t allow charlatans to weaponize the concept to produce doubt where doubt is unwarranted.

Happy Halloween!