So you want to estimate an association...

Whenever someone asks me for advice and they tell me their research question is an association, I launch into some form of the conversation below. I cannot help anyone with their methods if they tell me they’re interested in estimating an association because that’s not a proper research question. And it’s kind of interesting that often enough when I push them a bit, their question isn’t necessarily causal either.

Epidemiologist: I want to estimate the association between exposure to antibiotics during pregnancy and asthma at age 5.

Me: Ok. If all you want to know is that association, you can just compare the risk of asthma at age 5 in children whose mothers took antibiotics to the risk in children whose mothers did not. If the risk isn’t the same in each group, you know they’re associated.
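
(An aside for readers who want the crude comparison spelled out. The sketch below is only an illustration: the counts are made up, and the function name `crude_association` is my own, not something from a real dataset or library.)

```python
def crude_association(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Compare the risk of the outcome between exposed and unexposed groups.

    No covariates are involved: this is the crude (unadjusted) association.
    """
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
    }


# Hypothetical counts (assumption, for illustration only):
# 60 asthma cases among 500 exposed children,
# 90 asthma cases among 1500 unexposed children.
result = crude_association(60, 500, 90, 1500)

for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the two risks differ, so exposure and outcome are associated in the crude sense. Nothing in the calculation says anything about why they differ.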

E: But that’s just the crude association, which I’m not interested in. I want to know the association after adjusting for this list of covariates I have here.

Me: What is the purpose behind adjusting for these covariates?

E: I think they’re confounders.

Me: Oh, so you’re trying to estimate the causal effect of antibiotic use during pregnancy?

E: No. This is not a randomized trial so I can’t estimate the causal effect which is why I’m only estimating an association.

Me: Well, even randomized trials don’t necessarily estimate causal effects if there’s loss to follow-up or non-adherence, among other issues, but let’s put a pin in that idea for now. You’re telling me you want to know the association between antibiotic use during pregnancy and asthma after adjusting for confounders. It really sounds like you’re trying to estimate a causal effect.

E: Look, I do want to know the causal effect but I only have observational data so I can only estimate associations.

Me: Wait, so because you think any estimate you get would be a biased estimate of a causal effect, you’d prefer to just call it an association?

E: Yes.

Me: Ok. Let me ask you this. I saw that in your last paper, where you were also estimating an association, you said that there is always a chance of unmeasured confounding and therefore your association may be biased. Biased away from what?

E: Away from the true association.

Me: How would you define the “true association”?

E: The association you would get if you adjusted for all confounders.

Me: Aren’t you just using the word “true association” to mean causal effect?

E: No, because I can’t estimate a causal effect because I’m using observational data.

Me: So your research question changes just because of the type of data you have?

E: Yes. We can’t let people say their estimates are causal because they will over-interpret their results which could be dangerous.

Me: But how should we interpret the results of studies that get biased estimates of “true associations”? For example, say you find an association where the risk of asthma in children who were exposed to antibiotics during pregnancy is twice that in children not exposed to antibiotics. What do you do with that information?

E: You say there should be more research done on the topic.

Me: I see two issues with that answer. By your definition we can only get causal effects from randomized trials, so when you say more research should be done, I guess you’re proposing a randomized trial be done? But I doubt you’d ever have the equipoise to answer that question in pregnant women, so the question is then unanswerable. The other issue I have with your answer is that your last paper estimated an association but the concluding paragraph contained policy suggestions. How can you make policy suggestions if you’re not estimating a causal effect?

E: Ok. You are making me see that my ideas are not fully consistent here. But what’s the harm in just letting people use the word “association”?

Me: Because I often hear people say things like, “I’m only estimating an association, not a causal effect” to defend a choice in their analysis which would, without a doubt, bias the estimate of a causal effect. For example, adjusting for a mediator or not adjusting for a variable that is very likely a confounder. If people think they’re estimating an association instead of a causal effect, they feel they have free rein to make bad analytical decisions.
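
(Another aside, on the mediator example. This is a toy sketch under loud assumptions: every probability is invented, exposure is assumed to be randomized so there is no confounding, and the exposure is assumed to affect the outcome only through the mediator. The names are mine.)

```python
# Hypothetical probabilities (assumptions, for illustration only).
# A = exposure, M = mediator, Y = outcome.
P_M_GIVEN_A = {1: 0.6, 0: 0.2}  # P(M=1 | A=a): exposure raises the mediator
P_Y_GIVEN_M = {1: 0.3, 0: 0.1}  # P(Y=1 | M=m): outcome depends on M only


def risk_given_exposure(a):
    """P(Y=1 | A=a), averaging over the mediator."""
    p_m = P_M_GIVEN_A[a]
    return p_m * P_Y_GIVEN_M[1] + (1 - p_m) * P_Y_GIVEN_M[0]


def risk_given_exposure_and_mediator(a, m):
    """P(Y=1 | A=a, M=m). In this toy model Y ignores A once M is known."""
    return P_Y_GIVEN_M[m]


# Total effect of exposure on the outcome (risk difference).
total_effect = risk_given_exposure(1) - risk_given_exposure(0)

# "Adjusting for" the mediator: compare exposed and unexposed within levels of M.
mediator_adjusted = {
    m: risk_given_exposure_and_mediator(1, m)
    - risk_given_exposure_and_mediator(0, m)
    for m in (0, 1)
}

print(f"total effect (risk difference): {total_effect:.3f}")
for m, rd in mediator_adjusted.items():
    print(f"risk difference within M={m}: {rd:.3f}")
```

In this toy model the exposure raises the risk by about 0.08 overall, yet the risk difference within each level of the mediator is zero. If the total causal effect is what you are after, the mediator-adjusted number is badly biased for it, and calling it "just an association" does not change that.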

E: But what do you suggest then? We call everything a causal effect?

Me: I suggest we don’t hide behind euphemisms. If you want the answer to a causal question, say so. But that doesn’t imply you have to interpret your results as causal. There will always be biases due to uncontrolled confounding, selection bias or information bias. If you’re being honest that your goal is to estimate a causal effect, you can then think more carefully about the ways your estimate is biased.

E: So you’re basically saying to state your causal question, use causal methods and in the discussion, carefully discuss the ways in which your estimate is biased from the true causal effect?

Me: Yup.

Jeremy A. Labrecque
Assistant professor, Epidemiology and causal inference

My research is on how we know what we know.