# The Logic of Instrumental Variables: Causal Inference Bootcamp

[MUSIC]. In other modules we’ve talked about
things like regression analysis and experiments to learn about causality. Why don’t we just
stop there? Why isn’t that all we need to know? Well, experiments require treatments
to be randomly assigned to units, and that’s often not the case. Regression analysis requires
us to measure every possible confounder, and that’s often impossible. So what do we do if we can’t do an experiment
and can’t use regression analysis?

Next, we’re going to talk about something called instrumental
variables analysis. It’s one of the oldest and most important ways for learning about
causality using observational data. Now it’s pretty complicated, so pay really close attention.

There are six steps involved in doing instrumental
variables analysis.

Step one, we observe a variable, called the instrument, that is correlated
with the outcome variable. So remember we’ve got our outcome variable: the variable we
are trying to affect. We’ve got our treatment variable: the variable whose effect on the
outcome we are interested in learning. Now we have a third variable called the instrument,
and when we look at our data it seems that the instrument is correlated with the
outcome, so units that have higher levels of the outcome variable tend to have
higher levels of the instrument, for example. Or maybe it’s negatively correlated: units
with higher values of the outcome variable tend to have lower values of the instrument.

Step two, we assume that the instrument does
not have a causal effect on the outcome variable, so the correlation that we see between the
instrument and the outcome is not because the instrument has a causal effect on the
outcome variable. Instead, that correlation is picking up the effect of some confounding
variable.

Step three, we assume that the instrument
does have a causal effect on the treatment variable. So in step two, we assume the instrument
does not have a causal effect on the outcome but it does have a causal effect on the treatment.

Step four, we assume that the instrument is
randomly assigned to units or is as-if randomly assigned.

Step five, because of step four, the causal
effect of the instrument on the treatment variable is their correlation in the data.
So, here we’re thinking of a new randomized experiment where we randomly assign the instrument
to people and I want to know what’s the causal effect of the instrument on the treatment
variable. So because I randomly assigned it, I know that whatever correlation I see between
the instrument and the treatment in the data is the causal effect of the instrument on
the treatment.

Step six, since the instrument is randomly
assigned by step four, it is not correlated with any other possible confounder except
for the treatment.

So where does that leave us? We’ve got this
variable called the instrument that’s correlated with the outcome. The instrument doesn’t have
a causal effect on the outcome, so this correlation is not picking up a causal effect
of the instrument. It’s got to be picking up the causal effect of a confounder. The
instrument does have a causal effect on the treatment, so we might be picking up the causal
effect of the treatment on the outcome in this correlation here, but it could be something
else. But now the instrument’s randomly assigned and because of that it can’t be correlated
with any other confounders except the treatment. So we’ve ruled out all possible explanations
for the correlation between the instrument and the outcome except one: that there is
a causal effect of the treatment on the outcome. That’s what we are trying to get at, and that
is the essence of instrumental variables analysis. [MUSIC].
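
To make the six steps concrete, here is a minimal simulation sketch in Python. It is not from the video: the function names (`cov`, `simulate`), the coefficient values, and the data-generating process are all hypothetical choices made for illustration. It also uses the standard ratio (Wald) form of the instrumental variables estimate, which the video does not spell out: the instrument-outcome covariance divided by the instrument-treatment covariance. The simulated world satisfies the assumptions above: the instrument is a coin flip (step four), it shifts the treatment (step three), and it never enters the outcome equation directly (step two).

```python
import random


def cov(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate(n=50_000, true_effect=2.0, seed=0):
    """Return (naive slope, IV ratio estimate) for one simulated data set.

    All coefficients here are illustrative assumptions, not values from the video.
    """
    rng = random.Random(seed)
    z, t, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                  # confounder we pretend we cannot measure
        zi = rng.choice([0, 1])              # instrument: randomly assigned coin flip
        ti = 1.0 * zi + u + rng.gauss(0, 1)  # instrument and confounder both move the treatment
        # The outcome depends on the treatment and the confounder, never on z directly.
        yi = true_effect * ti + 3.0 * u + rng.gauss(0, 1)
        z.append(zi)
        t.append(ti)
        y.append(yi)

    naive = cov(t, y) / cov(t, t)  # regression slope of outcome on treatment, ignoring u
    iv = cov(z, y) / cov(z, t)     # ratio (Wald) form of the IV estimate
    return naive, iv


naive, iv = simulate()
print(f"naive slope: {naive:.2f}")
print(f"IV estimate: {iv:.2f}")
print("true effect: 2.00")
```

Under these assumptions the naive slope should land well above the true effect of 2.0, because it absorbs the confounder, while the IV ratio should land close to 2.0. The numerator of the ratio is the instrument-outcome relationship from step one, and the denominator is the instrument-treatment relationship from step five.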

### 11 thoughts on “The Logic of Instrumental Variables: Causal Inference Bootcamp”

• October 22, 2016 at 12:38 am

great simple explanation!

• December 4, 2016 at 10:36 pm

Thank you so much – Great explanation!

• December 8, 2016 at 11:53 pm

Would you please elaborate the term "confounder", you are frequently using? The material is very helpful.

• January 25, 2017 at 7:59 pm

Thanks so much for making this material and making it public, it has been extremely helpful for my exam review!

• February 26, 2017 at 9:53 pm

I don't understand why you said that the treatment doesn't really have an effect on the outcome. Is it because it needs an IV to promote an effect?

• March 7, 2017 at 11:09 pm

I've read everywhere else that the IV must be correlated with the independent i.e. treatment variable, yet here you say it should be correlated with the dependent variable? So confused.

• November 15, 2017 at 4:38 pm

This confused me even more. need an example or better explanation.

• November 21, 2018 at 5:29 pm

In the case when IV is positively correlated to Treatment, and Treatment is negatively correlated to Outcome variable, would we still observe correlation between IV and outcome variable?

• January 15, 2019 at 8:08 am

Is the treatment variable assumed to be dichotomous?

• April 27, 2019 at 1:14 pm

This seems to depend upon randomization to evenly distribute other confounders, which isn't always the case. Randomization works on average, but it can fail in any given instance.