# In The Armchair

## Causality and Karl Popper

Posted in Armchair Ruminations by Armchair Guy on May 14, 2010

I learned today of some absolutely fascinating interactions between Popperian critical rationalism and the theory of causality. For scientists, the counterfactual theory of causality is probably the most useful of the causal theories. Popperian philosophy enters through the meaning of counterfactuals and the question of whether they are related by a so-called Structural Equation Model (SEM).

### Counterfactuals

To illustrate, suppose we observe and are interested in three variables, $X, Y, Z$. We know that $X$ comes before $Y$ and $Z$ chronologically, and $Y$ comes before $Z$. Let’s talk about counterfactuals first. We observe $Z$, but suppose we now ask ourselves what would have happened to $Z$ if, instead of letting nature calculate $Y$, we intervened to set it to some value such as $0$. We denote this new, imagined variable by $Z_{Y = 0}$. We might also be interested in the counterfactual variable $Z_{X=0}$, which is the new variable that “would result” if we could intervene to set $X$ to 0. Counterfactual theories assume that variables such as $Z_{X=0}$ “exist” and are “available to nature”, and that if $X = 0$ actually happens, then nature responds by producing this version of $Z$. If, instead, $X = 1$ occurs, then nature produces the entirely different variable $Z_{X=1}$.

### Structural Equations

Let us try to reconcile the existence of such counterfactual variables with common views of determinism. If we tried to imagine a process by which nature “calculates” the variables $X, Y, Z$, we’d be tempted to come up with something like this: $X = f_X(\epsilon_X), Y = f_Y(X, \epsilon_Y), Z = f_Z(X, Y, \epsilon_Z)$. Here $\epsilon_X, \epsilon_Y, \epsilon_Z$ are three independent random variables (often called “noise”). So nature first calculates $X$ from some noise variable using a secret process $f_X$ (just a function, really, but unknown to us). Then it calculates $Y$ from the previously calculated value of $X$ and the noise variable $\epsilon_Y$ using the function $f_Y$, and finally calculates $Z$ from $X, Y, \epsilon_Z$ using $f_Z$.
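As a rough illustration of the generating process just described, here is a minimal Python sketch. The concrete forms of `f_X`, `f_Y` and `f_Z`, the restriction to binary variables, and the uniform noise are all assumptions invented for the example; the post treats these functions as unknown.

```python
import random

# Hypothetical structural functions (assumed for illustration only).
def f_X(eps_x):
    return 1 if eps_x < 0.5 else 0

def f_Y(x, eps_y):
    # Y is more likely to be 1 when X = 1, but X does not determine it.
    return 1 if eps_y < 0.2 + 0.6 * x else 0

def f_Z(x, y, eps_z):
    # Z depends on both X and Y, plus its own noise.
    return 1 if eps_z < 0.1 + 0.3 * x + 0.5 * y else 0

def sample(rng):
    """Draw one (X, Y, Z) triple the way 'nature' would under this SEM."""
    eps_x, eps_y, eps_z = rng.random(), rng.random(), rng.random()
    x = f_X(eps_x)            # X from its noise alone
    y = f_Y(x, eps_y)         # Y from X and its own noise
    z = f_Z(x, y, eps_z)      # Z from X, Y and its own noise
    return x, y, z

rng = random.Random(0)        # fixed seed so the run is repeatable
samples = [sample(rng) for _ in range(5)]
print(samples)
```

The order of the three assignments in `sample` mirrors the chronological order $X$, then $Y$, then $Z$.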

Such SEMs are very useful. They can be represented graphically and analyzed mathematically to answer a great many questions in causality, leading to a rich theory developed by Judea Pearl, among many others. An additional bonus is that these models account for counterfactual variables in a very natural way. Suppose, for example, that we are interested in the counterfactual $Z_{Y=0}$, that is, the version of $Z$ that results if someone intervenes and sets the variable $Y$ to $0$. (The mechanism by which this might be done is not relevant here.) Nature then simply computes the function $f_Z(X, 0, \epsilon_Z)$. Another example is the counterfactual $Z_{X=0}$: here $Y_{X = 0}$ is first computed as $Y_{X=0} = f_Y(0, \epsilon_Y)$, and then $Z_{X=0} = f_Z(0, Y_{X=0}, \epsilon_Z)$. The counterfactual $Z_{X=0, Y=1}$ is naturally defined as $f_Z(0, 1,\epsilon_Z)$. Thus counterfactuals have very natural semantics in the SEM setting.
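To make these semantics concrete, the following sketch evaluates the three counterfactuals for a single fixed draw of the noise. The structural functions and the particular noise values are hypothetical choices made for the example, not anything specified in the post.

```python
# Hypothetical structural functions (assumed for illustration only).
def f_X(eps_x):
    return 1 if eps_x < 0.5 else 0

def f_Y(x, eps_y):
    return 1 if eps_y < 0.2 + 0.6 * x else 0

def f_Z(x, y, eps_z):
    return 1 if eps_z < 0.1 + 0.3 * x + 0.5 * y else 0

# One fixed "state of the world": an arbitrary noise draw.
eps_x, eps_y, eps_z = 0.3, 0.5, 0.55

# Factual values, as nature would compute them.
x = f_X(eps_x)
y = f_Y(x, eps_y)
z_factual = f_Z(x, y, eps_z)

# Z_{Y=0}: keep X and the noise, overwrite Y with 0.
z_do_y0 = f_Z(x, 0, eps_z)

# Z_{X=0}: overwrite X with 0 and let Y respond through f_Y.
y_do_x0 = f_Y(0, eps_y)
z_do_x0 = f_Z(0, y_do_x0, eps_z)

# Z_{X=0, Y=1}: overwrite both X and Y.
z_do_x0_y1 = f_Z(0, 1, eps_z)

print(z_factual, z_do_y0, z_do_x0, z_do_x0_y1)
```

In every case the noise terms are held fixed and only the intervened arguments change, which is what lets the SEM assign a value to each counterfactual.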

The SEM is also a very natural model to humans. Much of what we perceive as Newtonian physics and determinism works this way. Event $X$ happens, this influences event $Y$ (but doesn’t determine it perfectly, which is why $\epsilon_Y$ is necessary), and so on. Indeed, when I ask myself what other models of the universe might exist (and this question excludes quantum uncertainty and other similar weirdness), I am unable to conceive of a process which doesn’t boil down to an SEM of the type shown above. Most people, when asked to imagine a process by which the universe “creates” events, will probably come up with an SEM.

Perhaps unsurprisingly, this kind of model is inadmissible in the Popperian view, since it uses an underlying justification. Intuitively, the structural equations justify the counterfactuals. Delving into things a little more deeply, it turns out that this simple model makes a large number of assumptions that are not verifiable, and so are inadmissible in the Popperian view. But we need a specific example.

### A Weird Counterfactual

Specifically, consider the “cross-world counterfactual” $f_Z(1, Y_{X=0}, \epsilon_Z)$. Mathematically, there is no problem with this definition; the function $f_Z$ is evaluated at $1, Y_{X=0}, \epsilon_Z$ which are themselves well-defined constants or random variables. The problem is with interpretation and observability. Any of the previous counterfactuals could be obtained in a natural way. To observe $Z_{X = 0}$, set $X$ to 0, let nature determine $Y$ and observe the resulting $Z$. To observe $Z_{X=0, Y=1}$, set both $X$ to $0$ and $Y$ to $1$ and observe the corresponding $Z$. But the new quantity $f_Z(1, Y_{X=0}, \epsilon_Z)$ is fundamentally different; it involves both $X=1$ (the first argument of $f_Z$) and $X=0$ needed to observe $Y_{X=0}$. This means what we’re trying to do is the following complicated procedure: first intervene and set $X = 0$, obtain the random variable $Y_{X=0}$, and then turn back the clock, and now intervene to set $X=1$ as well as set $Y = Y_{X=0}$, using the $Y_{X=0}$ observed before turning back the clock.
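Inside a simulation, the "turn back the clock" procedure is trivial, because the program can reuse the same noise values in two different worlds. The sketch below does exactly that; the structural functions and noise values are assumed purely for illustration.

```python
# Hypothetical structural functions (assumed for illustration only).
def f_Y(x, eps_y):
    return 1 if eps_y < 0.2 + 0.6 * x else 0

def f_Z(x, y, eps_z):
    return 1 if eps_z < 0.1 + 0.3 * x + 0.5 * y else 0

# A single fixed noise draw, shared by both "worlds".
eps_y, eps_z = 0.5, 0.55

# World A: intervene with X = 0 and record how Y responds.
y_x0 = f_Y(0, eps_y)

# "Turn back the clock": same eps_z, but now X = 1 and Y forced to the
# value it took in world A. This is the cross-world counterfactual.
cross_world = f_Z(1, y_x0, eps_z)

# For comparison, the ordinary single-world counterfactuals.
z_x1 = f_Z(1, f_Y(1, eps_y), eps_z)   # Z_{X=1}
z_x0 = f_Z(0, y_x0, eps_z)            # Z_{X=0}

print(cross_world, z_x1, z_x0)
```

The step an experimenter cannot perform is the reuse of `eps_y` and `eps_z` across the two interventions: in the real world the noise is never observed and cannot be replayed.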

### Why Critical Rationalism Excludes SEMs

But of course, we can’t really turn back clocks. Thus the weird counterfactual above is unobservable; there should be no way for us to evaluate it under Popperian critical rationalism. If we can evaluate it, it means we are using a rich justification with implications that cannot be verified experimentally. Now, under the SEM it can be shown that

$E[f_Z(1, Y_{X=0}, \epsilon_Z)] = \sum_y E[Z|X=1, Y=y] P[Y=y|X=0]$

Since all of the quantities on the right-hand side are based on observed variables, it means this cross-world counterfactual, which we “couldn’t possibly identify” without turning back the clock, is actually identifiable: a paradox!
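A quick Monte Carlo check can make the identity tangible. In the sketch below, the left-hand side is computed with direct access to the noise (which only the simulator has), while the right-hand side uses only the observed $(X, Y, Z)$ triples. The structural functions, binary variables, sample size and seed are all assumptions made for the example; under these assumed functions both sides should land near $0.8 \times 0.4 + 0.2 \times 0.9 = 0.5$.

```python
import random

# Hypothetical structural functions (assumed for illustration only).
def f_X(eps_x):
    return 1 if eps_x < 0.5 else 0

def f_Y(x, eps_y):
    return 1 if eps_y < 0.2 + 0.6 * x else 0

def f_Z(x, y, eps_z):
    return 1 if eps_z < 0.1 + 0.3 * x + 0.5 * y else 0

def estimate(n=100_000, seed=0):
    rng = random.Random(seed)
    cross_world_sum = 0
    z_sum = {0: 0, 1: 0}   # sum of Z over samples with X = 1, Y = y
    z_cnt = {0: 0, 1: 0}   # number of samples with X = 1, Y = y
    y_cnt = {0: 0, 1: 0}   # number of samples with X = 0, Y = y
    for _ in range(n):
        eps_x, eps_y, eps_z = rng.random(), rng.random(), rng.random()
        # Left-hand side: needs the noise, so only the simulator can do this.
        cross_world_sum += f_Z(1, f_Y(0, eps_y), eps_z)
        # Observed data: what an experimenter would actually see.
        x = f_X(eps_x)
        y = f_Y(x, eps_y)
        z = f_Z(x, y, eps_z)
        if x == 1:
            z_sum[y] += z
            z_cnt[y] += 1
        else:
            y_cnt[y] += 1
    lhs = cross_world_sum / n
    n_x0 = y_cnt[0] + y_cnt[1]
    # Right-hand side: sum over y of E[Z | X=1, Y=y] * P[Y=y | X=0].
    rhs = sum((z_sum[y] / z_cnt[y]) * (y_cnt[y] / n_x0) for y in (0, 1))
    return lhs, rhs

lhs, rhs = estimate()
print(lhs, rhs)
```

The agreement depends on the noise terms being independent, as the SEM in the post assumes; with dependent noise the two sides need not match.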

What’s the resolution to this paradox? Simply that the SEM makes so many hidden assumptions that they suffice to identify even this unobservable cross-world counterfactual. This explains why SEMs don’t conform to Popperian thought, and why a lot of work focuses on causality and counterfactuals without structural equations.

### 2 Responses

1. Rasmus said, on June 19, 2011 at 9:05 am

I might just have skimmed through your post, and if so I apologize, but it seems to me that you confuse Popperianism with logical positivism. Popper’s emphasis wasn’t on verifiability, but on falsifiability. In my view, structural equation modeling actually fits very well into a Popperian framework, in that falsifiability works very well together with counterfactuals. A Popperian might say: “If our supposed model A is correct, the observed data should look something like X (A->X). If the data does indeed look like X, that does not ‘prove’ that our theory is correct (that would be affirming the consequent), since there are a multitude of other models (B, C, D…) that could also generate the same dataset. But if the data doesn’t look like X, that proves that A isn’t correct (since A->X and ~X imply ~A).” Hence, SEMs could potentially be used to test or invalidate hypothesised models in a very fallibilist fashion.
SEMs do indeed make a lot of “hidden assumptions” but it is not a Popperian vice that we can’t observe or identify these hidden assumptions. It is rather a Popperian virtue that we can show them to be false, by counterfactual reasoning. (If X would be the case, Y would be the case, so if Y is not the case, then X cannot be the case).

The logical positivists had an issue with counterfactuals (treating counterfactuals as a metaphysical entity) but I’m not so sure Popper would.

2. Armchair Guy said, on June 20, 2011 at 9:30 pm

Rasmus,

Thanks for your comment. I agree with both you and Popper that falsifiability is a better criterion for a scientific theory than verifiability. I think I was using the word verify colloquially rather than in the same sense that Popper and other philosophers of science used it; what I really meant is falsifiability.

The point I was trying to make is that even a simple 3-variable structural equation model makes predictions/has consequences that are not falsifiable. Specifically,

$E[f_Z(1, Y_{X=0}, \epsilon_Z)] = \sum_y E[Z|X=1, Y=y] P[Y=y|X=0]$

is not a falsifiable statement, yet it follows from the SEM.