[Image: Quintessential image connoting causation. (Sorry, I'm not feeling particularly creative.)]
...there's the fundamental problem of causal inference. In order to accurately measure the effect of a treatment on some unit, we would need to make an impossible set of simultaneous observations: observing the unit both when it does and when it does not receive the treatment. We can never actually be certain of the effect of a treatment if we do not observe the counterfactual: what would have happened had the unit not received the treatment? In social science, our units are usually people, or collections thereof (i.e. individuals, villages, countries, etc.), and our "treatments" are things like viewing a campaign ad, receiving a cash transfer, experiencing a certain type of colonial rule, or being "treated" with lax gun control regulations, for example. Our outcomes of interest might be how a person voted, how much economic growth an area experienced, whether a region experienced civil conflict, or a measure of gun-related violence. Obviously, these types of processes are quite different from the ones that natural scientists are interested in observing. The fundamental problem of causal inference becomes a much bigger problem when your units are conscious humans (usually out in the world, not in a lab, with political-psych experiments being the exception) and you don't have an exact replica of the individual or group who is receiving the treatment sitting in another beaker on the lab bench. (Humans in beakers is a weird image. Sorry.)
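In the standard potential-outcomes shorthand (my notation here, not something from the workshop materials), the problem is that each unit's treatment effect is defined by two quantities that can never be observed at the same time:

```latex
% Potential outcomes for unit i: Y_i(1) if treated, Y_i(0) if untreated.
% The unit-level effect is the difference between them,
\[
  \tau_i = Y_i(1) - Y_i(0),
\]
% but with treatment indicator D_i we only ever observe
\[
  Y_i = D_i \, Y_i(1) + (1 - D_i) \, Y_i(0),
\]
% so one of the two potential outcomes is always the missing counterfactual.
```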
The way "large-N" researchers attempt to address this problem is to compare across groups of units that are otherwise similar -- either "naturally" (i.e. in natural experiments where proverbial lightning strikes and a population is "randomly" divided in some way, or by examining a very narrowly defined population around a temporal or geographic discontinuity), through intentionally randomized treatment (in a field or lab experiment), or through statistical techniques such as controlling for pre-treatment factors that are expected to be correlated with both treatment assignment and the outcome, matching units, using instrumental variables or selection models, etc. The methods used by these researchers tend to be primarily quantitative.
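As a concrete (and entirely made-up) illustration of the randomization logic, here is a minimal simulation: when treatment is assigned at random, the difference in group means recovers the average effect even though no unit's counterfactual is ever observed.

```python
# Toy simulation of the large-N logic above (illustrative numbers only).
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential outcomes: what each unit would do without / with treatment.
y0 = rng.normal(50, 10, n)          # outcome if untreated
y1 = y0 + rng.normal(2, 5, n)       # outcome if treated (true average effect of about 2)

d = rng.integers(0, 2, n)           # random treatment assignment
y_obs = np.where(d == 1, y1, y0)    # we only ever see one potential outcome per unit

diff_in_means = y_obs[d == 1].mean() - y_obs[d == 0].mean()
print(f"true average effect:        {(y1 - y0).mean():.2f}")
print(f"difference in group means:  {diff_in_means:.2f}")
```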
This past week, I have been learning how "small-N" (or "within-case") researchers attempt to address this problem. At one extreme, within-case researchers essentially reject the proposition that cross-unit comparison allows for causal inference, and instead argue that we must trace processes at a very fine-grained level within units and look for the presence or absence of observable implications of our causal theory. This entails knowing our individual units (i.e. individual political leaders, governments, villages, countries) very well, understanding historical sequences of events and phenomena at a very detailed level, and, as part of this, being very conscious of the sources of our observations. The techniques used by these researchers tend to be primarily qualitative, relying on sources such as archived documents and other media, interviews, participant observation, etc.
The discussion this past week has opened my eyes to many ways to effectively combine quantitative and qualitative methods, but at times it has also been suggested that, because of these different inferential techniques, quantitative large-N and qualitative small-N researchers are coming at this with a different epistemological understanding of what constitutes a causal mechanism. At first I thought that might be true, but now I don't. Some forms of within-case process tracing are built on Bayesian inference (see e.g. Bennett 2006, Humphreys and Jacobs 2015) or probabilistic assumptions, while others are built on more deterministic assumptions (i.e. in the set-theoretic tradition, thinking about factors, or sets of factors, as necessary and/or sufficient conditions for an outcome). Waldner (2012) argues that mechanisms have to be understood as "invariant" causal properties (i.e. "combustion" is what causes a car to move when you step on the gas; that is the mechanism, and combustion as a property exists, whether we observe it or not). At first, the word "invariant" made me feel weird. It goes even further than many qualitative scholars would, so perhaps it isn't a mainstream view. But, in any case, I found it provocative, and, at first, thought this might reflect a fundamental disconnect between within-case and cross-case (in particular, statistical) researchers over whether the social world is deterministic or probabilistic.
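To make the Bayesian flavor of process tracing concrete (in the spirit of Bennett 2006 and Humphreys and Jacobs 2015, though the probabilities below are entirely hypothetical), the updating step is just Bayes' rule applied to a single piece of within-case evidence:

```python
# Toy Bayesian process-tracing update; all numbers are made up for illustration.
# E is some observable implication of hypothesis H (say, a document in the archive).
def posterior(prior_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """P(H | E) via Bayes' rule."""
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    return p_e_given_h * prior_h / p_e

# A "smoking gun": evidence unlikely unless H is true, so finding it raises P(H) a lot.
print(posterior(prior_h=0.3, p_e_given_h=0.4, p_e_given_not_h=0.02))   # ~0.90

# A "hoop" test: evidence we should almost certainly see if H is true. Failing to find it
# (observing not-E, with P(not-E | H) = 0.05 and P(not-E | not-H) = 0.6) nearly eliminates H.
print(posterior(prior_h=0.3, p_e_given_h=0.05, p_e_given_not_h=0.6))   # ~0.03
```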
I have been trained (brainwashed?) to think that probabilistic models are the only ones that really make sense. There's unobserved "error" in all of our observations, so why not be honest and build that into our inferential strategy (i.e. assume that what we are observing comes from a hypothetical distribution of all the possible outcomes that could have occurred)? However, it seems that, too often, this actually just allows us to get away with mis-specifying models. This is often the critique leveled by qualitatively oriented scholars when they view statistical analysis that does not seem to be sufficiently informed by a theory of how and why the proposed variables interact in the way they do.
One practical retort to that is that, well, it's better to build in the error than to pretend it isn't there. We are always going to have some forms of error that are essentially random (i.e. unbiased measurement error). If you believe in our ability to identify deterministic mechanisms in social science theory, then you are essentially assuming that you will be a good enough (or lucky enough?) researcher to observe some process with no measurement error or other chance event that could disturb your recording of an outcome into an observation. But suppose we allow this to be true, and a within-case researcher is able to perfectly observe a cause and effect within a single case and, further, to explain it with a causal mechanism (or set of mechanisms) that is "invariant" given a very strict set of conditions or scope. Now what? Don't you need to see if this mechanism applies in other cases that meet those conditions or fall within that scope? Because if you admit that this should be the next step, then I believe you are also admitting that variance is possible.
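A tiny sketch of that measurement-error point (made-up numbers again): even if the underlying mechanism is perfectly deterministic, unbiased noise in the act of observation alone makes the recorded outcomes vary across otherwise identical cases.

```python
# Deterministic mechanism + noisy observation = variance in what gets recorded.
import numpy as np

rng = np.random.default_rng(1)
true_outcome = 10.0                               # deterministic outcome of the mechanism
recorded = true_outcome + rng.normal(0, 1, 20)    # what 20 careful observers would write down
print(recorded.round(2))                          # same cause, same mechanism, varying observations
print(f"mean of recordings: {recorded.mean():.2f}")
```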
Over the past 24 hours or so, I have revised my view a bit, and I don't think the vision of mechanisms as invariant is necessarily at odds with statistical inference. In a perfectly specified regression, after all, we are assuming that beta is the causal effect. Our estimate from a regression (i.e. beta-hat) is a random variable drawn from a distribution, with some error separating it from the true causal effect. Regression simply assumes that the true causal effect is unobservable, due to the variety of ways in which our perceptions or observations will not be what actually "happened". This is where the modesty (or, one might say, at least the fixed positionality of the observer and the observed within a world that varies) is sort of built in, and where I think small-N research needs to think seriously about how -- even if it is able to present a more theoretically compelling and rigorous picture of the mechanism(s) at work within any observed process -- it is able to address how the very nature of observation and the observational process is affecting one's conclusions. (And perhaps they are already doing this. In fact, I'm sure they are.) On the other hand, all of the above is based on a simplified characterization of qualitative research. Large-N scholars need to admit that it is still possible for within-case research to get a very good idea of what connects cause to effect in a way that large-N cross-case analysis cannot do without a detailed understanding of history or temporal processes.
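A quick simulation of the beta versus beta-hat point from the paragraph above (the data-generating process here is my own toy example, not anything from the workshop): the "true" effect is fixed, but every estimate is a draw from a sampling distribution centered on it.

```python
# beta is fixed and "invariant"; beta-hat varies from sample to sample.
import numpy as np

rng = np.random.default_rng(2)
beta = 1.5                                   # the fixed, true causal effect
estimates = []
for _ in range(1_000):
    x = rng.normal(0, 1, 200)
    y = 0.5 + beta * x + rng.normal(0, 2, 200)            # error term: everything unobserved
    beta_hat = np.cov(x, y, bias=True)[0, 1] / x.var()    # OLS slope in the bivariate case
    estimates.append(beta_hat)

estimates = np.array(estimates)
print(f"true beta: {beta}, mean of beta-hats: {estimates.mean():.3f}, sd: {estimates.std():.3f}")
```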
There are many other issues that come up in the context of this discussion, other than the matter of deterministic versus probabilistic causal theories. (For example, are you interested in measuring the effect of a specific cause -- i.e. permissive gun control legislation -- or the entire cluster of causes that result in a particular effect -- i.e. gun-related violence?) Overall, I have been really encouraged by the workshop so far. There are a number of very smart people who are paying close attention to these issues, and working on, first, more clearly delineating areas of agreement and disagreement across the (in my view, much exaggerated) "quantitative-qualitative" divide, and second, identifying ways to leverage the relative strengths of various methodological strategies to ensure that our science is progressive and cumulative. There is still substantial debate and disagreement in both of those areas, but hey, acknowledging you have a problem is the first step. Or something.