20 June 2015

cause and effect

After six days in Michigan (much of which was filtered through a strange, jet-lag- and adrenaline-induced haze), I am now halfway through a two-week workshop at Syracuse University on qualitative and multi-method research. Does the idea of a debate about what constitutes a "causal mechanism" get you hot and bothered? If not, stay far, far away.

Quintessential image connoting causation. (Sorry, I'm not feeling particularly creative.)
While I have publicly joined in complaints about how much we have beaten this dead horse over the past week, I am still thinking about it, so maybe there is some life left in it yet. Or maybe I just need to write a little to clear my head. So...

...there's the fundamental problem of causal inference. In order to accurately measure the effect of a treatment on some unit, we would need to make an impossible set of simultaneous observations: observing the unit both when it does and when it does not receive the treatment. We can never actually be certain of the effect of a treatment if we do not observe the counterfactual: what would have happened had the unit not received the treatment? In social science, our units are usually people, or collections thereof (i.e. individuals, villages, countries, etc.), and our "treatments" are things like viewing a campaign ad, receiving a cash transfer, experiencing a certain type of colonial rule, or being "treated" with lax gun control regulations, for example. Our outcomes of interest might be how a person voted, how much economic growth an area experienced, whether a region experienced civil conflict, or a measure of gun-related violence. Obviously, these types of processes are quite different from the ones that natural scientists are interested in observing. The fundamental problem of causal inference becomes a much bigger problem when your units are conscious humans (usually out in the world, not in a lab, with political-psych experiments being the exception) and you don't have an exact replica of the individual or group who is receiving the treatment sitting in another beaker on the lab bench. (Humans in beakers is a weird image. Sorry.)
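The missing-counterfactual problem can be made concrete with a tiny simulation (all numbers hypothetical, just for illustration): every unit has two potential outcomes, but we only ever get to record one of them, so the unit-level effect is never directly observable.

```python
import random

random.seed(0)

# Hypothetical units: each has TWO potential outcomes, but in reality
# we can only ever observe one of them.
units = []
for _ in range(6):
    y0 = y_untreated = random.gauss(10, 2)       # outcome if untreated
    y1 = y0 + 3 + random.gauss(0, 1)             # outcome if treated
    treated = random.random() < 0.5              # which world we observe
    units.append({"y0": y0, "y1": y1, "treated": treated})

for u in units:
    observed = u["y1"] if u["treated"] else u["y0"]
    unit_effect = u["y1"] - u["y0"]              # unobservable in practice
    print(f"treated={u['treated']!s:5}  observed={observed:6.2f}  "
          f"effect (unobservable)={unit_effect:5.2f}")
```

The last column is exactly the quantity the fundamental problem says we can never see: it requires both potential outcomes for the same unit at the same time.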

The way "large-N" researchers attempt to address this problem is to compare across groups of units that are otherwise similar -- either "naturally" (i.e. in natural experiments where proverbial lightning strikes and a population is "randomly" divided in some way, or examining a very narrowly defined population around a temporal or geographic discontinuity), through intentional randomized treatment (in a field or lab experiment), or through statistical techniques such as controlling for pre-treatment factors that are expected to be correlated with the treatment assignment and the outcome, matching units, instrumental variables, selection models, etc. The methods used by these researchers tend to be primarily quantitative.
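Why does randomized assignment license this cross-unit comparison? A minimal sketch (hypothetical data-generating process): when treatment is assigned by coin flip, it is independent of the units' potential outcomes, so the simple difference in group means recovers the average treatment effect.

```python
import random

random.seed(42)
TRUE_ATE = 3.0  # the (hypothetical) average treatment effect

# Simulate a large randomized experiment on made-up units.
treated_outcomes, control_outcomes = [], []
for _ in range(100_000):
    baseline = random.gauss(10, 2)       # unit's untreated outcome
    if random.random() < 0.5:            # coin-flip assignment
        treated_outcomes.append(baseline + TRUE_ATE)
    else:
        control_outcomes.append(baseline)

diff_in_means = (sum(treated_outcomes) / len(treated_outcomes)
                 - sum(control_outcomes) / len(control_outcomes))
print(round(diff_in_means, 2))  # close to 3.0
```

Note that this estimates the *average* effect across the groups, not any individual unit's effect, which is precisely the concession large-N designs make to the fundamental problem.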

This past week, I have been learning how "small-N" (or "within-case") researchers attempt to address this problem. On one extreme, within-case researchers essentially reject the proposition that cross-unit comparison allows for causal inference, and instead argue that we must trace processes at a very fine-grained level within units and look for the presence or absence of observable implications of our causal theory. This entails knowing our individual units (i.e. individual political leaders, governments, villages, countries) very well, understanding historical sequences of events and phenomena at a very detailed level, and as part of this, being very conscious of the sources of our observations. The techniques used by these researchers tend to be primarily qualitative, relying on sources such as archived documents and other media, interviews, participant observation, etc.

While the discussion this past week has opened my eyes to many ways in which to effectively combine quantitative and qualitative methods, at other times it has been suggested that because of these different inferential techniques, quantitative large-N and qualitative small-N researchers are coming at this with a different epistemological understanding of what constitutes a causal mechanism. At first I thought that might be true, but now I don't. Some forms of within-case process tracing are built on Bayesian inference (see e.g. Bennett 2006; Humphreys and Jacobs 2015) or probabilistic assumptions, while others are built on more deterministic assumptions (i.e. in the set theoretic tradition, thinking about factors, or sets of factors, as necessary and/or sufficient conditions for an outcome). Waldner (2012) argues that mechanisms have to be understood as "invariant" causal properties (i.e. "combustion" is what causes a car to move when you step on the gas; that is the mechanism, and combustion as a property exists, whether we observe it or not). At first, the word invariant made me feel weird. It goes even further than many qualitative scholars do, so perhaps it isn't a mainstream view. But, in any case, I found it provocative, and, at first, thought this might reflect a fundamental disconnect between within-case and cross-case (in particular, statistical) researchers in whether the social world is deterministic or probabilistic.
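The Bayesian flavor of process tracing can be sketched as a simple belief update (numbers entirely made up for illustration): assume a piece of within-case evidence that is fairly likely to appear if the hypothesized mechanism operated and rare otherwise (a "smoking gun" test, in the process-tracing jargon), then apply Bayes' rule.

```python
def bayes_update(prior, p_evidence_if_h, p_evidence_if_not_h):
    """Posterior probability of hypothesis H after observing the evidence."""
    numerator = p_evidence_if_h * prior
    denominator = numerator + p_evidence_if_not_h * (1 - prior)
    return numerator / denominator

# Hypothetical "smoking gun": evidence is rarely seen unless H is true,
# so finding it moves our confidence in H up sharply.
prior = 0.3
posterior = bayes_update(prior, p_evidence_if_h=0.4, p_evidence_if_not_h=0.05)
print(round(posterior, 3))  # 0.774
```

Failing to find the evidence would barely hurt the hypothesis (since it is only seen 40% of the time even when H is true), which is what distinguishes a smoking-gun test from a "hoop" test.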

I have been trained (brainwashed?) to think that probabilistic models are the only ones that really make sense. There's unobserved "error" in all of our observations, so why not be honest and build that into our inferential strategy (i.e. assume what we are observing comes from a hypothetical distribution of all the possible outcomes that could have occurred)? However, it seems that, too often, this actually just allows us to get away with mis-specifying models. This is often the critique of qualitatively-oriented scholars when they view statistical analysis that does not seem to be sufficiently informed by a theory of how and why the proposed variables interact in the way they do.

One practical retort to that is that, well, it's better to build in the error than to pretend it isn't there. We are always going to have some forms of error that are essentially random (i.e. unbiased measurement error). If you believe in our ability to identify deterministic mechanisms in social science theory, then you are essentially assuming that you will be a good enough (or lucky enough?) researcher to observe some process with no measurement error or other chance event that could disturb your recording of an outcome into an observation. But suppose we allow this to be true: a within-case researcher is able to perfectly observe a cause and effect within a single case and, further, to explain it with a causal mechanism (or set of mechanisms) that is "invariant", given a very strict set of conditions or scope. Now what? Don't you need to see if this mechanism applies in other cases that meet these conditions/scope? Because if you admit that this should be the next step, then I believe you are also admitting that variance is possible.

Over the past 24 hours or so, I have revised my view a bit, and I don't think the vision of mechanisms as invariant is necessarily at odds with statistical inference. In a perfectly specified regression, we assume that beta is the true causal effect. Our estimate from a regression (i.e. beta-hat) is a random variable drawn from a distribution with some error separating it from the true causal effect. But regression assumes the true causal effect is unobservable due to a variety of ways in which our perceptions or observations will not be what actually "happened". This is where the modesty (or, one might say, at least the fixed positionality of the observer and the observed within a world that varies) is sort of built in, and where I think small-N research needs to think seriously about how -- even if it is able to present a more theoretically compelling and rigorous picture of the mechanism(s) at work within any observed process -- it is able to address how the very nature of observation and the observational process is affecting one's conclusions. (And perhaps they are already doing this. In fact, I'm sure they are.) On the other hand, all of the above is based on a simplified characterization of qualitative research. Large-N scholars need to admit that it is still possible for within-case research to get a very good idea of what connects cause to effect in a way that large-N cross-case analysis cannot do without a detailed understanding of history or temporal processes.
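The beta versus beta-hat point can be illustrated with a quick simulation (hypothetical data-generating process): the true coefficient is a fixed, "invariant" quantity, while each sample's estimate scatters around it.

```python
import random

random.seed(1)
TRUE_BETA = 2.0  # fixed, "invariant" causal parameter (hypothetical)

def ols_slope(xs, ys):
    """Slope of a simple one-variable OLS regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

estimates = []
for _ in range(1000):  # many hypothetical samples from the same world
    xs = [random.gauss(0, 1) for _ in range(200)]
    ys = [TRUE_BETA * x + random.gauss(0, 1) for x in xs]  # noisy outcomes
    estimates.append(ols_slope(xs, ys))  # each beta-hat differs slightly

mean_beta_hat = sum(estimates) / len(estimates)
print(round(mean_beta_hat, 2))  # close to 2.0
```

Each individual beta-hat misses the true beta by a little, but the distribution of estimates is centered on it, which is the sense in which a fixed causal parameter and probabilistic observation coexist in the regression framework.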

There are many other issues that come up in the context of this discussion, other than the matter of deterministic versus probabilistic causal theories. (For example, are you interested in measuring the effect of a specific cause -- i.e. permissive gun control legislation -- or the entire cluster of causes that result in a particular effect -- i.e. gun-related violence?) Overall, I have been really encouraged by the workshop so far. There are a number of very smart people who are paying close attention to these issues, and working on, first, more clearly delineating areas of agreement and disagreement across the (in my view, much exaggerated) "quantitative-qualitative" divide, and second, identifying ways to leverage the relative strengths of various methodological strategies to ensure that our science is progressive and cumulative. There is still substantial debate and disagreement in both of those areas, but hey, acknowledging you have a problem is the first step. Or something.

03 June 2015

it's good to talk to people

Tomorrow, I am leaving Timor-Leste. In addition to the elite interviews I have been conducting in Dili, I also piloted a small door-to-door survey in three districts: Dili, Bobonaro, and Baucau. I am actually quite happy we started with something at a small scale, because there are many kinks to be worked out in everything from survey design to implementation. In addition, each day was quite slow-going, because I did not have the resources to hire and train multiple enumerators, so it was just me and my translator going door-to-door.

My interpreter in Maliana, Francisco, and me.
In the villages, homes can be far apart, and it often took a long time at each house to introduce ourselves, make people feel comfortable, and identify an adult who was willing to participate in the survey. Most of the questions are multiple choice, but people often opted instead to provide long-winded responses. (Who can blame them? It’s hard to characterize one’s living conditions, or one’s preferences on tax policy, by selecting an item on a Likert scale.) Sometimes, people also offered us a Coke (or my interpreter offered one of the men a cigarette), and they asked questions about how long I had been in Timor and how I was finding the country. It is rude to refuse these offers or conversations, so my target of 20 minutes per household was usually too ambitious.

I only surveyed adults over the age of 25, because my questionnaire contains some items about the period of Indonesian occupation (1975-1999). While I am interested in young people’s impressions of this period based on what they may have heard from their families, I am mainly interested in people’s recollections of their own experiences.

At one house in Baucau this past Saturday, we spoke with a woman in her mid-60s who ran a small kiosk with snacks and drinks. We sat on wood planks propped up around a dirt floor, with a single, moving pile of laundry, live chickens, and a couple of small children off to one side. There were at least three other children, one of whom looked about 20 or 22 and had her own baby she was cradling during the interview. The family lived somewhere in the area, but I couldn’t really tell where they might have slept. (There was likely an attached shed behind the kiosk that was their house.)

The woman was happy to participate in the survey, but for many of the questions about how she’d like to see public revenues managed and spent, she would say “I don’t know, that is for our political leaders to decide.”

Today, a young woman from Baucau called me. I blundered my way through a five-minute phone conversation in Tetun with her, and grasped enough to learn that we had surveyed her mom on Saturday, that she wanted to speak to me about the survey, and that she was also looking for work. I worried that if she wanted a job I was not going to be able to help her, but I asked (in broken Tetun) if my interpreter could call her back. She said yes, so my interpreter returned her call a few minutes later. I found myself nervous about what she would say:  Was she upset about the survey? What if she thought I would be able to help her family in some way? I couldn’t promise to do this. I fretted a bit while I waited to hear back from my interpreter. (In the meantime, I had a nice night out with fellow "expat" friends in Dili where we ate “chicken-on-a-stick”, rice, and fish on the beach for about $2-a-head, surrounded by Timorese clients, and then drove a few minutes down the road for a $6 dessert buffet at Hotel Timor, one of the fanciest hotels in Dili, whose dining room was populated by a few groups consisting almost exclusively of foreigners. Only those with money can flow so easily across this divide.) I just got home to check my email and received my interpreter’s summary of his conversation with the young woman in Baucau. Below, I have omitted any potentially identifiable or sensitive information:
“I phoned Mrs. X today around 14:45 hrs, and heard about her concerns in relation to our survey. Mrs. X is a daughter of the woman we interviewed in a small kiosk where we bought water and cigarettes. She is a secondary school graduate, skilled in use of computer programs, hospitality management… She stated, she has been looking for a job so far, applied to several organizations … but she failed. Sustainability of the family life only depends on [sales from the] kiosk and her mother’s elderly payment [pension] by the government. Their present living conditions are neither good nor bad. Their living conditions now compared to one year ago is the same and she feels it will remain the same in a coming year. She described the overall economic situation of Timor-Leste in general is neither good nor bad…She feels that the present situation is better than the last year, but it will remain the same in a coming year. She described: it is better for our community to pay higher taxes if it means that there will be more services provided by government. She was optimistic about her future at the time of Timor-Leste restoration of independence. If a parliamentary election [were] to be held tomorrow, she will vote for X party.”
This young woman independently phoned me with her own responses to my survey. While I did not leave a copy of the survey with each household, she clearly remembered many of the questions I asked her mom, and wanted to provide her own responses. (Surveys are often not fully confidential in Timor, since other family members will often be gathered around, and respondents note that they feel comfortable this way.)

I will have my interpreter call her back and make sure she is clear about the purpose of my survey and my research (i.e. I hope my research will have long-term benefits for people like her, but unfortunately I cannot provide anything in the short-term). In the meantime, I find myself really wishing I could go back and chat with her more. Earlier, I was concerned that perhaps the population of Timor-Leste would be over-surveyed -- with donors doing impact evaluation surveys and large foundations conducting public opinion polls in such a small country, surely people were sick of talking to random strangers who wander into their neighborhood, gather information about their living standards and such, then leave, never to be seen again? In a way, it was really nice to hear from this woman, as some reassurance that perhaps some people are not being asked for their views enough...

This, and some other experiences, have made me think a lot about social science research ethics these days. I will share more once my thinking on the subject is a bit more coherent. For now, all I can say is: it's complicated, and we (or I, for one) definitely need more training on these issues.