“Comparing apples and oranges is the only endeavor worthy of true scientists; comparing apples to apples is trivial.” Gene V Glass, Arizona State University
Causal inference is a central aim of most medical and epidemiological investigation. We would like to know ‘does this treatment work?’ or ‘is that exposure harmful?’, and if so ‘to what extent?’.
The gold standard approach to answering such questions is to conduct a controlled experiment in which treatments/exposures are allocated at random, all subjects are perfectly compliant, and all the relevant data are collected and measured without error. Provided that we can then discount ‘chance’ alone as an explanation, any observed effects can be interpreted as causal.
In the real world, however, such experiments rarely attain this ideal status, and for most important questions, such an experiment would not even be ethically, practically, or economically feasible; in these situations, causal inference must be based instead on observational data. Despite this, the role of statistics is often seen as quantifying the extent to which ‘chance’ could explain the results, with concerns over systematic biases due to the non-ideal nature of the data relegated to the qualitative discussion of the results.
Over the last thirty years, however, a formal statistical language has been developed in which causal effects can be unambiguously defined, and the assumptions needed for their estimation clearly stated. This clarity has led to increased awareness of causal pitfalls (such as the ‘birthweight paradox’ – see Hernández-Díaz et al, 2006) and the building of a new and extensive toolbox of statistical methods especially designed for making causal inferences from non-ideal data under transparent, less restrictive and more plausible assumptions than were hitherto required. Of course this does not mean that all causal questions can be answered, but at least they can be formally addressed in a quantitative fashion.
Considerations of causality are not new. Neyman used potential outcomes (cornerstones of this ‘new’ causal language – see Rubin, 1978) in his PhD thesis in the 1920s, and who could forget Bradford Hill’s much-cited guidelines published in 1965? The last few decades, however, have seen the focus move towards developing solutions, as well as acknowledging limitations. Indeed, not all reliable causal inference requires novel methodology. As Philip Dawid once said, “a causal model is just an ambitious associational model”. A carefully considered regression model, with an appropriate set of potential confounders (possibly identified using a causal diagram – see below) measured and included as covariates, is the most appropriate causal model in many simple settings.
But how do we decide whether such an approach is suitable? A ubiquitous feature of methods for estimating causal effects from non-ideal data is the need for untestable assumptions regarding the causal structure of the variables being analysed (such as ‘there are no common causes of A and B’, or ‘Z is an instrumental variable’ – see below). Such assumptions are often represented in a causal diagram or graph, with variables represented by nodes and the relationships between them by edges. The simplest and most commonly used class of causal diagram is the directed acyclic graph (DAG), in which all edges are arrows, and there are no cycles, i.e. no variable explains itself (Greenland et al, 1999). These are used not only to represent assumptions but also to inform the choice of a causally-interpretable analysis.
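As a minimal sketch of how such a diagram can be encoded and checked, the Python fragment below stores a small graph as a dictionary of arrows and verifies that it contains no cycles. The three-variable graph (L a common cause of exposure A and outcome Y) and the function names are illustrative assumptions, not taken from any particular study or software package.

```python
from collections import deque

# Hypothetical causal diagram: L is a common cause of exposure A and outcome Y.
# Each key maps a variable to the variables it points to (its children).
# Every variable must appear as a key, even if it has no children.
edges = {
    "L": ["A", "Y"],
    "A": ["Y"],
    "Y": [],
}


def is_acyclic(graph):
    """Return True if the directed graph has no cycles (Kahn's algorithm)."""
    indegree = {node: 0 for node in graph}
    for children in graph.values():
        for child in children:
            indegree[child] += 1
    # Start from the nodes that have no incoming arrows.
    queue = deque(node for node, degree in indegree.items() if degree == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in graph[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Nodes that lie on a cycle are never reached.
    return visited == len(graph)


def parents(graph, node):
    """Return the set of direct causes (parents) of a node."""
    return {p for p, children in graph.items() if node in children}


def shared_parents(graph, a, b):
    """Return the variables with arrows into both a and b."""
    return parents(graph, a) & parents(graph, b)
```

Here `shared_parents(edges, "A", "Y")` picks out L, the variable that an assumption such as ‘there are no common causes of A and Y’ would rule out. It looks only at direct parents, so it is a starting point rather than a full search for confounding paths.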
Another common feature of causal inference methods is that, as we move further from the ideal experimental setting, more aspects of the joint distribution of the variables, which would have been ancillary had the data arisen from a perfect experiment, must be modelled. Structural equation modelling (SEM) (Kline, 2011) is a fully-parametric approach, in which the relationship between each node in the graph and its parents is specified parametrically. This approach offers a very elegant treatment of measurement error in any variable for which validation or replication data are available: the true variable is included in the graph as a latent (unobserved) variable, and the joint distribution of manifest and latent variables is estimated within a single likelihood framework. Missing values can be handled within the same framework by including missing value indicators, for which specific mechanisms are specified.
Concerns over the potential impact of model misspecification in the SEM approach have led to the development of alternative semiparametric approaches to causal inference, in which the number of additional aspects to be modelled is reduced. These include methods based on inverse probability weighting, g-estimation, and the so-called doubly-robust estimation proposed by Robins, Rotnitzky and others.
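To give a flavour of inverse probability weighting, the sketch below simulates a single binary confounder L, a binary exposure A and a continuous outcome Y, and compares a naive difference in means with a weighted one. The data-generating values (a true effect of 1.0, exposure probabilities of 0.8 and 0.2) and all function names are assumptions chosen for illustration. The weights use the observed exposure proportions within levels of L, which is only adequate because L is binary and assumed to be the sole confounder.

```python
import random


def simulate(n, seed=0):
    """Simulate (L, A, Y) rows in which L confounds the effect of A on Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        l = 1 if rng.random() < 0.5 else 0
        p_exposed = 0.8 if l == 1 else 0.2       # exposure depends on L
        a = 1 if rng.random() < p_exposed else 0
        y = 1.0 * a + 2.0 * l + rng.gauss(0.0, 1.0)  # assumed true effect of A: 1.0
        rows.append((l, a, y))
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcome, exposed minus unexposed."""
    exposed = [y for _, a, y in rows if a == 1]
    unexposed = [y for _, a, y in rows if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


def ipw_difference(rows):
    """Inverse probability weighted difference in means (normalised weights)."""
    # Estimate P(A = 1 | L = l) by the observed proportion in each level of L.
    # Assumes both levels of L contain exposed and unexposed subjects.
    propensity = {}
    for level in (0, 1):
        exposures = [a for l, a, _ in rows if l == level]
        propensity[level] = sum(exposures) / len(exposures)
    num1 = den1 = num0 = den0 = 0.0
    for l, a, y in rows:
        if a == 1:
            weight = 1.0 / propensity[l]
            num1 += weight * y
            den1 += weight
        else:
            weight = 1.0 / (1.0 - propensity[l])
            num0 += weight * y
            den0 += weight
    return num1 / den1 - num0 / den0


rows = simulate(20000, seed=1)
print("naive difference:", round(naive_difference(rows), 2))
print("IPW difference:  ", round(ipw_difference(rows), 2))
```

Under these assumed values the naive contrast is biased upwards (its expectation is 2.2), whereas the weighted contrast should lie close to 1.0. In practice the exposure probabilities would usually come from a fitted model, and misspecifying that model is the concern that motivates doubly-robust estimation.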
These newer causal inference methods are particularly relevant for studying the causal effect of a time-varying exposure on an outcome, because standard methods fail to give causally-interpretable estimators when there exist time-varying confounders of the exposure and outcome that are themselves affected by previous levels of the exposure. Methods developed to deal with this problem include the fully-parametric g-computation formula (Robins, 1986), and two semiparametric approaches: g-estimation of structural nested models (Robins et al, 1992), and inverse probability weighted estimation of marginal structural models (Robins et al, 2000). Related to this longitudinal setting is the identification of optimal treatment regimes, for example in HIV/AIDS research where questions such as ‘at what level of CD4 should HAART (highly active antiretroviral therapy) be initiated?’ are often asked. These can be addressed using the methods listed above, and other related methods (see Moodie et al, 2007, for a review).
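As an illustration of the g-computation formula in the simplest longitudinal case, consider an exposure measured at two times, A_0 and A_1, a discrete time-varying confounder L_1 measured between them (and possibly affected by A_0), and a final outcome Y. The notation, and the simplification of having no baseline confounders, are assumptions made here for brevity. Under sequential exchangeability, consistency and positivity, the mean outcome had everyone received exposure history (a_0, a_1) can be written as:

```latex
\[
E\left[Y^{a_0,a_1}\right]
  = \sum_{l_1} E\left[Y \mid A_0 = a_0,\; L_1 = l_1,\; A_1 = a_1\right]
    \, P\left(L_1 = l_1 \mid A_0 = a_0\right)
\]
```

The outcome regression is averaged over the distribution of L_1 given the earlier exposure, not simply conditioned on L_1. This is why a single regression adjusting for L_1 does not, in general, recover the same quantity.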
It is important to appreciate that non-ideal experimental data (e.g. suffering from non-compliance, missing data or measurement error) are not on a par with data arising from observational studies, although the discussion above might suggest otherwise. Randomisation can be used as a tool to aid causal inference even when the randomised experiment is ‘broken’, for example as a result of non-compliance with randomised treatment. Such methods make use of randomisation as an instrumental variable (Angrist and Pischke, 2009). Instrumental variables have even been used with observational data, in particular when the instrument is a variable that holds genetic information, with genotype used in place of randomisation (in which case the approach is known as Mendelian randomisation; see Davey-Smith and Ebrahim, 2003). This is motivated by the idea that genes are ‘randomly’ passed down from parents to offspring in the same way that treatment is allocated in double-blind randomised trials. Although this assumption is generally untestable (Hernán and Robins, 2006), there are situations in which it may be deemed more plausible than the other candidate set of untestable assumptions, namely that of ‘no unmeasured confounding’.
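The use of randomisation as an instrument can be sketched with a simulated trial in which only some of those allocated to treatment actually take it, and uptake depends on an unmeasured variable U that also affects the outcome. Everything in the fragment below (the variable names, the true effect of 2.0, the uptake mechanism) is a hypothetical set-up. The ratio estimator shown recovers the effect here only because the simulated effect is the same for everyone and allocation influences the outcome solely through treatment received.

```python
import random


def simulate_trial(n, seed=0):
    """Simulate (Z, A, Y) rows from a randomised trial with non-compliance."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0   # randomised allocation (the instrument)
        u = rng.gauss(0.0, 1.0)              # unmeasured cause of uptake and outcome
        # Only those allocated to treatment can receive it; uptake depends on U.
        a = 1 if (z == 1 and u + rng.gauss(0.0, 1.0) > -0.5) else 0
        y = 2.0 * a + 1.5 * u + rng.gauss(0.0, 1.0)  # assumed true effect of A: 2.0
        rows.append((z, a, y))
    return rows


def mean(values):
    return sum(values) / len(values)


def as_treated(rows):
    """Compare outcomes by treatment received, ignoring randomisation."""
    treated = [y for _, a, y in rows if a == 1]
    untreated = [y for _, a, y in rows if a == 0]
    return mean(treated) - mean(untreated)


def wald_iv(rows):
    """Ratio (Wald) estimator: effect of Z on Y divided by effect of Z on A."""
    effect_on_y = (mean([y for z, _, y in rows if z == 1])
                   - mean([y for z, _, y in rows if z == 0]))
    effect_on_a = (mean([a for z, a, _ in rows if z == 1])
                   - mean([a for z, a, _ in rows if z == 0]))
    return effect_on_y / effect_on_a


rows = simulate_trial(50000, seed=2)
print("as-treated:", round(as_treated(rows), 2))
print("Wald IV:   ", round(wald_iv(rows), 2))
```

In this set-up the as-treated comparison is confounded by U and overstates the effect, while the ratio estimator should lie close to 2.0. With effects that vary between subjects, the same ratio would need a more careful interpretation.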
Approaches (such as SEM) amenable to complex causal structures have opened the way to looking beyond the causal effect of an exposure on an outcome as a black box, and to asking ‘how does this exposure act?’. For example, if income has a positive effect on certain health outcomes, does this act simply by increasing access to health care, or are there other important pathways? Addressing such questions is the goal of mediation analysis and the estimation of direct/indirect effects (see Ten Have and Joffe, in press, for a review). This area has seen an explosion of new methodology in recent years, with several semiparametric alternatives to SEM introduced.
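One common way of formalising direct and indirect effects, given here as an illustrative assumption and not as the only definition in use, is through nested potential outcomes. Write M^{a} for the value the mediator would take under exposure level a, and Y^{a,m} for the outcome under exposure a and mediator value m. For a binary exposure, the total effect then splits into a so-called natural direct effect and a natural indirect effect:

```latex
\[
\underbrace{E\left[Y^{1,M^{1}}\right] - E\left[Y^{0,M^{0}}\right]}_{\text{total effect}}
  = \underbrace{E\left[Y^{1,M^{0}}\right] - E\left[Y^{0,M^{0}}\right]}_{\text{natural direct effect}}
  + \underbrace{E\left[Y^{1,M^{1}}\right] - E\left[Y^{1,M^{0}}\right]}_{\text{natural indirect effect}}
\]
```

In the income example, M might be access to health care: the indirect effect captures the part of the effect of income that operates by changing access, and the direct effect captures all other pathways. Identifying these quantities requires further untestable assumptions about confounding of the mediator and the outcome.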
In conclusion, causal inference is an important, exciting and fast-moving area of methodological research. The discussion above gives an overview of some of the topics that exist beneath its ever-growing umbrella, but of course, there are many more.
Causal Inference at LSHTM
Most statisticians and epidemiologists at the School are engaged in causal inference. In the interest of space, we include here only the names of those with a particular interest in methodological issues relating to the topics discussed above.
Jonathan Bartlett; James Carpenter; Simon Cousens; Rhian Daniel; Bianca De Stavola; Frank Dudbridge; Chris Frost; Richard Grieve; Mike Kenward; Noemi Kreif; Neil Pearce; Costanza Pizzi; George Ploubidis; Rosalba Radice; Zia Sadique; Anders Skrondal (honorary); Stijn Vansteelandt (honorary); Michael Wallace; Symon Wandiembe
Two short courses relating to causal inference are run each year at LSHTM. One is entitled Causal Inference in Epidemiology: recent methodological developments and runs for one week each November; the other is a three-day course in February, entitled Factor Analysis and Structural Equation Modelling: an introduction using Stata and MPlus.
The causal inference discussion group at LSHTM meets once or twice a month, with sessions usually taking the form of a seminar followed by extended discussion. Past speakers include Philip Dawid, Vanessa Didelez, Richard Emsley, Miguel Hernán, Erica Moodie and Anders Skrondal.
Details of upcoming meetings can also be found here.
Suggested introductory reading
Angrist JD, Pischke JS (2009) Mostly harmless econometrics: an empiricist’s companion. Princeton University Press.
Greenland S, Pearl J, Robins JM (1999) Causal diagrams for epidemiologic research. Epidemiology. 10(1):37–48.
Greenland S, Brumback B (2002) An overview of relations among causal modelling methods. International Journal of Epidemiology. 31:1030–1037.
Hernán MA, Hernández-Díaz S, Werler MM, Mitchell AA (2002) Causal knowledge as a prerequisite for confounding evaluation: an application to birth defects epidemiology. American Journal of Epidemiology. 155:176–184.