Survival Analysis

Theme Co-ordinators: Bernard Rachet, Bianca De Stavola, Aurelien Belot

Please see here for slides and audio recordings of previous seminars relating to this theme.


Survival analysis is at the core of any study of time to a particular event, such as death, infection, or diagnosis of a particular cancer. It is therefore fundamental to most epidemiological cohort studies, as well as many randomised controlled trials (RCTs).

An important issue in survival analysis is the choice of time scale: this could be, for example, time since entry into the study (or since first treatment in an RCT), time since a particular event (e.g. the Japanese tsunami), or time since birth (i.e. age). The last of these is particularly relevant for epidemiological studies of chronic diseases, where age often exerts a substantial confounding effect (see [1], Chapter 6, for a discussion of alternative time scales).
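As a rough illustration of what a change of time scale means in practice, the sketch below expresses one participant's follow-up on two scales: time since study entry and attained age. The dates, the function names and the 365.25-day year are assumptions made for the example only. With age as the time scale, the participant joins the risk set at their age at entry (delayed entry) rather than at time zero.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # simplifying assumption for converting days to years


def years_between(start, end):
    """Elapsed time between two dates, in years."""
    return (end - start).days / DAYS_PER_YEAR


def follow_up_on_scales(birth, entry, exit_):
    """Return (start, stop) of follow-up on two alternative time scales."""
    return {
        # Time since entry: everyone starts at 0.
        "time_on_study": (0.0, years_between(entry, exit_)),
        # Age: follow-up starts at age at entry (delayed entry).
        "age": (years_between(birth, entry), years_between(birth, exit_)),
    }


# Hypothetical participant
scales = follow_up_on_scales(
    birth=date(1950, 1, 1), entry=date(2000, 1, 1), exit_=date(2010, 1, 1)
)
for name, (start, stop) in scales.items():
    print(f"{name}: start={start:.2f}, stop={stop:.2f}")
```

The length of follow-up is the same on both scales; what changes is which participants are compared with one another at each event time.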

Usually not all participants are followed up until they experience the event of interest, leading to their times being ‘censored’. In this case, the available information consists only of a lower bound for their actual event time. It is typically assumed that the process giving rise to censoring is independent of the process determining time to the event of interest.

In contrast to most regression approaches (which typically involve modelling means of distributions given explanatory variables), many survival analysis models are defined in terms of the hazard (or rate) of the event of interest. Within this framework, the hazard is expressed as a function of explanatory variables and an underlying ‘baseline’ hazard. Fully parametric models assume a particular form for the baseline hazard, the simplest being that it is constant over time (Poisson regression). Cox’s proportional hazards model, perhaps the most popular model for survival data, makes no parametric assumptions about the baseline hazard. Both the Poisson and Cox regression models assume the hazards to be proportional for individuals with different values of the explanatory variables. This assumption can be relaxed, for example through use of Aalen’s additive hazards model.
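To make the constant-hazard case concrete, here is a minimal sketch, using made-up follow-up data for two groups, of how censored times enter the estimate. Under a constant hazard with independent censoring, the maximum-likelihood estimate of the rate is the number of observed events divided by the total follow-up time, and with a single binary explanatory variable the ratio of the two group rates is the hazard ratio that a Poisson regression would estimate. All names and numbers below are hypothetical.

```python
import math


def constant_hazard_mle(times, events):
    """Maximum-likelihood estimate of a constant hazard.

    times  : follow-up time per participant (event time or censoring time)
    events : 1 if the event was observed, 0 if the time is censored
    Censored participants contribute person-time but no event.
    """
    return sum(events) / sum(times)


# Hypothetical follow-up data (years) for two groups
unexposed_times, unexposed_events = [4.0, 6.0, 7.0, 9.0, 10.0], [1, 0, 0, 1, 0]
exposed_times, exposed_events = [2.0, 3.0, 3.0, 5.0, 8.0], [1, 0, 1, 1, 0]

rate_unexposed = constant_hazard_mle(unexposed_times, unexposed_events)  # 2 events / 36 years
rate_exposed = constant_hazard_mle(exposed_times, exposed_events)        # 3 events / 21 years

# Under proportional hazards with constant baseline, the rate ratio is the hazard ratio.
hazard_ratio = rate_exposed / rate_unexposed
log_hazard_ratio = math.log(hazard_ratio)

print(f"rate (unexposed) = {rate_unexposed:.4f} per year")
print(f"rate (exposed)   = {rate_exposed:.4f} per year")
print(f"hazard ratio     = {hazard_ratio:.3f}")
```

Dropping the censored participants, or treating their censoring times as event times, would bias the rates; keeping their person-time in the denominator is what uses the ‘lower bound’ information correctly.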

Generalizations to deal with repeated episodes of an event of interest, such as infection, are possible through the introduction of random effects that capture the correlation among events occurring in the same individual. Within the survival analysis literature these are referred to as frailty models. See also: Design and analysis for dependent data.
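The idea of a shared random effect can be sketched with a small simulation. The example below assumes, purely for illustration, a gamma-distributed frailty with mean 1 that multiplies a constant baseline hazard, and two episodes per individual; the function names, parameter values and the choice of a gamma distribution are assumptions, not a description of any particular analysis. Because both episodes share the same frailty, their (log) times are positively correlated.

```python
import math
import random


def simulate_gap_times(n_individuals, baseline_rate, frailty_variance, seed=0):
    """Simulate two event times per individual under a shared gamma frailty."""
    rng = random.Random(seed)
    shape = 1.0 / frailty_variance  # gamma with mean 1 and the given variance
    scale = frailty_variance
    pairs = []
    for _ in range(n_individuals):
        z = rng.gammavariate(shape, scale)   # individual-specific frailty
        rate = z * baseline_rate             # frailty multiplies the hazard
        pairs.append((rng.expovariate(rate), rng.expovariate(rate)))
    return pairs


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


pairs = simulate_gap_times(n_individuals=5000, baseline_rate=0.2, frailty_variance=0.5)
log_first = [math.log(t1) for t1, _ in pairs]
log_second = [math.log(t2) for _, t2 in pairs]
corr = pearson(log_first, log_second)
print(f"correlation of log event times within individuals: {corr:.2f}")
```

Given the frailty, the two times are independent; the correlation seen here arises only because the frailty is unobserved and shared.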

An alternative approach to modelling survival data, more in keeping with most regression techniques, involves modelling the (logarithmically transformed) survival times directly. These are expressed as a linear function of explanatory variables and an error term, with a choice of distributions for the error term leading to the family of accelerated failure time models. When the survival times are assumed to be exponentially distributed (equivalently, when the errors follow a standard extreme-value distribution), the accelerated failure time model is equivalent to a Poisson regression model.
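The equivalence in the exponential case can be checked numerically. In the sketch below, the accelerated failure time form rescales time by exp(-beta_aft * x), while the proportional hazards form multiplies a constant hazard by exp(beta_ph * x); the two survival functions coincide when beta_ph = -beta_aft. The parameter values and function names are illustrative assumptions.

```python
import math


def survival_aft(t, x, baseline_rate, beta_aft):
    """AFT form: covariates rescale time, S(t | x) = S0(t * exp(-beta_aft * x))."""
    scaled_t = t * math.exp(-beta_aft * x)
    return math.exp(-baseline_rate * scaled_t)  # S0 is exponential


def survival_ph(t, x, baseline_rate, beta_ph):
    """PH form: covariates multiply the constant baseline hazard."""
    hazard = baseline_rate * math.exp(beta_ph * x)
    return math.exp(-hazard * t)


baseline_rate = 0.1   # hypothetical constant baseline hazard
beta_aft = 0.7        # hypothetical AFT coefficient (log time ratio)
beta_ph = -beta_aft   # corresponding log hazard ratio

max_diff = max(
    abs(survival_aft(t, x, baseline_rate, beta_aft) - survival_ph(t, x, baseline_rate, beta_ph))
    for t in (0.5, 1.0, 5.0, 20.0)
    for x in (0, 1, 2)
)
print(f"largest difference between the two forms: {max_diff:.2e}")
```

A positive AFT coefficient lengthens survival times, which corresponds to a hazard ratio below one; the sign flip is the only difference between the two parameterisations in this special case.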

Most of our applications of survival analysis models involve various flavours of the models mentioned above. However, specific issues arise in certain contexts and are of interest to our group. These are discussed below.