Measurement error and misclassification
The measurement of variables of interest is central to epidemiological study. Often, the measurements we obtain are noisy, error-prone versions of the underlying quantity of primary interest. Such errors can arise from technical error induced by imperfect measurement instruments and from short-term fluctuations in the quantity over time. An example is a single measurement of blood pressure, considered as a measure of an individual’s underlying average blood pressure. Variables obtained by asking individuals to answer questions about their behaviour or characteristics are also often subject to error, either because the individual cannot accurately recall the behaviour in question or because of a tendency, for whatever reason, to over-estimate or under-estimate the quantity being asked about.
The consequences of measurement error in a variable depend on the variable’s role in the substantive model of interest (Carroll et al., 2006). For example, independent error in the continuous outcome variable in a linear regression does not bias the estimated regression coefficients, although it does reduce their precision. In contrast, measurement error in the explanatory variables of regression models does, in general, cause bias. Measurement error in an exposure of interest may distort estimates of the exposure’s effect on the outcome of interest, while error in confounders will lead to imperfect adjustment for confounding, and so to biased estimates of the effect of an exposure.
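The contrast between these roles can be illustrated with a small simulation. The following is a minimal sketch, not a definitive analysis: the sample size, true coefficients, error variances and all variable names are assumptions chosen for illustration, and all errors are taken to be normally distributed and independent of the true values.

```python
import numpy as np


def ols_coefs(y, *covariates):
    """Least-squares coefficients (intercept first) of y on the covariates."""
    design = np.column_stack([np.ones(len(y)), *covariates])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 200_000                     # assumed (large) sample size

# Scenario A: independent error in the outcome versus in the exposure.
beta = 2.0                                     # assumed true slope
x = rng.normal(0.0, 1.0, n)                    # true exposure, variance 1
y = 1.0 + beta * x + rng.normal(0.0, 1.0, n)   # true outcome
y_star = y + rng.normal(0.0, 1.0, n)           # outcome measured with error
w = x + rng.normal(0.0, 1.0, n)                # exposure measured with error (variance 1)

slope_true = ols_coefs(y, x)[1]
slope_outcome_error = ols_coefs(y_star, x)[1]  # still close to beta
slope_exposure_error = ols_coefs(y, w)[1]      # attenuated towards zero

# Scenario B: error in a confounder; the exposure has no effect on the outcome.
c = rng.normal(0.0, 1.0, n)                    # true confounder
exposure = c + rng.normal(0.0, 1.0, n)         # exposure depends on the confounder
outcome = c + rng.normal(0.0, 1.0, n)          # outcome depends on the confounder only
c_star = c + rng.normal(0.0, 1.0, n)           # confounder measured with error

effect_unadjusted = ols_coefs(outcome, exposure)[1]
effect_adjusted_true = ols_coefs(outcome, exposure, c)[1]
effect_adjusted_noisy = ols_coefs(outcome, exposure, c_star)[1]

print("A:", slope_true, slope_outcome_error, slope_exposure_error)
print("B:", effect_unadjusted, effect_adjusted_true, effect_adjusted_noisy)
```

Under these assumed variances, the large-sample slope with exposure error is the true slope multiplied by the reliability ratio 1/(1 + 1) = 0.5, whereas outcome error leaves the slope unchanged. In scenario B, adjusting for the error-prone confounder moves the exposure coefficient from about 0.5 to about 1/3 rather than to the true value of 0, which is the residual confounding described above.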
When explanatory variables in regression models are categorical, the analogue of measurement error is misclassification. Unlike measurement errors, which can often plausibly be assumed to be independent of the underlying true levels, a misclassification error is never independent of the underlying value of the predictor variable, and so different theory covers the effects of misclassification and of measurement error (White et al., 2001).
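To see why misclassification error cannot be independent of the true value, consider a binary variable: a true 0 can only be misclassified upwards and a true 1 only downwards. The sketch below simulates this; the prevalence, sensitivity and specificity are assumed values chosen for illustration, as are all variable names.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible
n = 200_000

# Assumed illustrative values, not taken from any study.
prevalence = 0.3    # P(true value = 1)
sensitivity = 0.8   # P(observed = 1 | true = 1)
specificity = 0.9   # P(observed = 0 | true = 0)

true_value = rng.random(n) < prevalence
u = rng.random(n)
# True 1s are observed as 1 with probability `sensitivity`;
# true 0s are observed as 1 with probability 1 - `specificity`.
observed = np.where(true_value, u < sensitivity, u >= specificity)

error = observed.astype(int) - true_value.astype(int)

mean_error_if_true_1 = error[true_value].mean()    # near sensitivity - 1, never positive
mean_error_if_true_0 = error[~true_value].mean()   # near 1 - specificity, never negative
corr_error_truth = np.corrcoef(error, true_value.astype(int))[0, 1]

print(mean_error_if_true_1, mean_error_if_true_0, corr_error_truth)
```

The error has a negative mean when the true value is 1 and a positive mean when it is 0, so it is negatively correlated with the truth whenever classification is imperfect. This is the structural difference from the classical measurement error model, in which the error is assumed independent of the true value.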
Over the past thirty years a vast array of methods has been developed to accommodate measurement error and misclassification in statistical analysis models. While simple methods, including method of moments correction and regression calibration, have sometimes been applied in epidemiological research, more sophisticated approaches, such as maximum likelihood (Bartlett et al., 2009) and semi-parametric methods (Carroll et al., 2006), have received less attention. This is likely due in part to a relative scarcity of implementations in statistical software packages.
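As an illustration of the simplest of these approaches, the sketch below applies a method of moments correction to a single error-prone covariate in a linear regression, using two replicate measurements per individual to estimate the error variance. It assumes classical, independent, normally distributed error with equal variance in both replicates; the data are simulated and all names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the sketch is reproducible
n = 100_000
beta = 0.5                      # assumed true slope

x = rng.normal(0.0, 1.0, n)             # true covariate (unobserved in practice)
w1 = x + rng.normal(0.0, 0.8, n)        # first replicate measurement
w2 = x + rng.normal(0.0, 0.8, n)        # second replicate measurement
y = beta * x + rng.normal(0.0, 1.0, n)  # outcome

# Naive slope: regress the outcome on a single error-prone measurement.
var_w1 = np.var(w1, ddof=1)
naive_slope = np.cov(w1, y)[0, 1] / var_w1

# The difference of two replicates has variance 2 * (error variance).
error_var = np.var(w1 - w2, ddof=1) / 2.0

# Reliability ratio: share of the observed variance due to the true covariate.
reliability = (var_w1 - error_var) / var_w1

# Method of moments correction: divide the naive slope by the reliability ratio.
corrected_slope = naive_slope / reliability

print(naive_slope, reliability, corrected_slope)
```

In this single-covariate linear setting, regression calibration (replacing the error-prone measurement with an estimate of the expected true value given the measurement) leads to the same corrected slope. With several covariates, non-linear models or error that changes over time, the corrections differ and the methods cited above become relevant.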
Areas for future research efforts
Greater recognition of the effects of measurement error and misclassification in the analysis of epidemiological and clinical studies.
Increasing the accessibility of methods to deal with measurement error, through dissemination of methods and the implementation of methods into statistical software.
Development of methods that allow for the effects of measurement errors in causal models that describe how risk factors, and therefore risks of disease, change over time.
References
Bartlett J. W., De Stavola B. L., Frost C. (2009). Linear mixed models for replication data to efficiently allow for covariate measurement error. Statistics in Medicine; 28: 3158-3178.
Carroll R. J., Ruppert D., Stefanski L. A., Crainiceanu C. M. (2006). Measurement error in nonlinear models. Chapman & Hall/CRC, Boca Raton, FL, US.
Frost C., Thompson S. G. (2000). Correcting for regression dilution bias: comparison of methods for a single predictor variable. Journal of the Royal Statistical Society A; 163: 173-189.
Frost C., White I. R. (2005). The effect of measurement error in risk factors that change over time in cohort studies: do simple methods overcorrect for ‘regression dilution’? International Journal of Epidemiology; 34: 1359-1368.
Gustafson P. (2003). Measurement Error and Misclassification in Statistics and Epidemiology: Impacts and Bayesian Adjustments. Chapman and Hall/CRC Press.
Knuiman M. W., Divitini M. L., Buzas J. S., Fitzgerald P. E. B. (1998). Adjustment for regression dilution in epidemiological regression analyses. Annals of Epidemiology; 8: 56-63.
White I., Frost C., Tokunaga S. (2001). Correcting for measurement error in binary and continuous variables using replicates. Statistics in Medicine; 20: 3441-3457.