# Measurement Error in Epidemiology

1.1. Basic concepts of measurement error

In epidemiology, the variables of interest are often measured with error. This is true not only for variables that are self-reported, such as lifestyle behavior, but also for variables derived from laboratory tests, such as serum cholesterol.

When one wants to link an outcome variable, $Y\text{,}$ with an exposure variable, $X\text{,}$ a statistical model is postulated relating the outcome to the true exposure and other covariates, $Z\text{.}$ This is called the outcome model.

When the exposure variable is measured with error, denoted by $X^*\text{,}$ the error is termed non-differential if it provides no extra information about the outcome over and above the information provided by the true exposure and other covariates that are included in the outcome model. Another, more statistical, way of expressing this is that $Y$ is conditionally independent of $X^*$ given $X$ and $Z\text{.}$

Alternatively, the error in measured exposure can be differential, meaning the degree or direction of error is related to the outcome. Differential errors are more difficult to deal with, but in prospective studies it may be reasonable to assume the error is non-differential. However, differential error may occur in case-control studies involving self-reported exposures in the guise of recall bias.

Errors may also occur in outcome measurement. For example, when comparing a reported outcome across different socio-economic status (SES) groups, it is important to know whether the type and level of misreporting that occurs is similar in each SES group. If the misreporting is the same in each group then the error is non-differential. If the error in measured outcome differs by SES group, then the error is differential and bias is introduced into the comparison. The effects of measurement error in an outcome variable are understudied relative to error in exposures, but there is a growing recognition of their potential impact.

This primer only considers non-differential error in measurement of continuous variables, primarily exposure variables. Categorical variables can also be measured with error, but such error is known as misclassification. A book by Gustafson (2003) provides an in-depth discussion of misclassification.

The type and magnitude of error in a measurement of a continuous variable, or the relationship between $X$ and $X^*\text{,}$ is described by the measurement error model. Three models typically occur in epidemiologic work (although there are a multitude of variations). They are the classical measurement error model, the linear measurement error model, and the Berkson measurement error model.

The classical measurement error model is simple, describes a measurement that has no systematic bias but whose values are subject to random error, and is defined by

$${{X}^{*}}=X+e$$

where $e$ is a random variable with mean zero and is independent of $X$ (Carroll et al, 2006: Chapter 1). Such errors are assumed frequently, although not universally, in laboratory and objective clinical measurements, for example, when measuring serum cholesterol (Law et al, 1994) or blood pressure (MacMahon et al, 1990).

The linear measurement error model is an extension of the classical model that is more suitable for some measurements, particularly self-reports, in which the true value of the variable of interest, $X\text{,}$ and its error prone measurement, $X^*\text{,}$ are related by

$$X^* = \alpha_0 + \alpha_X X + e$$

where $e$ is a random variable with mean zero and is independent of $X$ (Cochran, 1968). This model describes a situation where the observed measurement includes both random error and systematic bias, allowing the latter to depend on the true value, $X\text{.}$ Classical error is included as a special case of this more general model, occurring when $\alpha_0 = 0$ and $\alpha_X = 1\text{.}$ In this model, $\alpha_0$ quantifies location bias (bias independent of the value of $X$) and $\alpha_X$ quantifies scale bias (bias that depends proportionally on the value of $X$). Further extensions of the linear measurement error model allow $X^*$ to also depend on other variables. Other extensions allow the variance of $e$ to depend on $X\text{,}$ or allow the errors, $e\text{,}$ in repeat measurements of $X^*$ to be correlated (Carroll et al, 2006: Section 4.7).

The Berkson measurement error model is as simple as the classical model, but an “inverse” version of it. In some circumstances, it is appropriate to view the true value, $X\text{,}$ as arising from the measured value, $X^*\text{,}$ together with an error, $e\text{,}$ that is independent of $X^*\text{.}$ In that case the error model should be written:

$$X = X^* + e$$

where $e$ is a random variable with mean zero and is independent of $X^*\text{.}$ This happens, for example, when all the individuals in specific subgroups are assigned the average value of their subgroup, as often occurs with exposure measurements in occupational epidemiology (Armstrong, 1998). Another common example of Berkson error is when the measurement, $X^*\text{,}$ is a score derived from a prediction equation based on a regression model. The effects of Berkson error on the results of statistical analyses are in several respects quite different from those of classical error and linear measurement error (Carroll et al, 2006, Chapter 1).

Validation studies are conducted to estimate the parameters of the measurement error model. They usually require measurement of the true value of the variable, known as the reference. If the true value cannot be ascertained, then a measurement unbiased at the individual level may be used as the reference in its place, although this measurement must be repeated within individuals at a sufficiently distant time to assess the magnitude of random error in the unbiased measurement.

A measurement is unbiased at the individual level only if the expected value of the measured exposure over repeated measurements for a certain individual, $X^*_{ij}\text{,}$ is equal to the true exposure of that individual. We write this condition as $E\left(X^{*}_{ij}|i\right)=X_i\text{,}$ where $E$ denotes expectation. If a measurement is unbiased at the individual level, then it is always unbiased at the population level (but not vice versa).

Measurements are unbiased at the population level if the expected value of the measurement is equal to the population mean of the true exposure. In statistical notation, if $X^*_{ij}$ are the repeat measures of individual $i\text{,}$ and $X_i$ is the true exposure of individual $i\text{,}$ then $X^*$ is unbiased at the population level if $E(X^*_{ij})=E(X_i) \text{.}$

If $X^*$ satisfies the classical measurement error model, then it is unbiased at the individual level. If $X^*$ satisfies the linear measurement error model, but not the classical model, then it is biased at the individual level. If $X^*$ satisfies the Berkson measurement error model, then it is biased at the individual level, but unbiased at the population level.
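The three statements above can be checked with a small simulation. The following is a minimal sketch using only the Python standard library; the parameter values (`MU_X`, `SD_X`, `SD_E`, `ALPHA_0`, `ALPHA_X`) and the threshold `HIGH` are hypothetical and chosen only for illustration. It compares the mean error $X^* - X$ in the whole population (population-level bias) with the mean error among individuals whose true exposure is high (a rough proxy for individual-level bias).

```python
import random
from statistics import fmean

rng = random.Random(0)
N = 50_000

# Hypothetical parameter values, chosen only for illustration.
MU_X, SD_X, SD_E = 5.0, 1.0, 0.5
ALPHA_0, ALPHA_X = 1.0, 0.7
HIGH = 6.0  # threshold defining a subgroup with high true exposure


def mean_error(measured, true, min_true=None):
    """Mean of (measured - true), optionally among individuals with true > min_true."""
    errors = [m - t for m, t in zip(measured, true)
              if min_true is None or t > min_true]
    return fmean(errors)


# Classical: X* = X + e, with e independent of X.
x = [rng.gauss(MU_X, SD_X) for _ in range(N)]
x_star_classical = [xi + rng.gauss(0.0, SD_E) for xi in x]

# Linear: X* = alpha_0 + alpha_X * X + e, with e independent of X.
x_star_linear = [ALPHA_0 + ALPHA_X * xi + rng.gauss(0.0, SD_E) for xi in x]

# Berkson: X = X* + e, with e independent of X* (X* is generated first).
x_star_berkson = [rng.gauss(MU_X, SD_X) for _ in range(N)]
x_berkson = [si + rng.gauss(0.0, SD_E) for si in x_star_berkson]

results = {
    "classical": (mean_error(x_star_classical, x),
                  mean_error(x_star_classical, x, HIGH)),
    "linear": (mean_error(x_star_linear, x),
               mean_error(x_star_linear, x, HIGH)),
    "berkson": (mean_error(x_star_berkson, x_berkson),
                mean_error(x_star_berkson, x_berkson, HIGH)),
}

for model, (overall, among_high) in results.items():
    print(f"{model:9s} mean error overall {overall:+.3f}, "
          f"among high true exposure {among_high:+.3f}")
```

Under these assumptions, the classical measurement should show a mean error near zero both overall and in the high-exposure subgroup. The Berkson measurement should show a mean error near zero overall but a clearly negative one in the subgroup. The linear measurement should be biased in both.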

Studies that include a single unbiased measurement but omit repeated reference measurements can still provide useful information but cannot estimate all the parameters of the measurement error model. They are sometimes called calibration studies instead of validation studies.

A reproducibility study only collects repeat measurements of $X^*\text{.}$ Such a study can be a validation study only if $X^*$ has classical measurement error. The parameters of the model may be estimated from repeated applications of the error-prone measurement, $X^*\text{,}$ within individuals, and no measurements of the true value, $X\text{,}$ are then required. A reproducibility study cannot be used to estimate the systematic bias that is assumed with other models, such as the linear measurement error model, because the same systematic bias will be present in each repeated measurement.

A validation study may be nested within an epidemiologic study. For example, a subgroup of participants in a cohort study may be asked to provide not only the error-prone measurement of exposure but also a true value through additional data collection. In this case, the study is called an internal validation study.

Validation studies that are conducted on a group of individuals not participating in a main study are called external validation studies. External validation studies are less reliable than internal ones for determining the parameters of the measurement error model, since the estimation involves an assumption of transportability between the group of participants in the validation study and the group participating in the main study.

The issue of transportability of a measurement error model is a delicate matter (Carroll et al, 2006, Sections 2.2.4-2.2.5). Essentially, there are some parameters of a measurement error model that may be quite robust to different settings, while others may vary greatly with setting. For example, if a measured exposure, $X^*\text{,}$ has classical measurement error in one study, then this may very well be true in another study, and the variance of the random errors may be similar in the two studies. However, it is important to be aware that the variance of the true exposure, $X\text{,}$ may differ greatly between the two studies and the consequences of such a difference need to be carefully considered.

If there is a big difference between the variances of $X\text{,}$ then this will make the calibration equation that is derived from the validation study unsuitable for the study of interest. One can see this clearly in the simplest case of normally distributed $X^*$ having normally distributed classical measurement error. In this case the linear calibration equation of $X$ on $X^*$ derived from an external validation study will have slope $\text{var}(X)/(\text{var}(X) + \text{var}(e))$, where $\text{var}(X)$ is the variance of $X$ in the validation study population. However, if the main study population’s variance of $X$ were different, then this calibration equation slope obtained from the validation study will be unsuitable for applying adjustment to main study population inferences.
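As a small numerical illustration of this point, the sketch below evaluates the calibration slope $\text{var}(X)/(\text{var}(X) + \text{var}(e))$ for two populations that share the same error variance but differ in the variance of true exposure. The function name and the variance values are hypothetical.

```python
def calibration_slope(var_x, var_e):
    """Slope of the linear calibration equation of X on X* under classical error."""
    return var_x / (var_x + var_e)


VAR_E = 1.0  # same error variance assumed in both populations

slope_validation = calibration_slope(var_x=4.0, var_e=VAR_E)  # external validation study
slope_main = calibration_slope(var_x=1.0, var_e=VAR_E)        # main study population

print(f"validation study slope: {slope_validation:.2f}")
print(f"main study slope:       {slope_main:.2f}")
```

With these illustrative values the slope is 0.8 in the validation population but 0.5 in the main study population, so applying the externally derived equation would adjust too little.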

1.2. Basic concepts of usual exposure

Many exposures vary with time. For example, the air pollution one is exposed to varies throughout the day and from day to day. Biological entities, such as serum cholesterol levels, also vary throughout the day and from day to day. Exploring relationships between such exposures and an outcome is made more complex by this variation over time. For outcomes that are thought to be influenced by exposures over the long term, epidemiologists have studied the relationship of the outcome with usual exposure, defining this as the average long-term exposure.

Since exposure measurements on an individual are rarely collected over an extended period of time, the long-term average is almost always unknown, and the finite number of shorter-term measurements (often only one!) that are available must then be used to estimate the exposure. Therefore, even when the measurement of the instantaneous exposure is exact, the average of such measurements must still be regarded as an error-prone measurement of the usual exposure.

Sometimes, exact (or reasonably precise) instantaneous measurements that are made on an individual are assumed to vary randomly around the individual’s usual exposure, and when they are made sufficiently far apart in time, the deviations are assumed to be independent. In this case, the classical measurement error model is used to describe the relationship between measurement of instantaneous exposure and usual exposure.
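To illustrate what this assumption implies, the sketch below simulates individuals whose instantaneous measurements vary independently around their usual exposure. It then estimates the variance of the error made when the mean of $k$ measurements is used in place of the usual exposure. The standard deviations `SD_USUAL` and `SD_WITHIN` are hypothetical values. Under the classical model this error variance should be close to the within-person variance divided by $k$.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(2)
N = 20_000
SD_USUAL, SD_WITHIN = 1.0, 2.0  # hypothetical between- and within-person SDs


def error_variance_of_mean(k):
    """Simulated variance of (mean of k measurements - usual exposure)."""
    errors = []
    for _ in range(N):
        usual = rng.gauss(0.0, SD_USUAL)
        measurements = [usual + rng.gauss(0.0, SD_WITHIN) for _ in range(k)]
        errors.append(fmean(measurements) - usual)
    return pvariance(errors)


simulated = {k: error_variance_of_mean(k) for k in (1, 4)}

for k, value in simulated.items():
    print(f"k={k}: simulated {value:.2f}, theory {SD_WITHIN**2 / k:.2f}")
```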

In situations where serial exposure measurements are available on all participants at regular intervals during follow-up and interest is in the relationship between exposure and a later outcome, some investigators have advocated taking account of the time of exposure in this relationship.

In seminal work, MacMahon et al (1990) treated serial measurements of blood pressure and cholesterol as if they were repeat measurements of an underlying unobserved true average value, and assumed that the measurements conformed to a classical measurement error model. On this basis they applied a measurement error adjustment to estimates of relative risk for stroke and coronary heart disease. Later, Frost and White (2005) noted that this approach ignores the relationship over time between these measures and disease risk. Wang et al (2016) proposed a method similar to that of Frost and White. The most appropriate manner of dealing with serial error-prone measurements in a longitudinal setting has not yet been fully resolved, although Boshuizen et al (2007) present a method that has considerable promise.

1.3. Impact of measurement error on research studies

Measurement error can often have an impact on the results of research studies. The nature and magnitude of the impact will depend on the type of error (as defined by the measurement error model), the size of error (especially, but not always, the ratio of the error variance to the variance of the true exposure), and the quantity that is targeted for estimation. If the measurement error model and its parameters are known or can be estimated from validation studies, then these impacts can be quantified.

## Impact on studies evaluating the association of an exposure with an outcome when the exposure is measured with error

In etiologic studies where the focus is on an association such as risk difference, relative risk, odds ratio, or hazard ratio, and the exposure is measured with error, two problems may occur:

1. Bias in the target estimate. This bias is sometimes, but not always, towards the null value and in such a case is called attenuation or dilution.

2. Loss of statistical power for detecting an exposure effect. This means that because of the measurement error the researcher is in greater danger of failing to find an important relationship between exposure and outcome.

In simple situations when there is only a single exposure that is measured with classical or linear measurement error, the attenuation factor or regression dilution factor is the multiplicative factor by which the regression coefficient linking exposure to outcome is attenuated due to the measurement error in the exposure variable. If the measurement error model is Berkson, then there is no bias in the estimated risk parameter.

### In more detail: Attenuation factors

Suppose our analysis of the relationship between a continuous outcome, $Y\text{,}$ and an explanatory variable, $X\text{,}$ is based on a linear regression model

$$E(Y|X) = \beta_0 + \beta_X X.$$

However, because of measurement problems we use $X^*$ instead of $X$ and therefore explore the linear regression

$$E(Y|X^*) = \beta_{0^*} + \beta_{X^*} X^*.$$

When the measurement error model is classical, then $| \beta_{X^*} | \le | \beta_X |\text{,}$ with equality occurring only when $\beta_{X} = 0$. More precisely we can write

$$\beta_{X^*} = \frac{ \text{cov}(Y,X^*) }{ \text{var}(X^*)}=\frac{ \text{cov}(Y,X+e) }{ \text{var}(X+e)} =\frac{ \text{cov}(Y,X) }{ \text{var}(X)+ \text{var} (e)} = \frac{ \text{var}(X) }{ \text{var}(X+e)} \frac{ \text{cov}(Y,X) }{ \text{var}(X)} = \lambda \beta_X$$

where $\lambda = \frac{ \text{var}(X) }{ \text{var}(X)+ \text{var} (e)}$ lies between 0 and 1 and is called the attenuation (Carroll et al, 2006), the attenuation factor (Freedman et al, 2011a), or the regression dilution factor (MacMahon et al, 1990). The measurement error in $X^*$ attenuates the estimated coefficient, and any relationship with $Y$ appears less strong.
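The attenuation result can be illustrated by simulation. The sketch below generates data from a linear outcome model, adds classical error to the exposure, and compares the least-squares slope obtained with $X$ to the one obtained with $X^*\text{.}$ It is a minimal example under assumed values (`BETA_0`, `BETA_X`, `VAR_X`, `VAR_E` are hypothetical), and `ols_slope` is a helper written for this example.

```python
import random
from statistics import fmean


def ols_slope(x, y):
    """Least-squares slope of y on x: cov(x, y) / var(x)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
N = 50_000
BETA_0, BETA_X = 1.0, 2.0  # hypothetical outcome model coefficients
VAR_X, VAR_E = 1.0, 1.0    # hypothetical variances of X and of the error e

x = [rng.gauss(0.0, VAR_X ** 0.5) for _ in range(N)]
y = [BETA_0 + BETA_X * xi + rng.gauss(0.0, 1.0) for xi in x]
x_star = [xi + rng.gauss(0.0, VAR_E ** 0.5) for xi in x]  # classical error

lam = VAR_X / (VAR_X + VAR_E)       # attenuation factor
slope_true = ols_slope(x, y)        # uses the true exposure
slope_naive = ols_slope(x_star, y)  # uses the error-prone exposure

print(f"lambda = {lam:.2f}")
print(f"slope using X  = {slope_true:.3f}")
print(f"slope using X* = {slope_naive:.3f} (lambda * beta_X = {lam * BETA_X:.3f})")
```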

When the measurement error model is linear, and $X$ and $X^*$ are related by $X^*_i = \alpha_0 + a_{0i} + \alpha_X X_i + e_i$ (a variation of the model described in Section 1.1 where the intercept is a random effect at the individual level, with variance denoted by $\text{var}(a_{0})$), the relationship $\beta_{X^*} = \lambda \beta_X$ still holds, but $\lambda$ need no longer lie between 0 and 1, since

$$\lambda = \frac{ \alpha_X \text{var}(X) }{\text{var}(a_0) + \alpha^2_X \text{var}(X)+ \text{var} (e)}.$$

Nevertheless, in nearly all applications $\alpha_X$ is positive, so that negative values of $\lambda$ are virtually unknown. Also, in most applications, $\text{var}(a_0) + \text{var} (e)$ is sufficiently large to render $\lambda$ less than 1, even when $\alpha_X$ is less than 1. However, it is possible for $\lambda$ to be greater than 1, which occurs when $\alpha_X$ is positive but less than 1, and $\text{var}(a_0) + \text{var} (e)$ is less than $\alpha_X (1 -\alpha_X) \text{var} (X) \text{.}$
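The behavior of $\lambda$ under the linear measurement error model can be explored directly from the formula. The sketch below codes it as a small function (the function name and all numerical inputs are hypothetical) and evaluates three cases: the classical special case, a case with $\alpha_X < 1$ where $\lambda$ is still below 1, and a case where $\text{var}(a_0) + \text{var}(e)$ falls below $\alpha_X (1 -\alpha_X) \text{var}(X)$ so that $\lambda$ exceeds 1.

```python
def attenuation_factor(alpha_x, var_x, var_a0, var_e):
    """Attenuation factor under the linear error model with a random intercept."""
    return alpha_x * var_x / (var_a0 + alpha_x ** 2 * var_x + var_e)


# Classical special case: alpha_X = 1 and no random intercept.
lam_classical = attenuation_factor(alpha_x=1.0, var_x=1.0, var_a0=0.0, var_e=1.0)

# alpha_X < 1, but the error variances are large enough to keep lambda below 1.
lam_below_one = attenuation_factor(alpha_x=0.5, var_x=1.0, var_a0=0.2, var_e=0.3)

# alpha_X < 1 and var(a0) + var(e) = 0.10 < alpha_X * (1 - alpha_X) * var(X) = 0.25.
lam_above_one = attenuation_factor(alpha_x=0.5, var_x=1.0, var_a0=0.05, var_e=0.05)

print(f"classical:       {lam_classical:.3f}")
print(f"lambda below 1:  {lam_below_one:.3f}")
print(f"lambda above 1:  {lam_above_one:.3f}")
```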

When the measurement error model is Berkson, $\beta_{X^*} = \beta_X\text{,}$ and there is no attenuation.
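As a sketch of why this holds for the simple linear outcome model above, assume the error is non-differential, so that $\text{cov}(Y, X^*) = \beta_X \text{cov}(X, X^*)\text{,}$ and use the Berkson relation $X = X^* + e$ with $e$ independent of $X^*\text{:}$

$$\beta_{X^*} = \frac{ \text{cov}(Y,X^*) }{ \text{var}(X^*)} = \frac{ \beta_X \text{cov}(X^*+e,X^*) }{ \text{var}(X^*)} = \frac{ \beta_X \text{var}(X^*) }{ \text{var}(X^*)} = \beta_X.$$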

NOTE: If the outcome regression is a generalized linear model with

$$h(E(Y|X^*)) = \beta_{0^*} + \beta_{X^*} X^*$$

where $h$ is the link function, then the above results may not be exact, but still hold approximately. For logistic regression, the approximation is good as long as $\beta_X$ is not too large, and the proportion of events is low. Generally speaking, the results and methods that are exact for linear regression outcome models usually provide good approximations for generalized linear outcome models (Carroll et al, 2006, p.79).

Besides attenuating the estimated coefficient relating $X$ to $Y$, measurement error also makes the estimate less precise relative to its expected value, and therefore the statistical power to detect whether it is different from zero is lower. In these same simple situations, the extent of loss of statistical power is governed by the correlation between the measured exposure and the true exposure.

Approximately, the effective sample size is reduced by the factor $\rho^2_{X}\text{,}$ the square of the correlation coefficient between the measured exposure $X^*$ and the true exposure $X$ (Kaaks and Riboli, 1997). This is true whether the measurement error model is classical, linear or Berkson. For the classical model, $\rho^2_{X}$ is equal to $\text{var}(X)/( \text{var}(X) + \text{var}(e))$ and so happens to be equal to the attenuation factor, $\lambda\text{.}$ When measurement error is substantial $(\lambda < 0.5)$, its effects on the results of research studies can be profound, with key relationships being much more difficult to detect.
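A rough feel for the size of this effect can be obtained from a few lines of arithmetic. The sketch below assumes the classical model, so that $\rho^2_{X} = \text{var}(X)/( \text{var}(X) + \text{var}(e))\text{.}$ The function names and the numerical inputs are hypothetical, and the calculation is only the approximation described above, not a formal power analysis.

```python
def squared_correlation_classical(var_x, var_e):
    """Squared correlation between X* and X under the classical error model."""
    return var_x / (var_x + var_e)


def effective_sample_size(n, rho_squared):
    """Approximate sample size with an error-free exposure giving similar power."""
    return n * rho_squared


def required_sample_size(n_without_error, rho_squared):
    """Approximate sample size needed with X* to match n_without_error using X."""
    return n_without_error / rho_squared


rho_sq = squared_correlation_classical(var_x=1.0, var_e=1.0)
n_effective = effective_sample_size(1000, rho_sq)
n_required = required_sample_size(1000, rho_sq)

print(f"rho^2 = {rho_sq:.2f}")
print(f"1000 participants measured with error behave like about {n_effective:.0f}")
print(f"about {n_required:.0f} participants are needed to match 1000 error-free ones")
```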

When there is more than one exposure measured with error and one wishes to evaluate their simultaneous association with an outcome, then other parameters besides attenuation factors govern the magnitude of the bias. These other factors are called contamination factors, and they are related to the residual confounding that occurs because of the measurement error.

### In more detail: Contamination factors

Suppose we wish to relate an outcome, $Y$, to two exposures using the linear regression model

$$E(Y|X_1, X_2) = \beta_{0} + \beta_{X_1} X_1 + \beta_{X_2} X_2.$$

However, because of measurement problems we use $X^*_1$ instead of $X_1$ and $X^*_2$ instead of $X_2$ and therefore explore the linear regression

$$E(Y|X^*_1, X^*_2) = \beta_{0^*} + \beta_{X^*_1} X^*_1 + \beta_{X^*_2} X^*_2.$$

Results concerning the vectors of coefficients $\beta_{X} = (\beta_{X_1}, \beta_{X_2})^T$ and $\beta_{X^*} = (\beta_{X^*_1}, \beta_{X^*_2})^T$ are different from those in univariate models. When the measurement error model is classical or linear, their relationship may still be written in the form $\beta_{X^*} =\Lambda \beta_X$ but now $\Lambda = \text{cov}(X+e)^{-1} \text{cov}(X) \text{,}$ where $\text{cov}( \;)$ is a variance-covariance matrix and $X$ and $e$ are vectors $(X_1, X_2)^T$ and $(e_1, e_2)^T \text{,}$ the latter denoting the errors in $X^*_1$ and $X^*_2$ respectively. Writing out this relationship fully we obtain,

$$\begin{aligned} \beta_{X_1^*} &= \Lambda_{11} \beta_{X_1} + \Lambda_{12} \beta_{X_2} \\ \beta_{X_2^*} &= \Lambda_{21} \beta_{X_1} + \Lambda_{22} \beta_{X_2}. \end{aligned}$$

Thus the simple multiplicative relationship between $\beta_{X^*_1}$ and $\beta_{X_1}$ (or between $\beta_{X^*_2}$ and $\beta_{X_2}$) seen for univariate models no longer holds. The diagonal terms of the $\Lambda$ matrix, $\Lambda_{11}$ and $\Lambda_{22}$ are still likely to lie between 0 and 1, so that (for example) $\beta_{X^*_1}$ will contain an attenuated contribution from the true coefficient of $X_1\text{,}$ $(\Lambda_{11} \beta_{X_1})\text{,}$ but $\beta_{X^*_1}$ will also be affected by “residual confounding” from the mismeasured $X_2$ through the term $\Lambda_{12} \beta_{X_2}\text{.}$ Similar remarks apply to $\beta_{X^*_2}$, with residual confounding occurring due to the term $\Lambda_{21} \beta_{X_1}\text{.}$ Thus the estimated coefficients in this model may be larger or smaller than the true target value in a rather unpredictable manner. The off-diagonal terms of $\Lambda$, $\Lambda_{12}$ and $\Lambda_{21}\text{,}$ that govern the amount of residual confounding, have been called contamination factors (Freedman et al, 2011a).
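A worked example may make the role of the contamination factors more concrete. The sketch below computes $\Lambda = \text{cov}(X+e)^{-1} \text{cov}(X)$ for two correlated exposures with classical errors that are assumed independent of each other and of $X\text{.}$ It then applies $\Lambda$ to a true coefficient vector in which only $X_1$ affects the outcome. All covariance values and coefficients are hypothetical, and the 2x2 helpers are written for this example only.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Product of two matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


# Hypothetical variance-covariance matrices of true exposures and of errors.
cov_x = [[1.0, 0.5], [0.5, 1.0]]
cov_e = [[1.0, 0.0], [0.0, 0.25]]  # independent classical errors

# With e independent of X, cov(X + e) = cov(X) + cov(e).
cov_x_star = [[cov_x[i][j] + cov_e[i][j] for j in range(2)] for i in range(2)]

lam_matrix = matmul(inverse_2x2(cov_x_star), cov_x)

beta_x = [[1.0], [0.0]]  # only X_1 truly affects the outcome
beta_x_star = matmul(lam_matrix, beta_x)

for row in lam_matrix:
    print([round(v, 3) for v in row])
print("coefficients using X*:", [round(r[0], 3) for r in beta_x_star])
```

In this illustration the coefficient of $X^*_1$ is attenuated to about 0.44, while $X^*_2$ acquires a coefficient of about 0.22 even though $X_2$ has no true effect. This is the residual confounding governed by $\Lambda_{21}\text{.}$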

## Impact on studies evaluating the population distribution of an exposure when the exposure is measured with error

In surveillance or monitoring studies measurement error can have an impact on estimating the mean and percentiles of the distribution of the exposure.

When the measurement error model is classical, the estimated mean is unbiased, but the estimated percentiles are biased, with lower percentiles underestimated and upper percentiles overestimated.

When the measurement error model is linear, both estimated mean and estimated percentiles are biased, and the direction of the bias will depend on the parameters of the measurement error model.

Less commonly, when the measurement error model is Berkson, the estimated mean is unbiased, and estimated percentiles are biased, with lower percentiles overestimated and upper percentiles underestimated.
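The contrasting effects of classical and Berkson error on percentiles can be seen in a short simulation. The sketch below is illustrative only: the distributions are assumed normal, the error standard deviation `SD_E` is hypothetical, and `percentile` is a simple helper based on sorting. It compares the 10th and 90th percentiles of the measured and true exposures under each model.

```python
import random

rng = random.Random(3)
N = 50_000
SD_E = 1.0  # hypothetical error standard deviation


def percentile(values, q):
    """Simple empirical percentile (0 < q < 1) based on the sorted values."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


# Classical: X* = X + e, so X* is more spread out than X.
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
x_star = [xi + rng.gauss(0.0, SD_E) for xi in x]

# Berkson: X = X* + e, so X* is less spread out than X.
x_star_b = [rng.gauss(0.0, 1.0) for _ in range(N)]
x_b = [si + rng.gauss(0.0, SD_E) for si in x_star_b]

classical = {q: (percentile(x_star, q), percentile(x, q)) for q in (0.1, 0.9)}
berkson = {q: (percentile(x_star_b, q), percentile(x_b, q)) for q in (0.1, 0.9)}

for name, table in (("classical", classical), ("berkson", berkson)):
    for q, (measured, true) in table.items():
        print(f"{name:9s} {int(q * 100)}th percentile: "
              f"measured {measured:+.2f}, true {true:+.2f}")
```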

## Impact on studies where the outcome is measured with error

In some studies, interest is in an intervention or an experiment with the intent to modify an outcome of interest, and this outcome is measured with error.

Suppose that our analysis of interest is based on the linear regression model

$$E(Y|X) = \beta_0 + \beta_X X.$$

However, because of measurement problems we use $Y^*$ instead of $Y$ and therefore explore the linear regression

$$E(Y^*|X) = \beta_{0^*} + \beta_{X^*} X.$$

When the measurement error model for $Y^*$ is classical, $\beta_{0^*} = \beta_0$ and $\beta_{X^*} = \beta_X$, and the measurement error introduces no bias in the estimated coefficients. However, the precision with which $\beta_{X^*}$ is estimated using $Y^*$ is lower than that with which $\beta_X$ is estimated using $Y\text{.}$ A consequence is that the power to detect an association between $X$ and the outcome is lower when using $Y^*$ than when using $Y\text{.}$

When the measurement error model for $Y^*$ is linear, and $Y$ and $Y^*$ are related by

$$Y^* = \alpha_0 + \alpha_Y Y + e$$

it follows that $E(Y^*|X) = (\beta_0 \alpha_Y + \alpha_0) + \alpha_Y \beta_X X\text{.}$ Measurement error of this form therefore results in biased estimates of the association between $X$ and the outcome whenever $\alpha_Y \neq 1\text{.}$ In particular, $\beta_{X^*} = \alpha_Y \beta_X\text{.}$
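The classical and linear cases for an error-prone outcome can be compared in a simulation. The sketch below is a minimal example with hypothetical coefficients (`BETA_0`, `BETA_X`, `ALPHA_0`, `ALPHA_Y`) and unit error variances, with `ols` a helper written for this example. It fits the regression of $Y^*$ on an exactly measured $X$ under each error model.

```python
import random
from statistics import fmean


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(6)
N = 50_000
BETA_0, BETA_X = 1.0, 2.0    # hypothetical outcome model
ALPHA_0, ALPHA_Y = 1.0, 0.6  # hypothetical linear error model for Y*

x = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [BETA_0 + BETA_X * xi + rng.gauss(0.0, 1.0) for xi in x]

# Classical error in the outcome: Y* = Y + e.
y_star_classical = [yi + rng.gauss(0.0, 1.0) for yi in y]
# Linear error in the outcome: Y* = alpha_0 + alpha_Y * Y + e.
y_star_linear = [ALPHA_0 + ALPHA_Y * yi + rng.gauss(0.0, 1.0) for yi in y]

fit_classical = ols(x, y_star_classical)
fit_linear = ols(x, y_star_linear)

print("classical: intercept %.2f, slope %.2f" % fit_classical)
print("linear:    intercept %.2f, slope %.2f" % fit_linear)
```

Under these assumptions the classical fit should recover values close to $\beta_0 = 1$ and $\beta_X = 2\text{,}$ while the linear fit should give values close to $\beta_0 \alpha_Y + \alpha_0 = 1.6$ and $\alpha_Y \beta_X = 1.2\text{.}$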

When the measurement error model for $Y^*$ is Berkson, estimates of the association between $X$ and the outcome are biased. Recalling that Berkson error in a measured exposure results in no bias in the estimated regression coefficients, one sees that the effects of classical error and Berkson error in an outcome variable are the reverse of their effects in an exposure variable.

NOTE: In the examples above we are assuming that the exposure is measured correctly. Were there also measurement error in the exposure, this would cause bias of the type described above.

1.4. Methods of adjustment to reduce bias in estimates

There are many statistical methods available for addressing the bias in estimates that is caused by measurement error. However, to use these methods one needs information regarding the measurement error model. Typically, such knowledge comes from validation studies.

NOTE: While the methods mentioned here aim at the complete elimination of bias, it is important to understand that in practice they are based upon assumptions about the measurement error model that cannot always be fully verified. To the extent that the data deviate from these assumptions, these methods may fall short in their aim to remove all of the bias. It is therefore more realistic to think of them as reducing rather than eliminating the bias due to measurement error. In extreme circumstances, when the form of the measurement error model is badly misspecified (for example, when measurement error is Berkson, but is specified as classical), applying a measurement error method can actually make estimates more biased than applying the typical analysis unadjusted for measurement error. Checking the form of the measurement error model using data from validation studies is therefore important.

## Methods for studies evaluating the association of an exposure with an outcome when the exposure is measured with error

Methods of measurement error adjustment in etiologic studies include regression calibration, simulation extrapolation, use of instrumental variables, score function methods, likelihood methods, moment reconstruction, multiple imputation, and Bayesian methods. This primer focuses on regression calibration, the most commonly used method.

The main idea of regression calibration is as follows: Since the exposure is measured with error, its true value is not really known. Therefore, in the regression of outcome on exposure, one substitutes for this unknown exposure value its expectation conditional on its measured value and other predictors.

The formula for this conditional expectation is known as the calibration equation. Validation studies are usually conducted to determine the measurement error model, but the data they generate can also be used for determining the calibration equation. In some cases the equation is a linear one, such as when the true exposure, the measured exposure, and other predictors are normally distributed. Often, as an approximation, it is assumed that the calibration equation is linear, and the method then coincides nearly exactly with the method of linear regression calibration (Rosner et al, 1990).

NOTE: The value of the conditional expectation of the exposure for each individual is a predicted value and is suitable for use in regression calibration. However, it is not the same as the individual’s true exposure, and caution is required in using this predicted value for other purposes. For example, it is not correct to use these values to build a distribution of exposure values in the population. Nor is it correct to use the values to classify the individuals into different subgroups of exposure and then estimate relative risks for some outcome between those subgroups. Such procedures yield biased estimates.

Regression calibration yields consistent (asymptotically unbiased) estimates of regression coefficients when (a) the outcome-exposure relationship is a linear or log-linear regression and (b) the form of the calibration equation is correctly specified. For logistic regression and other generalized linear regression models, the estimates are nearly consistent when the effects are small or the measurement error is small (Carroll et al, 2006: Chapter 4). When the outcome is the time to an event and the outcome-exposure model is a proportional hazards model, then the calibration is best done separately on each risk set (Clayton, 1992; Xie et al, 2001).

Most versions of regression calibration do not recover statistical power that is lost due to the measurement error of the instrument. The exception is a version of regression calibration known as enhanced regression calibration in which extra predictors known as instrumental variables are included in the calibration equation that increase the precision with which one may predict the unknown exposure.

### In more detail: Forming the calibration equation

Suppose we wish to estimate the coefficient, $\beta_X$, in a model for relating exposure, $X$, to outcome variable, $Y$

$$E(Y|X, Z) = \beta_0 + \beta_X X + \beta^T_Z Z$$

where $Z$ is a vector of confounding variables. We cannot measure $X$ exactly, but obtain a measure, $X^*\text{,}$ that includes non-differential measurement error. The regression calibration method involves forming a new variable, $X_{RC},$ that equals (or estimates) the expectation of $X$ conditional on $X^*$ and a set of other variables that we will call $V\text{.}$ Thus, the calibration equation is given by $X_{RC} = E(X|X^*,V).$ Then $X_{RC}\text{,}$ or an estimate of $X_{RC}\text{,}$ is used in place of the unknown $X$ in the regression of $Y$ on $X$ and $Z\text{.}$

The formula for $E(X|X^*,V)$ is usually obtained by estimating the regression model of $X$ on $X^*$ and $V\text{.}$ The data for executing this step are generally obtained from a validation study. If the validation study provides data on $X$ in each individual as well as $X^*$ and $V\text{,}$ then the regression model may be developed in the usual manner by regressing $X$ on $X^*$ and $V\text{.}$ Interaction terms and non-linear functions of $X^*$ and $V$ may be introduced, but special care is needed (Midthune et al, 2016).
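The two steps of regression calibration can be sketched for the simplest setting, with a single exposure, classical error, no covariates $V\text{,}$ and an internal validation study in which the true $X$ is observed for a subset of participants. Everything in the example is assumed for illustration: the sample sizes, the coefficients, the variances and the helper `ols`. It is a sketch of the idea, not a full implementation.

```python
import random
from statistics import fmean


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(4)
N_MAIN, N_VALID = 50_000, 10_000  # hypothetical study sizes
BETA_0, BETA_X = 1.0, 2.0         # hypothetical outcome model

x = [rng.gauss(0.0, 1.0) for _ in range(N_MAIN)]
x_star = [xi + rng.gauss(0.0, 1.0) for xi in x]  # classical error
y = [BETA_0 + BETA_X * xi + rng.gauss(0.0, 1.0) for xi in x]

# Step 1: calibration equation, estimated where the true X is observed
# (here, the first N_VALID participants form the internal validation study).
a_hat, b_hat = ols(x_star[:N_VALID], x[:N_VALID])

# Step 2: replace X* by its calibrated value X_RC and fit the outcome model.
x_rc = [a_hat + b_hat * s for s in x_star]
_, beta_naive = ols(x_star, y)
_, beta_rc = ols(x_rc, y)

print(f"calibration slope:       {b_hat:.3f}")
print(f"naive estimate of beta:  {beta_naive:.3f}")
print(f"regression calibration:  {beta_rc:.3f}")
```

One caveat on usage: the ordinary standard error from the second regression does not reflect the uncertainty in the estimated calibration equation, so a method that propagates it (for example, bootstrapping both steps together) would be needed in practice.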

Furthermore, it may be appropriate to construct separate calibration equations for subgroups defined by elements of $V\text{.}$ Accordingly, one may consider very flexible parameterizations of $E(X|X^*,V) \text{.}$ Use of model diagnostics to check on goodness of fit is recommended. Those elements of $V$ that appear to be unrelated to $X$ may be dropped from the regression equation, as such a finding constitutes evidence that the said element of $V$ is conditionally independent of $X$ conditional on $X^*$ and the other elements of $V\text{.}$

If $X$ is not available, but instead there is available a measurement $X^{\#}$ that is unbiased at the individual level and has errors that are independent of the errors in $X^*$, then the regression model may be developed by regressing $X^{\#}$ on $X^*$ and $V\text{.}$

If the measurement error model is known to be classical, and repeat measurements of $X^*$, e.g. $(X_1^*, X_2^*)\text{,}$ as well as $V$, are available on each individual, then the regression model may be developed by regressing $X_2^*$ on $X_1^*$ and $V\text{.}$
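The reason this works can be checked numerically: when both replicates have classical errors that are independent of each other, the slope from regressing $X_2^*$ on $X_1^*$ estimates the same quantity as the slope from regressing $X$ on $X_1^*\text{.}$ The sketch below omits $V$ and uses hypothetical variances, with `ols_slope` a helper written for this example.

```python
import random
from statistics import fmean


def ols_slope(x, y):
    """Least-squares slope of y on x: cov(x, y) / var(x)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(5)
N = 50_000
SD_X, SD_E = 1.0, 1.0  # hypothetical standard deviations

x = [rng.gauss(0.0, SD_X) for _ in range(N)]
x1_star = [xi + rng.gauss(0.0, SD_E) for xi in x]  # first replicate
x2_star = [xi + rng.gauss(0.0, SD_E) for xi in x]  # second replicate

slope_replicates = ols_slope(x1_star, x2_star)  # needs no knowledge of X
slope_with_truth = ols_slope(x1_star, x)        # would need the true X
target = SD_X ** 2 / (SD_X ** 2 + SD_E ** 2)

print(f"X2* on X1*: {slope_replicates:.3f}")
print(f"X on X1*:   {slope_with_truth:.3f}")
print(f"var(X) / (var(X) + var(e)) = {target:.3f}")
```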

According to the theory, the regression calibration method provides consistent or nearly consistent estimates of $\beta_X \text{.}$ This can be shown in the case of a simple linear outcome model (with no other covariates), and where $X^*$ has non-differential linear measurement error

$$E(Y|X^*) = E\{E(Y|X,X^*)| X^*\} = E\{E(Y|X)| X^*\} = E( \beta_{0} + \beta_{X} X | X^*) = \beta_0 + \beta_X E( X | X^*).$$

For other cases and more complex outcome models, see Carroll et al, 2006: Sections 4.1, 4.7, 4.8, B.3.3.

The following is a set of rules for which variables should be chosen and included in the set, $V$, to ensure that the regression calibration method provides consistent or nearly consistent estimates of $\beta_{X}\text{.}$

1. All variables in $Z$ should be included in $V$ (Carroll et al, 2006: Chapter 4), except for elements of $Z$ that are known explicitly to be independent of $X$ conditional on $X^*\text{.}$ In practice, it is rare to have such knowledge, so all elements of $Z$ are included.

2. Any other variables, $S$, that are known to be independent of $Y$ conditional on $X$ and $Z$ may be included in $V$ (Kipnis et al, 2009). Such variables are sometimes called instrumental variables, but their use here is very different from the way that instrumental variables are usually employed.

3. All other variables correlated with $Y$ conditional on $X$ and $Z$ should not be included in $V\text{.}$

When instrumental variables, $S$, are included in the calibration equation, this version of regression calibration has been called enhanced regression calibration. For examples of its use, see Freedman et al (2011b), where a dietary biomarker is added to the calibration equation, and Carroll et al (2012), where an additional self-report instrument is added. Having knowledge and availability of instrumental variables is currently uncommon (although we encourage designing studies that include a measurement that can serve as $S$), so usually $V = Z\text{,}$ the confounders in the outcome model.

## Methods for studies evaluating the population distribution of an exposure when the exposure is measured with error

The methods for reducing bias in estimating the percentiles of the distribution include the National Cancer Institute (NCI) method, the Multiple Source Method (MSM), the Iowa State University (ISU) method, and the Statistical Program to Assess Dietary Exposure (SPADE) method. All of these were developed specifically for dietary data, and are described in this primer under Section 2, Measurement Error in Nutrition Studies.

## Methods for studies where the outcome is measured with error

Relatively little has been written on methods to reduce the resulting bias; see approaches described in Buonaccorsi (1991), Carroll et al (2006: Section 15.4), and Keogh et al (2016).