Software for Measurement Error in Nutrition Research

3.1. NCI software for different applications

NCI Method Software Applications and Zip Filenames July 2018

The NCI software comprises a suite of SAS macros that may be used for adjusting for the effects of dietary measurement error in different applications.

The conditional expectation of the exposure can be computed and then entered, together with the outcome and other explanatory variables, into a standard statistical regression package. The resulting regression coefficient estimates are approximately unbiased. However, the standard errors, confidence intervals and statistical tests produced by the package will not be correct, because they do not take into account the uncertainty introduced at the calibration stage. It is therefore advisable to use programs written specifically for regression calibration to perform the analysis.
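
The point about standard errors can be illustrated with a small simulation. The sketch below is not the NCI software and makes several simplifying assumptions that are not in the text above: a continuous outcome, linear models at both stages, normally distributed errors, and a bootstrap that resamples participants and repeats both stages. All function names and parameter values are hypothetical.

```python
import numpy as np


def ols(x, y):
    """Simple linear regression; returns (intercept, slope, slope standard error)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    sxx = np.sum(xc ** 2)
    slope = np.sum(xc * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - intercept - slope * x
    slope_se = np.sqrt(np.sum(resid ** 2) / (x.size - 2) / sxx)
    return intercept, slope, slope_se


def regression_calibration(q, y, r):
    """Two-stage fit; r is NaN for participants outside the calibration subsample."""
    in_sub = ~np.isnan(r)
    a, b, _ = ols(q[in_sub], r[in_sub])     # stage 1: reference instrument on FFQ
    predicted = a + b * q                    # predicted intake for every participant
    _, beta, naive_se = ols(predicted, y)    # stage 2: outcome on predicted intake
    return beta, naive_se                    # naive_se treats a and b as known


def bootstrap_se(q, y, r, n_boot=300, seed=0):
    """Resample participants and refit BOTH stages each time."""
    rng = np.random.default_rng(seed)
    n = q.size
    betas = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)
        betas[i], _ = regression_calibration(q[idx], y[idx], r[idx])
    return float(betas.std(ddof=1))


# Simulated study (assumed values): everyone has an FFQ value q and an outcome y;
# the first n_sub participants also have an unbiased reference measurement r.
rng = np.random.default_rng(1)
n, n_sub = 2000, 200
t = rng.normal(0.0, 1.0, n)                  # true usual intake (unobserved)
q = 0.5 * t + rng.normal(0.0, 1.0, n)        # FFQ-reported intake
y = 1.0 * t + rng.normal(0.0, 1.0, n)        # continuous outcome
r = np.full(n, np.nan)
r[:n_sub] = t[:n_sub] + rng.normal(0.0, 1.0, n_sub)

beta, naive_se = regression_calibration(q, y, r)
boot_se = bootstrap_se(q, y, r)
print(f"calibrated slope: {beta:.3f}")
print(f"naive SE from stage 2 only: {naive_se:.3f}")
print(f"bootstrap SE over both stages: {boot_se:.3f}")
```

In this setup the bootstrap standard error should be noticeably larger than the naive one, because the naive figure treats the calibration coefficients as if they were known without error.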

For estimating the association between usual intake and a health outcome

Food frequency questionnaire (FFQ) is the main instrument

For this set of applications, it is assumed that all participants completed one food frequency questionnaire and a subsample of participants also completed at least two 24-hour recalls.

If all participants completed both a food frequency questionnaire and a 24-hour recall, and a subsample completed at least two 24-hour recalls, then it is recommended that the user apply the software applications intended for the case where the 24-hour recall is the main instrument. In this case, the food frequency questionnaire data may be used in the analysis as a covariate in the modeling of the predicted usual intake.

Software is available for the following dietary variables:

For estimating the association between usual intake and a health outcome

24-hour recall is the main instrument

For this set of applications, it is assumed that all participants completed one 24-hour recall and a subsample of participants completed at least two 24-hour recalls.

Software is available for the following dietary variables:

For estimating usual intake distribution

24-hour recall is the main instrument

For this set of applications, it is assumed that all participants completed one 24-hour recall and a subsample of participants completed at least two 24-hour recalls. Software that allows inclusion of individuals who did not complete at least one 24-hour recall is not included.

Software is available for the following dietary variables:

3.2. Potential misuses of NCI software

There are several ways in which the output of the NCI software could be misunderstood or misused. Below are the most common potential misuses that have been recognized.

  • The predicted usual intakes of a dietary component can be calculated for each individual in a study using the INDIVINT macro of the NCI software. These predicted values can be used in the specific regression models for which they were constructed (the method of regression calibration). However, the same predicted values cannot be used to group these individuals into “categories (e.g. quintiles) of usual intake”.
     
  • Following on from the previous bullet, one cannot use categories formed from INDIVINT predicted usual intakes in regression models so as to obtain relative risks between categories of usual intake. The NCI software does not provide a direct way of estimating relative risks between categories of usual intake. However, if one needs to estimate, for example, the relative risk between the 5th and 1st quintiles of usual intake of a dietary component, an indirect way to do this with the NCI software is: (a) estimate the 10th and 90th percentiles of the usual intake distribution of the dietary component using DISTRIB (these percentiles are the midpoints of the 1st and 5th quintiles); (b) estimate the risk function for the usual intake on a continuous scale using the predicted values from INDIVINT; and then (c) take the ratio of the estimated risk at the 90th percentile to the estimated risk at the 10th percentile.
     
  • The predicted usual intakes of a dietary component that are calculated for each individual in the INDIVINT macro of the NCI software to use in a specific regression model cannot be used directly to construct a population distribution of usual intakes. The DISTRIB macro is needed for that job. Predicted usual intakes from INDIVINT will seriously underestimate the spread of the population distribution if so used.
     
  • If one wants to study how the usual intake of one dietary component is related to the usual intake of another component, one needs to use the bivariate or multivariate versions of the NCI software. For example, if the investigator is interested in the relation between usual intakes of vitamin D and calcium, it is inappropriate to split the sample of individual people into high and low vitamin D subsets based on their first 24HR value, run the univariate NCI method programs for calcium intake on the two subsets separately, and then compare the estimated calcium distributions. Instead the bivariate NCI method programs should be used.
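
Two of the points above, the underestimated spread of predicted usual intakes and the percentile-based relative risk, can be illustrated with a short simulation. This is a minimal sketch and not the INDIVINT or DISTRIB macros. It assumes a simple normal measurement error model with known variances and a log-linear risk function, and every name and numeric value in it is hypothetical. The simulated true intakes stand in for a correctly estimated usual intake distribution.

```python
import numpy as np


def predicted_usual_intake(r, mu, var_between, var_within):
    """E[T | R] when T ~ N(mu, var_between) and R = T + error, error ~ N(0, var_within)."""
    shrinkage = var_between / (var_between + var_within)
    return mu + shrinkage * (np.asarray(r, dtype=float) - mu)


def relative_risk(beta, low, high):
    """Risk ratio between two intake levels, assuming log(risk) = a + beta * intake."""
    return float(np.exp(beta * (high - low)))


# Assumed parameters for illustration only.
rng = np.random.default_rng(0)
mu, var_between, var_within = 10.0, 4.0, 12.0
n = 100_000

t = rng.normal(mu, np.sqrt(var_between), n)        # true usual intake
r = t + rng.normal(0.0, np.sqrt(var_within), n)    # a single 24-hour recall
pred = predicted_usual_intake(r, mu, var_between, var_within)

# Predicted values are shrunk toward the mean, so their spread is too narrow.
print(f"SD of true usual intake: {t.std():.2f}")
print(f"SD of predicted intake:  {pred.std():.2f}")

# Steps (a) to (c): percentiles of the usual intake distribution, then the
# ratio of the fitted risk at the 90th percentile to that at the 10th.
true_p10, true_p90 = np.percentile(t, [10, 90])
pred_p10, pred_p90 = np.percentile(pred, [10, 90])
beta = 0.1                                         # assumed log relative risk per unit
rr_correct = relative_risk(beta, true_p10, true_p90)
rr_from_pred = relative_risk(beta, pred_p10, pred_p90)
print(f"RR using usual intake percentiles:     {rr_correct:.2f}")
print(f"RR using predicted-value percentiles:  {rr_from_pred:.2f}")
```

Under these assumptions the variance of the predicted values is only the shrinkage fraction of the true variance, so percentiles taken from the predicted values sit too close together and the resulting relative risk is understated.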

3.3. Other measurement error software

Regression calibration can be carried out in Stata with the rcal command, which is part of the merror package.

Further details on the rcal command can be found on the webpage of Professor Raymond Carroll, Texas A&M University, who was instrumental in developing the merror package.

Stata also has a command called eivreg that performs “errors-in-variables regression.”

The webpage for the Center for Methods in Implementation and Prevention Science (CMIPS), Yale School of Public Health, is another source of SAS software for measurement error correction. Of particular relevance are the following programs, which perform regression calibration:

%blinplus implementing Rosner B, Spiegelman D, Willett W. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. American Journal of Epidemiology 1990;132:734-745.

%relibpls8 implementing Rosner B, Spiegelman D, Willett W. Correction of logistic regression relative risk estimates and confidence intervals for random within-person measurement error. American Journal of Epidemiology 1992;136:1400-1413.

%rrc implementing Liao X, Zucker D, Li Y, Spiegelman D. Survival analysis with error-prone time-varying covariates: a risk set calibration approach. Biometrics 2011;67(1):50-58.