## NCI Method Software Applications and Zip Filenames July 2018

The NCI software comprises a suite of SAS macros that may be used for adjusting for the effects of dietary measurement error in different applications.

The conditional expectation of the exposure can be computed and then entered, together with the outcome and other explanatory variables, into a standard statistical regression package to yield estimated regression coefficients that are approximately unbiased. However, the standard errors, confidence intervals and statistical tests produced by the package will not be correct, because they do not take into account the uncertainty introduced at the calibration stage. It is therefore advisable to perform the analysis with programs written specifically for regression calibration.
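The point about standard errors can be illustrated with a minimal Python sketch. It is not the NCI method (which fits a more elaborate model in SAS); it assumes a simple linear calibration model, a continuous outcome, simulated data, and hypothetical function names such as `calibrate` and `bootstrap_se`. The bootstrap repeats the calibration stage in every resample, so its standard error reflects calibration uncertainty, whereas the second-stage standard error treats the predicted intakes as fixed.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

def calibrate(q, r, has_ref):
    """Calibration stage: regress the reference measure r on the main
    instrument q in the subsample, then predict intake for everyone."""
    a, b = fit_line(q[has_ref], r[has_ref])
    return a + b * q

def rc_slope(q, r, y, has_ref):
    """Outcome stage: regress y on the predicted intake."""
    _, beta = fit_line(calibrate(q, r, has_ref), y)
    return beta

def naive_se(x_hat, y):
    """Second-stage standard error that treats x_hat as known (too small)."""
    a, b = fit_line(x_hat, y)
    resid = y - (a + b * x_hat)
    s2 = resid @ resid / (len(y) - 2)
    return np.sqrt(s2 / np.sum((x_hat - x_hat.mean()) ** 2))

def bootstrap_se(q, r, y, has_ref, n_boot=200, seed=0):
    """Resample participants and redo BOTH stages in each resample."""
    rng = np.random.default_rng(seed)
    n = len(y)
    betas = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)
        betas[i] = rc_slope(q[idx], r[idx], y[idx], has_ref[idx])
    return betas.std(ddof=1)

def simulate(n=5000, n_ref=2000, beta=1.0, seed=1):
    """Simulated (assumed) data: t is true intake, q a noisy and biased main
    instrument, r an unbiased reference measure available in a subsample."""
    rng = np.random.default_rng(seed)
    t = rng.normal(size=n)
    q = 0.5 * t + rng.normal(size=n)
    has_ref = np.zeros(n, dtype=bool)
    has_ref[:n_ref] = True
    r = np.where(has_ref, t + rng.normal(scale=np.sqrt(0.5), size=n), np.nan)
    y = beta * t + rng.normal(scale=0.5, size=n)
    return q, r, y, has_ref

q, r, y, has_ref = simulate()
beta_hat = rc_slope(q, r, y, has_ref)
se_naive = naive_se(calibrate(q, r, has_ref), y)
se_boot = bootstrap_se(q, r, y, has_ref)
print(f"beta_hat={beta_hat:.3f}  naive SE={se_naive:.3f}  bootstrap SE={se_boot:.3f}")
```

In this simulation the bootstrap standard error comes out larger than the second-stage one, which is the understatement the paragraph above warns about. Dedicated regression calibration software handles this adjustment for the user.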

*For estimating the association between usual intake and a health outcome*

*Food frequency questionnaire (FFQ) is the main instrument*

For this set of applications, it is assumed that all participants completed one food frequency questionnaire and a subsample of participants also completed at least two 24-hour recalls.

If all participants completed both a food frequency questionnaire and a 24-hour recall, and a subsample completed at least two 24-hour recalls, then it is recommended that the user apply the software applications intended for the case in which the 24-hour recall is the main instrument. In this case, the food frequency questionnaire data may be used in the analysis as a covariate in the modeling of the predicted usual intake.

Software is available for the following dietary variables:

- Single regularly-consumed or episodically-consumed food or nutrient
- Single nutrient density or ratio of two components (the denominator must be regularly-consumed)
- Single food or nutrient that has never-consumers
- Two regularly-consumed or one regularly-consumed and one episodically-consumed foods or nutrients
- Several regularly-consumed or episodically-consumed foods or nutrients
- Several foods or nutrients, one of which has never-consumers

*For estimating the association between usual intake and a health outcome*

*24-hour recall is the main instrument*

For this set of applications, it is assumed that all participants completed one 24-hour recall and a subsample of participants completed at least two 24-hour recalls.

Software is available for the following dietary variables:

- Single regularly-consumed or episodically-consumed food or nutrient
- Single nutrient density or ratio of two components (the denominator must be regularly-consumed)
- Single food or nutrient that has never-consumers
- Two regularly-consumed or one regularly-consumed and one episodically-consumed foods or nutrients
- Several regularly-consumed or episodically-consumed foods or nutrients
- Several foods or nutrients, one of which has never-consumers

*For estimating usual intake distribution*

*24-hour recall is the main instrument*

For this set of applications, it is assumed that all participants completed one 24-hour recall and a subsample of participants completed at least two 24-hour recalls. Software allowing inclusion of individuals who do not complete at least one 24-hour recall is not included.

Software is available for the following dietary variables:

- Single regularly-consumed or episodically-consumed food or nutrient
- Single nutrient density or ratio of two components (the denominator must be regularly-consumed)
- Two regularly-consumed or one regularly-consumed and one episodically-consumed foods or nutrients (bivariate distribution)
- Several regularly-consumed or episodically-consumed foods or nutrients (multivariate distribution)
- Single food or nutrient with never-consumers

There are several ways in which the output of the NCI software could be misunderstood or misused. Below are the most common potential misuses that have been recognized.

- The predicted usual intakes of a dietary component can be calculated for each individual in a study using the INDIVINT macro of the NCI software. These predicted values can be used in the specific regression models for which they were constructed (the method of regression calibration). However, the same predicted values cannot be used to group these individuals into "categories (e.g. quintiles) of usual intake".

- Following on from the previous bullet, one cannot use categories formed from INDIVINT predicted usual intakes in regression models so as to obtain relative risks between categories of usual intake. The NCI software does not provide a *direct* way of estimating relative risks between categories of usual intake. However, if one needs to estimate, for example, the relative risk between the 5th and 1st quintiles of usual intake of a dietary component, an indirect way to do this with the NCI software is to: (a) estimate the 10th and 90th percentiles of the usual intake distribution of the dietary component (the midpoints of the 1st and 5th quintiles) using DISTRIB; (b) estimate the risk function for the usual intake on a continuous scale using the predicted values from INDIVINT; and then (c) take the ratio of the estimated risk at the 90th percentile to the estimated risk at the 10th percentile.

- The predicted usual intakes of a dietary component that are calculated for each individual in the INDIVINT macro of the NCI software to use in a specific regression model cannot be used directly to construct a population distribution of usual intakes. The DISTRIB macro is needed for that job. Predicted usual intakes from INDIVINT will seriously underestimate the spread of the population distribution if so used.

- If one wants to study how the usual intake of one dietary component is related to the usual intake of another component, one needs to use the bivariate or multivariate versions of the NCI software. For example, if the investigator is interested in the relation between usual intakes of vitamin D and calcium, it is inappropriate to split the sample of individual people into high and low vitamin D subsets based on their first 24HR value, run the univariate NCI method programs for calcium intake on the two subsets separately, and then compare the estimated calcium distributions. Instead, the bivariate NCI method programs should be used.
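The indirect relative risk calculation described in the list above, steps (a) to (c), can be sketched in Python. This is only an illustration under stated assumptions: the risk model is taken to be log-linear in untransformed usual intake (as in a logistic or Cox model with a linear intake term), the function name is hypothetical, and the coefficient and percentile values are made-up numbers standing in for output from INDIVINT-based regression and from DISTRIB.

```python
import math

def relative_risk_between_percentiles(beta, intake_low, intake_high):
    """Relative risk comparing intake_high with intake_low, assuming
    log(risk) = alpha + beta * intake, so the intercept alpha cancels."""
    return math.exp(beta * (intake_high - intake_low))

# Hypothetical inputs (not real results):
beta_hat = -0.02        # (b) log relative risk per unit of usual intake
p10, p90 = 8.0, 30.0    # (a) 10th and 90th percentiles of usual intake

# (c) ratio of the estimated risk at the 90th to that at the 10th percentile
rr = relative_risk_between_percentiles(beta_hat, p10, p90)
print(f"RR (90th vs 10th percentile) = {rr:.2f}")
```

With these made-up inputs the ratio is about 0.64. If the risk model uses a transformed intake (for example a logarithm), the percentiles would need the same transformation before the difference is taken.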

Software within STATA is available to do regression calibration. The command is named **rcal** and is found within the STATA package **merror**.

Further details on the **rcal** command can be found at the webpage of Professor Raymond Carroll, Texas A&M University, who was instrumental in developing the **merror** package.

There is also a program in STATA called eivreg that performs "errors in variables regression."

The webpage of Professor Donna Spiegelman, Harvard University School of Public Health, is another source of software, written in SAS, for executing methods of measurement error correction. Of particular relevance are the following programs that perform regression calibration:

%blinplus implementing Rosner B, Spiegelman D, Willett W. Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. *American Journal of Epidemiology* 1990; 132: 734-745.

%relibpls8 implementing Rosner B, Spiegelman D, Willett W. Correction of logistic regression relative risk estimates and confidence intervals for random within person measurement error. *American Journal of Epidemiology* 1992; 136: 1400-1413.

%rrc implementing the method developed in Liao X, Zucker D, Li Y, Spiegelman D. Survival analysis with error-prone time-varying covariates: a risk set calibration approach. *Biometrics* 2011 Mar; 67(1): 50-58.