National Cancer Institute, U.S. National Institutes of Health (www.cancer.gov)
Division of Cancer Prevention

Biometry Research Group

Statistical Software

Evaluating predictive markers in a randomized trial with binary outcomes

Stuart G. Baker, 2014

Introduction

A predictive marker is a baseline variable in a randomized trial that is used to identify subgroups in which the effect of treatment is greater than average. This software uses a modified adaptive signature design to evaluate a randomized trial with a binary outcome and multiple (possibly high-dimensional) baseline variables. The software splits the data into training and test samples. In the training sample, the software fits various benefit functions. A fundamental option is whether to use cross-validation in the training sample to select the best set of benefit functions (Method 1) or to fit models directly in the training sample (Method 2). In the test sample, the software computes benefit scores from the benefit function and estimates the treatment effect in subgroups with benefit scores greater than each cutpoint. The software then plots estimated treatment effect versus cutpoint, which is similar to a tail-oriented subpopulation treatment effect pattern plot.
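
The test-sample step described above can be illustrated with a minimal sketch. It is not taken from trialfit.m: the benefit scores and outcomes are randomly generated placeholders, the treatment effect is assumed to be a risk difference (treatment arm minus control arm), and the cutpoints are assumed to be quantiles of the pooled scores. The names score0, score1, cutpoints and subgroupEffect are hypothetical.

```mathematica
(* Hypothetical test-sample data; in practice the scores would come from a fitted benefit function *)
SeedRandom[1];
nTest = 200;
score0 = RandomReal[{-1, 1}, nTest];   (* benefit scores, randomization group 0 *)
score1 = RandomReal[{-1, 1}, nTest];   (* benefit scores, randomization group 1 *)
y0 = RandomInteger[{0, 1}, nTest];     (* binary outcomes, group 0 *)
y1 = RandomInteger[{0, 1}, nTest];     (* binary outcomes, group 1 *)

(* Assumption: 8 cutpoints at the 10th to 80th percentiles of the pooled scores *)
cutpoints = Quantile[Join[score0, score1], Range[1, 8]/10];

(* Risk difference among participants whose benefit score exceeds cutpoint c *)
subgroupEffect[c_] :=
  Mean[Pick[y1, Thread[score1 > c]]] - Mean[Pick[y0, Thread[score0 > c]]];

effects = subgroupEffect /@ cutpoints;

(* Estimated treatment effect versus cutpoint *)
ListLinePlot[Transpose[{cutpoints, effects}],
  AxesLabel -> {"cutpoint", "risk difference"}, PlotMarkers -> Automatic]
```

Because the placeholder outcomes are generated independently of the scores, the plotted curve only fluctuates around zero. With a real predictive marker, the estimated effect would tend to rise as the cutpoint increases.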

Requirement

Mathematica Version 8 or later.

Set-Up

Copy all files into a folder, referred to here as "FOLDER".
Start a new Mathematica session.
Type SetDirectory["FOLDER"]
Type << trialfit.m



To run simulation:

Type trialFit[datasim, NewFitQ->True]



To run the hypothetical microarray-based data with a risk difference benefit function:

Type trialFit[dataPC, Method->2, ModelSet->{{"RD","F"}}]



To try on your own data:

Type trialFit[dataset, options]



Options

Option                   Default   Explanation
Method                   1         1 = cross-validation of training sample; 2 = fit model directly to training sample
NewFitQ                  True      New fitting or use stored result of previous fit
Split                    0.5       Fraction split into test sample
ModelSet                 "All"     Models used, e.g. {{"RD","F"},{"Cadit","M"}}
MinTrainTestGroupSize    20        Minimum sample size of training-test sample
NumCut                   8         Number of cutpoints in test sample
ShowProgQ                False     Show progress of fitting algorithm
ShowCVQ                  False     Show details of cross-validation
ShowTabQ                 False     Show data and results tables
MaxBoot                  20        Number of bootstrap iterations
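
As an illustration of combining several options in one call, the following sketch overrides a few defaults. It assumes trialfit.m has already been loaded and that dataset is defined in the format described on this page. The option names and the model specification {{"RD","F"}} are taken from this page; the particular values chosen are arbitrary.

```mathematica
(* Assumes << trialfit.m has been run and `dataset` is in the documented format *)
trialFit[dataset,
  Method -> 2,                  (* fit models directly to the training sample *)
  ModelSet -> {{"RD", "F"}},    (* model specification from the example on this page *)
  Split -> 0.5,                 (* fraction split into test sample (default) *)
  MaxBoot -> 50,                (* more bootstrap iterations than the default of 20 *)
  ShowProgQ -> True]            (* show progress of fitting algorithm *)
```

Options that are not listed in the call keep the defaults shown in the table.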


dataset={x0,x1,y0,y1,xname,datasetname}

x0            n x g matrix of baseline variables for randomization group 0
x1            n x g matrix of baseline variables for randomization group 1
y0            a length n list of binary outcomes (0 or 1) for randomization group 0
y1            a length n list of binary outcomes (0 or 1) for randomization group 1
xname         a length g list of names of baseline variables
datasetname   name of dataset
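
To make the expected layout concrete, here is a minimal sketch that assembles a toy dataset in this format from random numbers. The sizes n and g, the variable names "x1", "x2", ... and the dataset name are hypothetical, and representing the names as strings is an assumption; the generated values carry no meaning.

```mathematica
(* Toy data in the documented format; all values are random placeholders *)
SeedRandom[2014];
n = 100;   (* participants per randomization group (hypothetical) *)
g = 5;     (* number of baseline variables (hypothetical) *)

x0 = RandomVariate[NormalDistribution[0, 1], {n, g}];   (* n x g matrix, group 0 *)
x1 = RandomVariate[NormalDistribution[0, 1], {n, g}];   (* n x g matrix, group 1 *)
y0 = RandomInteger[{0, 1}, n];                          (* length n binary outcomes, group 0 *)
y1 = RandomInteger[{0, 1}, n];                          (* length n binary outcomes, group 1 *)
xname = Table["x" <> ToString[j], {j, g}];              (* length g list of names (assumed strings) *)
datasetname = "toy example";

dataset = {x0, x1, y0, y1, xname, datasetname};

(* Shape checks against the description above *)
Dimensions[x0] == Dimensions[x1] == {n, g}
Length[y0] == Length[y1] == n
Length[xname] == g
```

Each of the three checks should evaluate to True before the list is passed to trialFit.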


Downloads

Download All (zip, 1.59MB)

File                    Size     Description
trialfit.m              13KB     Calls all files
trialfitinputcheck.m    9KB      Checks input
trialfitcv.m            3KB      Method 1: cross-validation of training sample
trialfitcvchoose.m      6KB      Method 1: choose benefit function from cross-validation
trialfitmodel.m         3KB      Method 2: fit models
trialfitsplit.m         5KB      Split data into training and test samples
trialfitlogit.m         6KB      Fit logistic regression
trialfitplot.m          7KB      Plot results
trialfitboot.m          3KB      Bootstrap test sample
trialfitbootci.m        4KB      Bootstrap confidence interval
trialfitrd.m            4KB      Risk difference
trialfitrdtrain.m       2KB      Risk difference: training sample
trialfitrdtest.m        2KB      Risk difference: test sample
trialfitrdb.m           5KB      Risk difference with boosting
trialfitrdc.m           5KB      Risk difference common variables
trialfitrdctrain.m      1KB      Risk difference common variables: training sample
trialfitresp.m          4KB      Responder only
trialfitresptrain.m     2KB      Responder only: training sample
trialfitresptest.m      2KB      Responder only: test sample
trialfitcadit.m         5KB      Cadit
trialfitcaditrain.m     3KB      Cadit: training sample
trialfitcadittest.m     2KB      Cadit: test sample
trialfitcaditb.m        5KB      Cadit with boosting
trialfitvote.m          4KB      Vote
trialfitvotetrain.m     4KB      Vote: training sample
trialfitvotetest.m      3KB      Vote: test sample
trialfitmax.m           4KB      Max
trialfitmaxtrain.m      10KB     Max: training sample
trialfitmaxtest.m       3KB      Max: test sample
trialfitboost.m         3KB      Boosting computations
trialfitsim.m           5KB      Generate simulated data
trialfitdatamicro.m     3KB      Create hypothetical example from prostate cancer (PC) microarray data
trialfitrawmicro.m      4.76MB   Raw prostate cancer (PC) microarray data for hypothetical example
trialfitfig.m           7KB      Schematic figure


Disclaimer

This code is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the NCI or the individual developers be liable for any claim, damages or other liability of any kind. Use of this code by recipient is at recipient's own risk. NCI makes no representations that the use of the code will not infringe any patent or proprietary rights of third parties.

Last updated: July 30, 2014
