Simple and Flexible Classification of Gene Expression Microarrays Via Swirls and Ripples

By Stuart G. Baker

The program requires Mathematica 7.01.0

The key function is Classify[datalist, options], where

datalist={data, genename, dataname}

data = {matrix for class 0, matrix for class 1}, where each matrix is gene expression by specimen,
genename = a list of names of genes,
dataname = {name of data set, name of class 0, name of class 1}
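
As a minimal sketch of the structure described above, the following Mathematica code assembles a datalist from made-up values. The matrices, gene names, and labels are all hypothetical, and the orientation (rows are genes, columns are specimens) is an assumption based on the phrase "gene expression by specimen". The call to Classify is left commented out because it needs the package files to be loaded first and its options are not documented here.

```mathematica
(* Hypothetical toy data: 3 genes (rows) by 4 specimens (columns) per class. *)
(* The row/column orientation is an assumption, not confirmed by the package. *)
class0 = {{1.2, 0.8, 1.1, 0.9},
          {5.0, 4.7, 5.3, 4.9},
          {2.1, 2.4, 1.9, 2.2}};
class1 = {{3.1, 2.9, 3.4, 3.0},
          {5.1, 4.8, 5.2, 5.0},
          {0.4, 0.6, 0.5, 0.3}};

(* Assemble the three components in the documented order. *)
data = {class0, class1};
genename = {"geneA", "geneB", "geneC"};               (* hypothetical names *)
dataname = {"toy data set", "class 0", "class 1"};    (* hypothetical labels *)

datalist = {data, genename, dataname};

(* Sanity check: both matrices have one row per gene name. *)
consistentQ = Length[class0] == Length[class1] == Length[genename];

(* Classify[datalist]    requires the package files to be loaded first *)
```

Mathematica 10 and later ship a built-in function that is also named Classify, so the package's definition may clash with it in those versions.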

To reproduce the results in the article, type ClassifyMS1[ ], ClassifyMS2[ ], ClassifyFig12[ ], ClassifySim1[ ], ClassifySim2[ ].

File Contents

Download All (ZIP, 9 MB)

File name         Description
swirl.m           calls other programs
swirlcore.m       core program
swirlg.m          greedy algorithm
swirlw.m          wrapper algorithm
swirlsup.m        support functions
swirlroc.m        computation of ROC curve in test sample
swirlreport.m     reporting functions
swirlsim.m        generate simulation data
swirlplot.m       plotting boundary curves for two genes using data
swirlplotsym.m    plot hypothetical boundary curves
swirldata.m       create data sets in article
                  data from Alon et al (1999)
swirldataname1.m  gene names from Alon et al (1999)
swirldata20.m     class 0 data from Golub et al (1999)
swirldata21.m     class 1 data from Golub et al (1999)
swirldataname2.m  gene names from Golub et al (1999)
swirldata3.m      data from Pomeroy et al (2002)
swirldataname3.m  gene names from Pomeroy et al (2002)
swirldata4.m      data from Singh et al (2002)
swirldataname4.m  gene names from Singh et al (2002)
swirldata50.m     class 0 data from Yeoh et al (2002)
swirldata51.m     class 1 data from Yeoh et al (2002)
swirldataname5.m  gene names from Yeoh et al (2002)

This code is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the NCI or the individual developers be liable for any claim, damages or other liability of any kind. Use of this code by recipient is at recipient's own risk. NCI makes no representations that the use of the code will not infringe any patent or proprietary rights of third parties.

The sources of data are:

  1. Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci 1999, 96:6745-6750.
  2. Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999, 286:531-537.
  3. Pomeroy SL, Tamayo P, Gaasenbeek M, Sturla LM, Angelo M, McLaughlin ME, Kim JYH, Goumnerova LC, Black PM, Lau C, Allen JC, Zagzag D, Olson JM, Curran T, Wetmore C, Biegel JA, Poggio T, Mukherjee S, Rifkin R, Califano A, Stolovitzky G, Louis DN, Mesirov JP, Lander ES, Golub TR: Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature 2002, 415:436-442.
  4. Yeoh E, Ross ME, Shurtleff SA, Williams WK, Patel D, Mahfouz R, Behm FG, Raimondi SC, Relling MV, Patel A, Cheng C, Campana D, Wilkins D, Zhou X, Li J, Liu H, Pui C, Evans WE, Naeve C, Wong L, Downing JR: Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell 2002, 1:133-143.
  5. Singh D, Febbo PG, Ross K, Jackson DG, Manola J, Ladd C, Tamayo P, Renshaw AA, D'Amico AV, Richie JP, Lander ES, Loda M, Kantoff PW, Golub TR, Sellers WR: Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 2002, 1:203-209.