National Cancer Institute, U.S. National Institutes of Health

Division of Cancer Prevention Staff

Stuart G. Baker, ScD

Mathematical Statistician
Biometry Research Group
Division of Cancer Prevention
National Cancer Institute
9609 Medical Center Drive, Room 5E638
Rockville, MD 20850
Phone (240) 276-7147
Fax (240) 276-7845
  • Dr. Stuart G. Baker develops statistical methods with applications in biology and medicine and also writes commentaries on theories of carcinogenesis. Examples include:

    1. the concept of "paradigm instability," based on a buried treasure analogy (Baker et al. 2010, Baker 2013), for discussing the somatic mutation theory versus the tissue organization field theory of carcinogenesis;
    2. the paired availability design for historical controls, which adjusts for different availabilities of treatment in different centers using a principal stratification model independently developed by Permutt and Hebel (1989), Baker and Lindeman (1994), and Angrist et al. (1996);
    3. the multinomial-Poisson transformation for easily computing variances of saturated multinomial distributions (Baker, 1994);
    4. composite linear models for analyzing categorical data subject to ignorable or non-ignorable missing data mechanisms (1994);
    5. the BK-Plot for illustrating mixtures of probabilities, independently developed for Simpson's paradox by Tan (1986), Jeon et al. (1987), and Baker and Kramer (2001), and also applied to the Prentice Criterion for surrogate endpoints (Baker, 2013);
    6. relative utility curves for evaluating risk prediction via decision analysis (Baker et al. 2009, Baker et al. 2012);
    7. Swirls and Ripples, a centroid-based analysis that can include "islands" of one class within another class, called Swirls, or the usual curved boundaries, called Ripples (Baker, 2010);
    8. a leave-one-out approach for evaluating surrogate endpoints (Baker et al. 2012);
    9. a biomarker pipeline to develop and evaluate cancer screening tests (Baker 2009);
    10. a latent-class model for the genetic analysis of twin data (Baker et al. 2005), which is a major improvement over the usual variance components model involving heritability;
    11. a method to compare biologically relevant response curves in gene expression experiments in terms of heteromorphy, heterochrony, and heterometry (Baker 2014).

    Dr. Baker was the first recipient of the distinguished alum award from the Department of Biostatistics at the Harvard School of Public Health. He is also a fellow of the American Statistical Association and an elected member of the International Statistical Institute.

  • Mathematica
