Ewout Steyerberg’s review in Biometrics, vol. 72, no. 3, September 2016, pp. 1006-1007. doi:10.1111/biom.12569
James E. Helmreich in Journal of Statistical Software, vol. 70, 2016.
First Edition
REGRESSION MODELING STRATEGIES with Applications to Linear Models, Logistic Regression, and Survival Analysis, by FE Harrell. The book was published June 5, 2001 by Springer New York, ISBN 0-387-95232-2. Available for the first edition: the text from the book’s back cover; the preface and table of contents for the book manuscript (.pdf); a partial index to the book (.pdf); and a sample chapter from the book. (Note: This material is Copyright 2001-2004 Springer-Verlag and may not be reproduced.)
International Journal of Epidemiology 31(3):699-700, June 2002. Note: This otherwise excellent review states that the book recommends selecting variables to include in the model on the basis of their frequency of selection by a bootstrap procedure. This is definitely not the case.
Journal of the American Statistical Association 98:257-258, March 2003
Medical Decision Making, 23(2):182-183, April 2003
Errata for the first and second printings. The book had its third printing in December 2002 and its fourth printing in December 2003. The sixth printing was in December 2005.
New versions of the R code that make examples in the book that rely on the Design package work with the rms package
One-semester course using part of the text, for students who have not had a course in linear regression.
Offered for the first time in the Vanderbilt University Department of Biostatistics graduate program in Spring 2013 (Jan-Apr). It is taught yearly by Prof. Harrell.
Materials
See full semester course for up-to-date material
Survey of new approaches to regression and tree-based modeling (referred to in Chapter 4 of the second edition)
Syllabus for a 1-day short course “Modern Approaches to Predictive Modeling and Covariable Adjustment in Randomized Clinical Trials”
To be added from rmsdisc: An older discussion board for readers and the author to discuss questions, issues, controversies, and new research related to the text
Quizzes (with answer sheets) on concepts in the text and on prerequisites are available to instructors by E-mailing the author
Software
To be added from BioMod#FittingDemos: Interactive scripts demonstrating various curve fitting criteria and showing the flexibility of restricted cubic splines
Warren Sarle’s SAS macros and examples for bootstrapping and jackknifing. See Warren’s cautionary note on bootstrap confidence intervals, with a good example related to R^2 in multiple regression. The example shows that when the estimate of R^2 is badly biased, bootstrap confidence limits are badly displaced to the right. The notes also include the standard error of R^2 and information about adjusted R^2.
function for binary logistic model external validation
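As a small illustration of the upward bias in R^2 that Warren Sarle’s note is concerned with, the sketch below computes the usual adjusted R^2, 1 - (1 - R^2)(n - 1)/(n - p - 1), in Python. The function name and the sample sizes are assumptions chosen for illustration; nothing here is taken from the SAS macros. When no predictor is related to the response and the errors are normal, the expected apparent R^2 is p/(n - 1), which is far from zero when n is small relative to p.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a linear model with an intercept and p predictors,
    fit to n observations."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical setting: 30 observations and 10 pure-noise predictors.
n, p = 30, 10
null_mean_r2 = p / (n - 1)  # expected apparent R^2 when the true R^2 is 0

print("expected apparent R^2 under the null:", round(null_mean_r2, 3))
print("adjusted R^2 at that value:", round(abs(adjusted_r2(null_mean_r2, n, p)), 3))
print("adjusted R^2 for an apparent R^2 of 0.5:", round(adjusted_r2(0.5, n, p), 3))
```

The adjustment maps the null expectation p/(n - 1) back to zero, which is why it is a natural companion to any interval built around the unadjusted estimate.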
Studies of Methods Used in the Text
Recent simulation experiments conducted by Carl Moons and Frank Harrell indicate that the performance of transcan for multiple imputation is about halfway between single conditional mean imputation and MICE (see below), consistent with the findings from Faris PD, Ghali WA, et al (2002): Multiple imputation versus data enhancement for dealing with missing data in observational health care outcome analyses. J Clin Epidemiology 55:184-191. Suboptimal performance of transcan for multiple imputation is probably due to the fact that transcan fits the flexible additive imputation models and then draws all multiple imputations from the fitted models. A new function in the Hmisc package, aregImpute, uses the bootstrap to re-fit additive nonparametric imputation models for each of the multiple imputations. Results for aregImpute are very promising (see below).
Steyerberg EW, Harrell FE, Borsboom GJJM, Eijkemans MJC, Vergouwe Y, Habbema JDF (2001): Internal validation of predictive models: Efficiency of some procedures for logistic regression analysis. Journal of Clinical Epidemiology 54:774-781.
Steyerberg EW et al (2003): Internal and external validation of predictive models: A simulation study of bias and precision in small samples. Journal of Clinical Epidemiology 56:441-447.
Vergouwe Y, Steyerberg EW, Eijkemans MJC, Habbema JDF (2005): Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. Journal of Clinical Epidemiology 58:475-483.
Shrinkage and problems with stepwise variable selection: See Steyerberg EW, Eijkemans MJC, Harrell FE, Habbema JDF (2001): Prognostic modeling with logistic regression analysis: In search of a sensible strategy in small data sets. Medical Decision Making 21:45-56.
Model simplification and stepwise variable selection: See Ambler G, Brady AR, Royston P (2002): Simplifying a prognostic model: a simulation study based on clinical data. Statistics in Medicine 21:3803-3822. The authors studied the performance of the model simplification strategy discussed in the book, and compared it with more traditional variable selection methods, finding that standard variable selection can work well when there is a large proportion of irrelevant variables.
New case study on penalized maximum likelihood estimation for binary logistic modeling: Moons KGM, Donders ART, Steyerberg EW, Harrell FE (2004): Penalized maximum likelihood estimation to directly adjust diagnostic and prognostic prediction models for overoptimism: a clinical example. J Clinical Epidemiology 57:1262-1270.
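The distinction drawn at the top of this list, between drawing every imputation from one fitted model and re-fitting the imputation model on a bootstrap resample for each imputation, can be sketched in a few lines of Python. This is only a toy under stated assumptions: a straight-line imputation model with residual draws stands in for the flexible additive models, the helper names fit_line and impute are hypothetical, and the data are simulated. It is not the transcan or aregImpute code.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares intercept, slope and residuals for y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, resid


def impute(x, z, m, refit_on_bootstrap, rng):
    """Return m completed copies of z (None marks a missing value).

    refit_on_bootstrap=False: fit once, draw every imputation from that fit.
    refit_on_bootstrap=True: refit on a bootstrap resample of the complete
    cases before each imputation, so the fit itself varies across imputations.
    """
    complete = [(xi, zi) for xi, zi in zip(x, z) if zi is not None]
    a, b, resid = fit_line(*zip(*complete))
    completed = []
    for _ in range(m):
        if refit_on_bootstrap:
            boot = [rng.choice(complete) for _ in complete]
            a, b, resid = fit_line(*zip(*boot))
        completed.append([
            zi if zi is not None else a + b * xi + rng.choice(resid)
            for xi, zi in zip(x, z)
        ])
    return completed


# Simulated data (an assumption for illustration): z depends on x, 25% of z missing.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(40)]
z = [xi + rng.gauss(0, 1) for xi in x]
z_obs = [None if i % 4 == 0 else zi for i, zi in enumerate(z)]

single_fit = impute(x, z_obs, 5, False, rng)
boot_refit = impute(x, z_obs, 5, True, rng)

# Imputed values for the first missing case across the five imputations.
print([round(d[0], 2) for d in single_fit])
print([round(d[0], 2) for d in boot_refit])
```

With a single fit, the imputations differ only through the residual draws; with bootstrap re-fitting, the intercept and slope also change from one imputation to the next, so uncertainty in the imputation model itself is carried into the imputed values.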
The Impute E-mail discussion group is led by Juned Siddique of Northwestern University.
A paper containing a good overview of multiple imputation and a comparison of some software packages is Horton NJ, Lipsitz SR, The American Statistician 55:244-254; 2001.
An excellent recent survey of missing data methods is Schafer, JL and Graham JW, Psychological Methods 7:147-177; 2002.
Notes from Tim Hesterberg on why the response variable must be used when doing multiple imputation. Tim’s notes include code to do several simulations illustrating his points.
Comparisons of aregImpute with other imputation algorithms
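One way to see why the response variable belongs in the imputation model, as Tim Hesterberg’s notes argue, is a small simulation. The Python sketch below is not a reproduction of Tim’s code. The data-generating model, sample size, and missingness rate are assumptions, and for brevity it uses a single stochastic imputation rather than multiple imputations. A predictor x is made missing completely at random, imputed once without using y and once using y, and the slope of y on x is then estimated from each completed dataset.

```python
import random
import statistics


def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(2024)
n = 4000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]            # true slope of y on x is 1
missing = [rng.random() < 0.5 for _ in range(n)]  # x missing completely at random

x_cc = [xi for xi, mi in zip(x, missing) if not mi]
y_cc = [yi for yi, mi in zip(y, missing) if not mi]

# Imputation that ignores y: draw from the marginal distribution of observed x.
mean_x, sd_x = statistics.fmean(x_cc), statistics.stdev(x_cc)
x_no_y = [rng.gauss(mean_x, sd_x) if mi else xi for xi, mi in zip(x, missing)]

# Imputation that uses y: regress x on y in the complete cases, add residual noise.
b = slope(y_cc, x_cc)
a = statistics.fmean(x_cc) - b * statistics.fmean(y_cc)
resid_sd = statistics.stdev([xi - (a + b * yi) for xi, yi in zip(x_cc, y_cc)])
x_with_y = [a + b * yi + rng.gauss(0, resid_sd) if mi else xi
            for xi, yi, mi in zip(x, y, missing)]

slope_no_y = slope(x_no_y, y)
slope_with_y = slope(x_with_y, y)
print(f"slope of y on x, imputation ignoring y: {slope_no_y:.2f}")
print(f"slope of y on x, imputation using y:    {slope_with_y:.2f}")
```

In this setup the imputed values drawn without y are unrelated to y by construction, so about half the rows contribute no association and the estimated slope is pulled toward roughly half its true value. Using y in the imputation model preserves the x-y relationship and the slope stays close to 1.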