Biostatistics for Biomedical Research


Department of Biostatistics
School of Medicine
Vanderbilt University


May 23, 2024

flowchart LR
Q[Research<br>Question] --> M[Measurements] --> D[Design] --> Ac[Data<br>Acquisition] --> Des[Description] --> A[Analysis] --> I[Interpretation] & Pred[Prediction]
Pred --> V[Validation]
I --> K[New Knowledge] & Dec[Decisions]


The book is aimed at exposing biomedical researchers to modern biostatistical methods and statistical graphics, highlighting those methods that make fewer assumptions, including nonparametric statistics and robust statistical measures. In addition to covering traditional estimation and inferential techniques, the book contrasts those with the Bayesian approach, and also includes several components that have become increasingly important in the past few years, such as challenges of high-dimensional data analysis, modeling for observational treatment comparisons, analysis of differential treatment effect (heterogeneity of treatment effect), statistical methods for biomarker research, medical diagnostic research, and methods for reproducible research. A glossary of statistical terms for non-statisticians is here.

BBR course

R Workflow is a useful companion to this book, especially for those needing to manipulate data in preparation for analysis and for those interested in embedding statistical analyses in state-of-the-art reproducible reports.

For information about adding annotations, comments, and questions inside the text click here: Comments

Symbols Used in the Right Margin of the Text

  • Blue symbols in the right margin starting with ABD designate section numbers (and occasionally page numbers preceded by \(p\)) in The Analysis of Biological Data, Second Edition by MC Whitlock and D Schluter, Greenwood Village CO, Roberts and Company, 2015.
  • Blue symbols in the right margin starting with RMS designate section numbers in Regression Modeling Strategies, 2nd ed. by FE Harrell, Springer, 2015.
  • A symbol in the right margin is a hyperlink to a YouTube video related to the subject.
  • Another symbol in the right margin is a hyperlink to the discussion topic in datamethods devoted to the specific YouTube video session. A direct link to the discussion about session n is also available. Some of the sessions on YouTube also had live chat, which you can select to replay while watching the video.
  • Boxed blue text in the right margin represents a mnemonic key for linking to discussions about that section in datamethods. Anyone starting a new discussion about a topic related to the section should include the mnemonic somewhere in the posting. When you click on the blue boxed text the datamethods search result of all topics containing that mnemonic will appear, and the user can navigate from it to the topic of interest to read or add content.
  • An audio player symbol indicates that narration elaborating on the notes is available for the section. Red letters and numbers in the right margin are cues referred to within the audio recordings.
  • blog in the right margin is a link to a blog entry that further discusses the topic.

Other Information


This material grew largely out of teaching clinical scholars and teaching in Master of Science in Clinical Investigation programs at Duke University, University of Virginia, and Vanderbilt University. I benefitted immensely from lecture notes from colleagues such as Kerry Lee of Duke University. Thanks also go to Vanderbilt Biostatistics colleague James C. Slaughter, who made several contributions to an earlier version of the book at

Date | Section | Changes | Thanks To
2024-04-16 | 14.4.3 KCCQ Ceiling Effect | New subsection on KCCQ ceiling effect problem |
2024-04-16 | Nearly Optimal Statistical Model | New subsection on optimal model to replace change score |
2023-11-10 | 7.9 Regression Analysis of Paired Data | Fixed mixed effects ordinal model for paired rank test by using quadrature |
2023-09-22 | 20.3.4 One-at-a-Time Bootstrap Feature Selection | New section on bootstrapping importance ranks using one-at-a-time feature modeling |
2023-09-16 | 20.3.3 Sample Size to Estimate a Correlation Matrix | New section on estimation of correlation matrices |
2023-07-28 | 7.9 Regression Analysis of Paired Data | New section on using models for paired data |
2023-07-26 | 7.8 Two-Way ANOVA Ordinal Regression Example | Added example of ordinal model for 2-way ANOVA |
2023-06-22 | 13 Analysis of Covariance in Randomized Studies | Added big picture |
2023-06-16 | 13.5 How Many Covariables to Use? | Added more to section on how many covariates to add |
2023-04-27 | 7.12 Sample Size Requirement for Characterizing Entire Distributions | New section on sample size for ECDF |
2023-04-05 | 14.4.2 Example of a Misleading Change Score | Added confidence bands |
2023-03-30 | 20.3.1 Simulation To Understand Needed Sample Sizes | Fixed bug in simulation graphics |
2023-03-29 | 3.2.1 Statistical Scientific Method | New link to clinical trial design resource |
2023-03-13 | 21 Reproducible Research | New subsection on the decline effect |
2023-02-19 | 3.9 Probability | Added link to resources for learning probability |
2022-12-29 | 4.3.5 Graphs for Describing Statistical Model Fits | Added single-axis nomogram example |
2022-12-28 | | Started to add old study questions to end of selected chapters |
2022-12-03 | 14.4.2 Example of a Misleading Change Score | New section with real example of misleading change score |
2022-11-27 | 14.4.5 Current Status vs. Change | New section on importance of current status vs. baseline status and irrelevance of change for patients |
2022-08-02 | 19 Diagnosis | Quote about weaknesses in sensitivity and specificity; link to CrossValidated discussion |
2022-08-31 | 8.5.2 Sample Size for r | New material on sample size vs. P(correct sign on r) |