Bios 7330: RMS Syllabus
Numbers to the right of topics indicate sequential lecture numbers. Hn stands for Harrell Chapter n in the book’s second edition. Ln stands for lecture n.
Introduction (H1, L1)
- Course overview and logistics
- Course philosophy
- Hypothesis testing vs. estimation vs. prediction
- Examples of multivariable prediction problems
- Misunderstandings about classification vs. prediction (read this also)
- Study planning considerations
- Choice of model
- Model uncertainty/data driven model selection/phantom d.f.
General methods for multivariable models (H2, L2)
- Notation for general regression models
- Model formulations
- Interpreting model parameters
- nominal predictors
- interactions
- Review of chunk tests
- Relaxing linearity assumption for continuous predictors
- avoiding categorization - see also BBR Sections 18.3.2-18.3.3
- nonparametric smoothing
- simple nonlinear terms (L3)
- splines for estimating shape of regression function and determining predictor transformations
- cubic spline functions
- restricted cubic splines
- see the interactive demos of spline fitting and continuity
- see Computing and fitting monotone splines
- nonparametric regression (smoothers)
- advantages of splines over other methods
- recursive partitioning and tree models in a nutshell
- Bayesian spline modeling: watch McElreath’s presentation
- New directions in predictive modeling (L4)
- Tests of association
- Grambsch and O’Brien paper
- Assessment of model fit
- regression assumptions
- modeling and testing complex interactions
- interactions to prespecify
- distributional assumptions
Missing data (H3, L5)
- Types of missing data
- Prelude to modeling
- Missing values for different types of response variables
- Problems with alternatives to imputation
- Strategies for developing imputation models
- Single imputation
- Predictive mean matching
- Multiple imputation
- The aregImpute algorithm (L6)
- Diagnostics
- Summary and rough guidelines; effective sample size
Multivariable modeling strategy (H4)
- Pre-specification of predictor complexity
- Variable selection
- Sample size, overfitting, and number of predictors (L7); also see this
- Shrinkage
- Collinearity
- Data reduction
- Overly influential observations (L8)
- Comparing two models
- Improving the practice of multivariable prediction
- Overall modeling strategies
Bootstrap, Validating, Describing, and Simplifying the Model (L9, H5)
- Describing the fitted model; see also this
- Bootstrap; see also Section 8.6 of BBR
- Model validation; see also this and this
- Bootstrapping ranks of predictors (L10); see also this
- Simplifying the model by approximating it
- How do we break bad habits?
R Multivariable Modeling/Validation/Presentation Software (L10, H6, BBR9)
Case Study in Longitudinal Data Modeling with Generalized Least Squares (H7, L11)
- Notation and model for mean time-response profile
- Keeping baseline variables as baseline
- Modeling within-subject dependence
- Overview of competing methods for serial data
- Checking model fit
- Software
- Case study from a randomized trial
- Bayesian re-analysis of case study using semiparametric longitudinal models
Case study in data reduction (H8, L12)
- How many parameters can be estimated?
- Redundancy analysis
- Variable clustering
- Transformation/scaling of variables using transcan
- Principal components Cox regression
- Sparse principal components
- Nonparametric transform-both-sides regression for transforming/scaling variables
Maximum Likelihood Estimation (H9, L13) | Chiara Di Gravio Lecture Notes | Donald Hedeker’s Notes
- Three test statistics
- Robust covariance matrix estimator
- Correcting variances for clustered or serial data using sandwich and bootstrap estimators
- Confidence regions
- Wald (large-sample normal approximation)
- Bootstrap
- Simultaneous (normal approx)
- General contrasts through differences in linear predictor
- Further use of the log likelihood
- Weighted MLE
- Penalized MLE
- Effective d.f.
Binary Logistic Model (H10, L15)
- Model
- Odds ratios, risk ratios, and risk differences
- Detailed example
- Estimation
- Test statistics
- Residuals
- Assessment of model fit
- Quantifying predictive ability
- Validating the model
- Describing fitted models
- R functions
- Bayesian re-analysis of age-sex-response data; see also the analysis using the brms package
Binary Logistic Case Study 1 (H11, L16)
Binary Logistic Case Study 2 (H12, L17)
Ordinal Logistic Models (H13, L18, BBR 7.6)
- Ordinality assumption
- PO Model
- CR Model
- Model
- Assumptions, interpretation of parameters, estimation, residuals
- Assessment of fit
- Extended CR model including penalization
- Validation
- R functions
Ordinal Logistic Regression Case Study (H14, L19)
Case Study in Ordinal Regression for Continuous Univariate Y (H15, L20-21)
- No transformation satisfying all linear model assumptions exists for the dataset
- Assumptions of the proportional odds ordinal logistic model (semiparametric model) are not satisfied
- Development and validation of a quantile regression model for median glycohemoglobin
- Failure of linear multiple regression
- Failure of proportional odds model for continuous gh
- Comparison with quantile regression
- Obtaining many types of predicted values
Transform-both-sides Nonparametric Additive Regression Models (H16, L21)
- Generalized additive models
- ACE
- AVAS
- Parametric approach
- Obtaining estimates on the original scale
- Smearing estimator
- R areg.boot function
- Examples
Some Components of Survival Analysis and Parametric Survival Models (H17-H18, L22)
Parametric Survival Model Case Study (H19, L23)
Cox Model (H20), Cox Model Case Study (H21) (L24)
Analysis of Covariance in Randomized Trials (BBR Chapter 13, L25)
Medical Diagnostic Research (BBR Chapter 19, L26)
Longitudinal Modeling (H22, L27-L28)
- BBR Chapter 15
- TWSTRS PO model analyses - course notes sections 7.8.3 and especially 7.8.4
- Markov longitudinal proportional odds models
- Student discussions of example problems and how to attack them