Markov Longitudinal Ordinal Model
1 Introduction
1.1 Setting and Goals
- Setting: Study with daily/weekly/monthly binary, ordinal, or continuous outcome assessments \(Y\)
- Understanding time course of treatment
- Avoid picking a single day
- Stay close to the raw data; avoid data reduction
- Exception: multiple outcomes on a single day; use worst
- If two events commonly occur on the same day can make a new category that is worse than either event alone (e.g., ventilator + rescue medication)
- Increase power by considering “class calls” instead of reducing data to time to a specific condition
- Not have to decide whether an early milder event is worse than a later major event
- Allow for complex correlation structures including absorbing states
- Maximizing power
- Handling missing data
- Time-to-event analysis is challenged by missing days of assessment
- Longitudinal analysis can handle this (if MAR) as well as partial information
- one range with the ordinal scale may be missing
- if no clinical event overrides that day, treat the unassessed scale interval as interval censored
- if clinical event override present, missing assessments irrelevant
- one range with the ordinal scale may be missing
- Longitudinal analysis can handle this (if MAR) as well as partial information
- Time-to-event analysis is challenged by missing days of assessment
1.2 Example of Power Increase
- VIOLET trial from the PETAL network, NHLBI
- Vitamin D in critically ill adults
- Daily assessments for 28d on 4-level ordinal \(Y\): home, hospitalized, vent/ARDS, dead
- See here for details and code
- For transition OR=0.8, power of Cox test for time to recovery 0.2, power of Markov PO model using all daily assessments 0.9.
- Efficiency gains from using more days (1 to all 28) of measurements:
2 Advantages of Discrete Time Markov Models
- One method for binary, ordinal, nominal, continuous \(Y\)
- General correlation structures, including nearly absorbing states
- Patients in hospital for many days tend to change less from day-to-day
- Previous state \(\times\) time interaction
- Patients in hospital for many days tend to change less from day-to-day
- Correlations are on the data scale so no limits on how high within-pt correlations can be
- Multi-state models yield readily interpreted estimands included a host of derived estimands
- Easier to interpret than competing risk analysis; death is an explicit outcome
- More work needed if early deaths are not to be counted as the worst outcomes (e.g., if known to be caused by late-stage disease and not treatment)
3 Alternatives
- Marginal ordinal transition models (Lee and Daniels 2007, Schildcrout 2021)
- More complex (dual models), requires customized programming, computation time longer
- Easy to interpret results
- More complex (dual models), requires customized programming, computation time longer
- Mixed effects models
- Easy to implement (especially in a Bayesian model) and fairly easy to interpret results (conditional on patient but not on previous state)
- Cannot handle absorbing or nearly absorbing states
- Unrealistic correlation structure assumption can destroy operating characteristics
- See hbiostat.org/R/rmsb/notes.html
4 Markov Proportional Odds Model
- Restrict attention to first-order models for now
- Can use the Peterson & Harrell (1990) partial proportional odds model to relax the PO assumption
- Frequentist: R
VGAM
package (constrained and unconstrained PPO model)- Bayesian: R
rmsb
package (constrained PPO model and interval censoring of \(Y\) at a given time)
- Bayesian: R
- Frequentist: R
- When PO is assumed uses only standard software
- Can efficiently handle > 6000 distinct \(Y\) values if use R
rms
packageorm
function, i.e., can handle continuous \(Y\)
- Can efficiently handle > 6000 distinct \(Y\) values if use R
- Has a number of special cases
- Single time: ordinary PO or PPO model
- \(Y\) binary, absorbing state: transition OR \(\approx\) Cox HR (see Efron 1988)
- transition probability = discrete hazard rate
- daily events \(\rightarrow\) events have low probability
- \(\rightarrow\) OR \(\approx\) RR = HR
- transition probability = discrete hazard rate
- \(Y\) binary but no absorbing state: recurrent events
- sum of state occupancy probabilities (SOP): mean time in state
- cumulative sum of SOPs: cumulative mean function
- sum of state occupancy probabilities (SOP): mean time in state
5 The Model
- First order: current state depends only on covariates, previous state, gap
- Let measurement times be \(t_{1}, t_{2}, \dots, t_{m}\), and the measurement for a patient at time \(t\) be denoted \(Y(t)\) \[\Pr(Y(t_{i}) \geq y | X, Y(t_{i-1})) = \mathrm{expit}(\alpha_{y} + X\beta + g(Y(t_{i-1}), t_{i}, t_{i} - t_{i-1}))\]
- \(g\) involves any number of regression coefficients for a main effect of \(t\), the main effect of time gap \(t_{i} - t_{i-1}\) if this is not collinear with absolute time, a main effect of the previous state, and interactions between these.
5.1 Examples of how the previous state may be modeled in \(g\):
- Linear in numeric codes for \(Y\)
- Single binary indicator for a specific state such as the lowest or highest one
- Discontinuous bi-linear relationship where there is a slope for in-hospital outcome severity, a separate slope for outpatient outcome severity, and an intercept jump at the transition from inpatient to outpatient (or vice versa).
5.2 Most Important Effects to Include
- Previous state
- Flexible function of time \(t\) since randomization
- No time effect \(\sim\) constant hazard rate
- Non-proportional odds effects for \(t\)
- Mix of events can change over time, e.g., early ventilator use, late death
- \(t \times\) previous state interaction
- Example: Hospitalized patients more stable over time = increasing effect of previous state
- Flexible function of gap times (if gap times and absolute time are virtually collinear, one of these may be omitted)
- Interaction between previous state and gap time if gaps are very non-constant
- Interaction between time and treatment if treatment effect is delayed, etc.
5.3 PO Assumption for Treatment
- \(\rightarrow\) Tx has a single coefficient
- Information borrowed across all levels of \(Y\)
- Can relax the PO assumption with a PPO or constrained PPO model
- Frequentist model: no borrowing of information across relaxed categories (\(\rightarrow\) power loss)
- Bayesian: can put skeptical prior on the non-PO Tx terms to borrow some information
- Bayesian model allows one to compute Pr(treatment effect on mortality is more than \(\epsilon\) different than treatment effect on nonfatal outcomes)
5.4 From Transition Probabilities to State Occupancy Probabilities
- For equal time spacing:
\(\Pr(Y(t)=y | X) = \sum_{j=1}^{k}\Pr(Y(t)=y | X, Y(t-1) = j) \Pr(Y(t-1) = j | X)\) - Use this recursively
- Yields a semiparametric unconditional (except for \(X\)) distribution of \(Y\) at each \(t\) (SOPs)
soprobMarkovOrd*
functions in the RHmisc
package make this easy for frequentist and Bayesian models
5.5 Special Advantage of Bayesian Models and MCMC Posterior Sampling
- SOPs involve complex derived parameters for which frequentist CLs are very hard to derive
- Bayesian posterior distribution and uncertainty intervals derived from it are trivial to compute
- Example: 4,000 posterior draws from transition model’s basic parameters; compute 4,000 values of each derived parameter (SOP; mean time in state, etc.)
5.6 Estimands
- Transition odds ratios
- Prior state and covariate-specific transition probabilities
- Covariate-specific SOPs
- Time in state \(Y=y\)
- Time in states \(Y \geq y\) (e.g., mean time unwell)
- Differences in mean time in state between treatments
- …
- Agreement among posterior probabilities of efficacy \(> \epsilon\) under PO for ACTT-1
- Almost perfect if \(\epsilon=0\) (same decisions regardless of estimand)
- Lower if \(\epsilon > 0\) (just as ARR is covariate-dependent even when OR is not)
- Almost perfect if \(\epsilon=0\) (same decisions regardless of estimand)
5.7 Assessment of Fit
5.7.1 Right Hand Side of Model
- Relax linearity assumptions using regression splines
- Usual interaction issues
5.7.2 Proportional Odds for Treatment
- Look at variation of cutoff-specific treatment ORs
- More cohensive: specify a prior for the amount of non-PO in a PPO model
5.7.3 Correlation Structure
- Variogram
- Add random effects to model and show they have a small variance
- Checking dependence on previous state is just a model right-hand-side assessment
5.7.4 General
- Posterior predictive draws (data re-simulation) vs. raw data agreement
6 More Information
- COVID-19 statistical resources: hbiostat.org/proj/covid19 including detailed analyses of the VIOLET, ORCHID, and ACTT-1 studies plus an examination of the handling of irregular measurement times in a Markov model
- Markov modeling references: hbiostat.org/bib/markov.html
- Longitudinal ordinal analysis references: hbiostat.org/bib/ordSerial.html
- General references on longitudinal data analysis: hbiostat.org/bib/serial.html
- Introduction to the proportional odds model: hbiostat.org/bbr/md chapter 7
- More about the proportional odds model: hbiostat.org/rms
- Attributes of good outcome measures
7 Computing Environment
R version 4.1.0 (2021-05-18) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Pop!_OS 21.04 Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] rmsb_0.0.2 rms_6.2-1 SparseM_1.81 Hmisc_4.6-0 [5] ggplot2_3.3.3 Formula_1.2-4 survival_3.2-11 lattice_0.20-44To cite R in publications use:
R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
To cite the Hmisc package in publications use:Harrell Jr F (2021). Hmisc: Harrell Miscellaneous. R package version 4.6-0, https://hbiostat.org/R/Hmisc/.
To cite the rms package in publications use:Harrell Jr FE (2021). rms: Regression Modeling Strategies. https://hbiostat.org/R/rms/, https://github.com/harrelfe/rms.
To cite the rmsb package in publications use:Harrell F (2021). rmsb: Bayesian Regression Modeling Strategies. R package version 0.0.2, https://hbiostat.org/R/rmsb/.
To cite the ggplot2 package in publications use:Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
To cite the survival package in publications use:Therneau T (2021). A Package for Survival Analysis in R. R package version 3.2-11, https://CRAN.R-project.org/package=survival.