Markov Longitudinal Ordinal Model

1 Introduction

1.1 Setting and Goals

  • Setting: Study with daily/weekly/monthly binary, ordinal, or continuous outcome assessments \(Y\)
  • Understanding time course of treatment
  • Avoid picking a single day
  • Stay close to the raw data; avoid data reduction
    • Exception: multiple outcomes on a single day; use worst
    • If two events commonly occur on the same day can make a new category that is worse than either event alone (e.g., ventilator + rescue medication)
  • Increase power by considering “class calls” instead of reducing data to time to a specific condition
  • Not have to decide whether an early milder event is worse than a later major event
  • Allow for complex correlation structures including absorbing states
  • Maximizing power
  • Handling missing data
    • Time-to-event analysis is challenged by missing days of assessment
      • Longitudinal analysis can handle this (if MAR) as well as partial information
        • one range with the ordinal scale may be missing
          • if no clinical event overrides that day, treat the unassessed scale interval as interval censored
          • if clinical event override present, missing assessments irrelevant

1.2 Example of Power Increase

  • VIOLET trial from the PETAL network, NHLBI
  • Vitamin D in critically ill adults
  • Daily assessments for 28d on 4-level ordinal \(Y\): home, hospitalized, vent/ARDS, dead
  • See here for details and code
  • For transition OR=0.8, power of Cox test for time to recovery 0.2, power of Markov PO model using all daily assessments 0.9.
  • Efficiency gains from using more days (1 to all 28) of measurements:

2 Advantages of Discrete Time Markov Models

  • One method for binary, ordinal, nominal, continuous \(Y\)
  • General correlation structures, including nearly absorbing states
    • Patients in hospital for many days tend to change less from day-to-day
      • Previous state \(\times\) time interaction
  • Correlations are on the data scale so no limits on how high within-pt correlations can be
  • Multi-state models yield readily interpreted estimands included a host of derived estimands
    • Easier to interpret than competing risk analysis; death is an explicit outcome
    • More work needed if early deaths are not to be counted as the worst outcomes (e.g., if known to be caused by late-stage disease and not treatment)

3 Alternatives

  • Marginal ordinal transition models (Lee and Daniels 2007, Schildcrout 2021)
    • More complex (dual models), requires customized programming, computation time longer
      • Easy to interpret results
  • Mixed effects models
    • Easy to implement (especially in a Bayesian model) and fairly easy to interpret results (conditional on patient but not on previous state)
    • Cannot handle absorbing or nearly absorbing states
      • Unrealistic correlation structure assumption can destroy operating characteristics
  • See hbiostat.org/R/rmsb/notes.html

4 Markov Proportional Odds Model

  • Restrict attention to first-order models for now
  • Can use the Peterson & Harrell (1990) partial proportional odds model to relax the PO assumption
    • Frequentist: R VGAM package (constrained and unconstrained PPO model)
      • Bayesian: R rmsb package (constrained PPO model and interval censoring of \(Y\) at a given time)
  • When PO is assumed uses only standard software
    • Can efficiently handle > 6000 distinct \(Y\) values if use R rms package orm function, i.e., can handle continuous \(Y\)
  • Has a number of special cases
    • Single time: ordinary PO or PPO model
    • \(Y\) binary, absorbing state: transition OR \(\approx\) Cox HR (see Efron 1988)
      • transition probability = discrete hazard rate
        • daily events \(\rightarrow\) events have low probability
        • \(\rightarrow\) OR \(\approx\) RR = HR
    • \(Y\) binary but no absorbing state: recurrent events
      • sum of state occupancy probabilities (SOP): mean time in state
        • cumulative sum of SOPs: cumulative mean function

4.1 Quality of Fit in VIOLET

  • Second-Order Fit: Variogram

5 The Model

  • First order: current state depends only on covariates, previous state, gap
  • Let measurement times be \(t_{1}, t_{2}, \dots, t_{m}\), and the measurement for a patient at time \(t\) be denoted \(Y(t)\) \[\Pr(Y(t_{i}) \geq y | X, Y(t_{i-1})) = \mathrm{expit}(\alpha_{y} + X\beta + g(Y(t_{i-1}), t_{i}, t_{i} - t_{i-1}))\]
  • \(g\) involves any number of regression coefficients for a main effect of \(t\), the main effect of time gap \(t_{i} - t_{i-1}\) if this is not collinear with absolute time, a main effect of the previous state, and interactions between these.

5.1 Examples of how the previous state may be modeled in \(g\):

  • Linear in numeric codes for \(Y\)
  • Single binary indicator for a specific state such as the lowest or highest one
  • Discontinuous bi-linear relationship where there is a slope for in-hospital outcome severity, a separate slope for outpatient outcome severity, and an intercept jump at the transition from inpatient to outpatient (or vice versa).

5.2 Most Important Effects to Include

  • Previous state
  • Flexible function of time \(t\) since randomization
    • No time effect \(\sim\) constant hazard rate
  • Non-proportional odds effects for \(t\)
    • Mix of events can change over time, e.g., early ventilator use, late death
  • \(t \times\) previous state interaction
    • Example: Hospitalized patients more stable over time = increasing effect of previous state
  • Flexible function of gap times (if gap times and absolute time are virtually collinear, one of these may be omitted)
  • Interaction between previous state and gap time if gaps are very non-constant
  • Interaction between time and treatment if treatment effect is delayed, etc.

5.3 PO Assumption for Treatment

  • \(\rightarrow\) Tx has a single coefficient
  • Information borrowed across all levels of \(Y\)
  • Can relax the PO assumption with a PPO or constrained PPO model
  • Frequentist model: no borrowing of information across relaxed categories (\(\rightarrow\) power loss)
  • Bayesian: can put skeptical prior on the non-PO Tx terms to borrow some information
  • Bayesian model allows one to compute Pr(treatment effect on mortality is more than \(\epsilon\) different than treatment effect on nonfatal outcomes)

5.4 From Transition Probabilities to State Occupancy Probabilities

  • For equal time spacing:
    \(\Pr(Y(t)=y | X) = \sum_{j=1}^{k}\Pr(Y(t)=y | X, Y(t-1) = j) \Pr(Y(t-1) = j | X)\)
  • Use this recursively
  • Yields a semiparametric unconditional (except for \(X\)) distribution of \(Y\) at each \(t\) (SOPs)
  • soprobMarkovOrd* functions in the R Hmisc package make this easy for frequentist and Bayesian models

5.5 Special Advantage of Bayesian Models and MCMC Posterior Sampling

  • SOPs involve complex derived parameters for which frequentist CLs are very hard to derive
  • Bayesian posterior distribution and uncertainty intervals derived from it are trivial to compute
  • Example: 4,000 posterior draws from transition model’s basic parameters; compute 4,000 values of each derived parameter (SOP; mean time in state, etc.)

5.6 Estimands

  • Transition odds ratios
  • Prior state and covariate-specific transition probabilities
  • Covariate-specific SOPs
  • Time in state \(Y=y\)
  • Time in states \(Y \geq y\) (e.g., mean time unwell)
  • Differences in mean time in state between treatments
  • Agreement among posterior probabilities of efficacy \(> \epsilon\) under PO for ACTT-1
    • Almost perfect if \(\epsilon=0\) (same decisions regardless of estimand)
      • Lower if \(\epsilon > 0\) (just as ARR is covariate-dependent even when OR is not)

5.7 Assessment of Fit

5.7.1 Right Hand Side of Model

  • Relax linearity assumptions using regression splines
  • Usual interaction issues

5.7.2 Proportional Odds for Treatment

  • Look at variation of cutoff-specific treatment ORs
  • More cohensive: specify a prior for the amount of non-PO in a PPO model

5.7.3 Correlation Structure

  • Variogram
  • Add random effects to model and show they have a small variance
  • Checking dependence on previous state is just a model right-hand-side assessment

5.7.4 General

  • Posterior predictive draws (data re-simulation) vs. raw data agreement

6 More Information

7 Computing Environment

 R version 4.1.0 (2021-05-18)
 Platform: x86_64-pc-linux-gnu (64-bit)
 Running under: Pop!_OS 21.04
 
 Matrix products: default
 BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
 
 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base     
 
 other attached packages:
 [1] rmsb_0.0.2      rms_6.2-1       SparseM_1.81    Hmisc_4.6-0    
 [5] ggplot2_3.3.3   Formula_1.2-4   survival_3.2-11 lattice_0.20-44
 
To cite R in publications use:

R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.

To cite the Hmisc package in publications use:

Harrell Jr F (2021). Hmisc: Harrell Miscellaneous. R package version 4.6-0, https://hbiostat.org/R/Hmisc/.

To cite the rms package in publications use:

Harrell Jr FE (2021). rms: Regression Modeling Strategies. https://hbiostat.org/R/rms/, https://github.com/harrelfe/rms.

To cite the rmsb package in publications use:

Harrell F (2021). rmsb: Bayesian Regression Modeling Strategies. R package version 0.0.2, https://hbiostat.org/R/rmsb/.

To cite the ggplot2 package in publications use:

Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.

To cite the survival package in publications use:

Therneau T (2021). A Package for Survival Analysis in R. R package version 3.2-11, https://CRAN.R-project.org/package=survival.