Examples of Markov Ordinal Modeling on Two Datasets

1 Introduction

This report provides analyses of two simulated datasets (available here and here) provided by Erica Brittain PhD, Mathematical Statistician and Deputy Branch Chief of the Biostatistics Research Branch of NIAID/NIH. In both datasets, the primary outcome of interest is recovery, death may occur before the patient has a chance to recover, and all patients who failed to recovered died, so there is no censoring in the usual sense. A first-order Markov partial proportional odds (PO) model is used to model time and treatment effects on a 3-level ordinal outcome assessed daily: Y=0 (recovered), 1 (not recovered but alive), 2 (dead). Y=0 and 2 are absorbing states. Of interest is a comparison of the treatment odds ratio from this discrete time state transition model with the hazard ratio from a time-to-recovery outcome using a Fine and Grey competing risk model. In such a model the cumulative incidence function is estimated in a special way but the Cox proportional hazards model hazard ratio is estimated using the original Cox partial likelihood with deaths counted as censored time to recovery at the last day of follow-up.

The ordinal longitudinal model used here has a linear time effect, either a single treatment parameter (log transition odds ratio) or event-specific log odds ratios to relax the PO assumption for treatment, and allows for the time effect to not be in proportional odds. Note that since there are only 3 Y levels, the partial PO model used here is the same as the constrained partial PO model. Partial and constrained partial PO models were developed by Peterson & Harrell 1990.

Note that the ordinal Markov models below can easily be fitted using a Bayesian approach using the R rmsb package. It is also important to note that

most studies are not adequately powered for detecting a pure mortality effect
if no PO assumption is made for treatment, the estimated mortality effect will be disconnected from the clinical recovery effect without any sharing of information, so standard errors will be increased and power reduced
perhaps the most cohesive approach is to put a Bayesian prior probability on the dissimilarity of treatment effects on recovery and death so that some borrowing of information across these outcomes can take place, preventing the required sample size from not being greatly increased
there is some initial indication that when integrative measures are used to assess treatment effectiveness (e.g., mean time dead or not recovered, which for our situation is the mean time with Y>0), the loss of power is less than one would get by estimating separate outcome-specific odds ratios

2 R Setup

outfmt <- if(knitr::is_html_output ()) 'html'  else 'pdf'
markup <- if(knitr::is_latex_output()) 'latex' else 'html'
require(rms)          # automatically engages rms(Hmisc)

require(data.table)   # used to compute various summary measures

require(VGAM)         # partial PO model

if(outfmt == 'html') require(plotly)

fdev <- switch(outfmt, html='png', pdf='pdf')
knitrSet(lang='markdown', w=7, h=7, dev=fdev, fig.path=paste0(fdev, '/d2-'))
options(prType=markup)
## If producing html, ggplot2 graphics are converted to plotly graphics
## so that hovertext will show extra information
## Assumes aes(..., label=txt) used
ggp <- if(outfmt != 'html')
  function(ggobject, ...) ggobject else Hmisc::ggplotlyr

# When outputting to pdf, make LaTeX place figures right at the point       
# in which they are created; requires float.sty (see yaml header)
if(outfmt == 'pdf') knitr::opts_chunk$set(fig.pos = 'H', out.extra='')

# Function in Hmisc that will sense chunk has results='asis' and
# print in Rmarkdown verbatim mode if needed
pr <- markupSpecs$markdown$pr

3 Dataset 1

In this dataset, both arms have exactly the same recovery times among those who recover, and the arms differ with respect to death times. The treatment arm has all early deaths, and the control arm has late deaths—but the same number of deaths. The simple logrank with Fine-Grey sees zero difference in these arms (pvalue is 1 and HR estimate is 1). Event=’REC’ means the subject recovered, and ‘DTH’ means they died.

3.1 Cox Model Analysis

Run an ordinary Cox model analysis after constructing a variable that censors death at the last follow-up time.

d <- csv.get('d1.csv')
setDT(d)
d[, ttr := ifelse(event == 'Rec', t, 30L)]
d[, ec  := ifelse(event == 'Rec', 1, 0)]
cox <- function(fg=FALSE) {
  # Jon Fintzi of NIAID/NIH provided code to run a Fine-Grey model
  # This will differ from the ordinary Cox model if there are any censored
  # observations for reasons other than death.   Set fg=TRUE if that pertains.
  if(fg) {
    require(survival)
    require(cmprsk2)
  }
  dd <- datadist(d); options(datadist=dd)
  f <- cph(Surv(ttr, ec) ~ treat, data=d)
  g <- if(fg) {
    dc <- d
    dc[, c('id', 't0', 't1', 'status') := list(1:.N, 0, t,
      factor(c(Rec=1, Dth=2)[event], 0:2, c('censor', 'recover', 'death')))]
    dfg <- finegray(Surv(time = t0, time2 = t1, status) ~ .,
                    data = dc, id = id)
    coxph(Surv(fgstart, fgstop, fgstatus) ~ treat,
          weight=fgwt,
          data=dfg, 
          ties="efron", 
          id = id)
  }
  z <- cox.zph(f, 'identity')
  pr('Test of PH', z)
  plot(z)

  list(cox=f, fg=g, hr=exp(coef(if(fg) g else f)))
}
w <- cox()

Test of PH 

       chisq df p
treat      0  1 1
GLOBAL     0  1 1

w$cox

Cox Proportional Hazards Model

 cph(formula = Surv(ttr, ec) ~ treat, data = d)

	Model Tests	Discrimination Indexes
Obs 140	LR χ² 0.00	R² 0.000
Events 100	d.f. 1	D_xy 0.000
Center 0	Pr(>χ²) 1.0000	g 0.000
	Score χ² 0.00	g_r 1.000
	Pr(>χ²) 1.0000

	β	S.E.	Wald Z	Pr(>\|Z\|)
treat	0.0000	0.2000	0.00	1.0000

if(length(w$fg)) pr('Fine-Grey Cox model', w$fg)

3.2 Partial PO Markov Model

A tall and thin dataset is constructed with daily Y=0,1,2. A second version of the tall and thin dataset carries recovery and death forward so that proportions of all daily state occupancies can easily be plotted.

revamp <- function() {
d[, id := 1:.N]
w <- d[, .(day   = 1 : t,
           y     = c(rep(1, t - 1), ifelse(event == 'Rec', 0L, 2L)),
           treat = rep(treat, t)),
       by=id]
wcf <- d[, .(day   = 1 : 30,
             y     = c(rep(1L, t - 1), 
                       rep(ifelse(event == 'Rec', 0L, 2L), 31L - t)),
             treat = paste('Treatment', rep(treat, 30))),
         by=id]
xl <- rep('', 30); at <- seq(5, 30, by=5)
xl[at] <- format(at)
setkey(wcf, treat, day)
prop <- wcf[, .(p=mean(y > 0)), by=.(treat, day)] 
mt <- prop[, .(mtu=sum(p)), by=treat]
mtu <- round(mt[, mtu], 2)
mtulab <- paste0('Mean time unrecovered for treatments 0, 1: ',
                  mtu[1], ', ', mtu[2])

print(propsPO(y ~ day + treat, data=wcf) +
  scale_x_discrete(breaks=1:30, labels=xl) +
  labs(caption='recovered: y=0\nalive, not recovered: y=1\ndead: y=2',
       title='State occupancy proportions', subtitle=mtulab))
w
}
w <- revamp()

Fit the Markov model, with and without assuming PO for treatment, and with and without including a treatment \(\times\) time interaction.

pvglm <- function(f) {
  se <- sqrt(diag(vcov(f)))
  print(cbind(beta=coef(f), se=se, z=coef(f) / se), digits=3)
  invisible()
}
# lrtest in VGAM did not use the right d.f. for partial PO models
lrt <- function(f1, f2) {
  chisq <- 2 * (logLik(f2) - logLik(f1))
  df    <- length(coef(f2)) - length(coef(f1))
  w <- data.frame('LR chi-square'=round(chisq, 2), 'd.f.'=df, p=round(1 - pchisq(chisq, df), 4))
  print(w, row.names=FALSE)
  invisible()
}
# Compute and plot state occupancy probabilities for both treatment groups, and plot
# Also compute mean time unrecovered, by treatment
sop <- function(fit, desc) {
  revo <- function(z) {
    z <- as.factor(z)
    factor(z, levels=rev(levels(as.factor(z))))
    }
  z <- NULL
  for(tx in 0 : 1) {
    s <- soprobMarkovOrdm(fit, data.frame(treat=tx, yprev=NA), times=1 : 30,
                          ylevels=0:2, absorb=c(0, 2), tvarname='day')
    mtu <- sum(1. - s[, 1])
    if(tx == 0) mtu0 <- mtu else mtu1 <- mtu
    u <- data.frame(day=as.vector(row(s)), y=as.vector(col(s) - 1), p=as.vector(s))
    u$treat <- tx
    z <- rbind(z, u)
  }
  mtulab <- paste0('Mean time unrecovered for treatments 0, 1: ',
                   round(mtu0, 2), ', ', round(mtu1, 2))
  g <- ggplot(z, aes(x=day, y=p, fill=revo(y))) +
    facet_wrap(~ paste('Treatment', treat), nrow=1) + geom_col() + ylab('Probability') +
    guides(fill=guide_legend(title='y')) +
    labs(caption='recovered: y=0\nalive, not recovered: y=1\ndead: y=2',
         title=paste('State occupancy proportions', desc),
         subtitle=mtulab)
  print(g)
}
  
mmod <- function() {
# Start with PO for treatment
cat('\nAssuming PO for treatment, no treatment x time interaction\n\n')
f <- vglm(y ~ day + treat,
          cumulative(reverse=TRUE, parallel = FALSE ~ day),
          data=w)
pvglm(f)
or <- exp(coef(f)['treat'])
pr('Transition odds ratio', or)
fpoa <- f
sop(f, 'for model with PO for tx, not for time, no tx x time interaction')

cat('\nAssuming PO for treatment, allowing treatment x time interaction\n\n')
f <- vglm(y ~ day * treat,
          cumulative(reverse=TRUE, parallel = FALSE ~ day),
          data=w)
pvglm(f)
k <- coef(f)
a <- k['treat']
b <- k['day:treat']
con <- a + b * c(5, 30)
or <- exp(coef(f)['treat'])
pr('Transition odds ratio at days 5 and 30', exp(con))
fpoia <- f
sop(f, 'for model with PO for tx, not for time, with tx x time interaction')

cat('\nNot assuming PO for treatment, no treatment x time interaction\n\n')
f <- vglm(y ~ day + treat,
          cumulative(reverse=TRUE, parallel = FALSE ~ day + treat),
          data=w)
pvglm(f)
or <- exp(coef(f)[c('treat:1', 'treat:2')])
pr('Transition odds ratios for recovery and for death', or)
fppoa <- f
sop(f, 'for model with non-PO for tx and time, no tx x time interaction')
pr('Test of PO for treatment without time x treatment', lrt(fpoa, fppoa))

cat('\nNot assuming PO for treatment, allowing treatment x time interaction\n\n')
f <- vglm(y ~ day * treat,
          cumulative(reverse=TRUE, parallel = FALSE ~ day * treat),
          data=w)
pvglm(f)
k <- coef(f)
a <- k[c('treat:1', 'treat:2')]
b <- k[c('day:treat:1', 'day:treat:2')]
for(i in 1 : 2) {
  con <- a[i] + b[i] * c(5, 30)
  pr(paste0('Transition odds ratio for Y>=', i, ' at days 5 and 30'), exp(con))
}
fppoia <- f
sop(f, 'for model with non-PO for tx and time, with tx x time interaction')
pr('Test of PO for treatment with time x treatment', lrt(fpoia, fppoia))
}
mmod()


Assuming PO for treatment, no treatment x time interaction

                beta     se       z
(Intercept):1  4.411 0.2549  17.308
(Intercept):2 -4.473 0.3314 -13.497
day:1         -0.114 0.0148  -7.678
day:2          0.057 0.0224   2.544
treat         -0.150 0.1809  -0.828

Transition odds ratio 

   treat 
0.860955


Assuming PO for treatment, allowing treatment x time interaction

                 beta     se      z
(Intercept):1  3.8060 0.2832  13.44
(Intercept):2 -5.1486 0.3985 -12.92
day:1         -0.0669 0.0194  -3.45
day:2          0.1022 0.0255   4.01
treat          1.1801 0.4031   2.93
day:treat     -0.0957 0.0256  -3.74

Transition odds ratio at days 5 and 30 

[1] 2.0166747 0.1841726


Not assuming PO for treatment, no treatment x time interaction

                 beta     se      z
(Intercept):1  4.5025 0.2625  17.15
(Intercept):2 -4.7068 0.3620 -13.00
day:1         -0.1133 0.0147  -7.68
day:2          0.0582 0.0221   2.63
treat:1       -0.3508 0.2096  -1.67
treat:2        0.3500 0.3199   1.09

Transition odds ratios for recovery and for death 

  treat:1   treat:2 
0.7041189 1.4190437


Test of PO for treatment without time x treatment 

 LR.chi.square d.f.      p
          3.48    1 0.0623

Not assuming PO for treatment, allowing treatment x time interaction

                  beta     se     z
(Intercept):1  4.46607 0.3532 12.64
(Intercept):2 -8.04051 0.8848 -9.09
day:1         -0.10898 0.0219 -4.98
day:2          0.25973 0.0427  6.09
treat:1       -0.32205 0.4806 -0.67
treat:2        5.27261 0.9518  5.54
day:treat:1   -0.00532 0.0296 -0.18
day:treat:2   -0.40039 0.0678 -5.90

Transition odds ratio for Y>=1 at days 5 and 30 

[1] 0.7056399 0.6177598

Transition odds ratio for Y>=2 at days 5 and 30 

[1] 26.328403513  0.001183626


Test of PO for treatment with time x treatment 

 LR.chi.square d.f. p
         39.29    2 0

We see a fairly strong non-PO for time (day) in both models. Consider the pair of models without treatment \(\times\) time interaction, The transition odds ratio in the model that assumes a common effect of treatment on recovery and on death indicates a slight net benefit for treatment 1. When the PO assumption is not made for treatment, the model estimates that treatment 1 improves recovery but hastens death. The regression coefficients are equal but in opposite directions. The statistical evidence for inconsistency of treatment effects from the above likelihood ratio \(\chi^2\) test comparing the two models is given by \(P=0.06\).

When time \(\times\) treatment interaction is included, there is strong evidence for such interactions, and in the model that also allowed for a different effect on mortality, there is a huge time \(\times\) treatment interaction. Though in the simulated dataset the time-related treatment effect for mortality is very discontinuous, allowing for a linear time interaction much better captures the sudden mortality increase at day 5 for treatment 1 than when the time interaction is excluded altogether.

4 Dataset 2

The second data set is more realistic. This set up assumes 30 day follow up. The logrank/Fine-Grey (i.e., all deaths are analyzed as if censored at 30 days) for the second data set has a p-value of .0225 and HR estimate of 1.441. Those numbers are not exactly reproduced below, because we have dropped three patients whose endpoint occurred at time 0.

4.1 Cox Model Analysis

First we run a Cox model analysis after constructing a variable that censors death at the last follow-up time.

d <- csv.get('d2.csv')
setDT(d)
d <- d[t > 0, ]
d[, ttr := ifelse(event == 'Rec', t, 30L)]
d[, ec  := ifelse(event == 'Rec', 1, 0)]
w <- cox()

Test of PH 

       chisq df     p
treat   8.29  1 0.004
GLOBAL  8.29  1 0.004

w$cox

Cox Proportional Hazards Model

 cph(formula = Surv(ttr, ec) ~ treat, data = d)

	Model Tests	Discrimination Indexes
Obs 187	LR χ² 4.31	R² 0.023
Events 158	d.f. 1	D_xy 0.123
Center 0.1668	Pr(>χ²) 0.0379	g 0.169
	Score χ² 4.37	g_r 1.184
	Pr(>χ²) 0.0366

	β	S.E.	Wald Z	Pr(>\|Z\|)
treat	0.3354	0.1612	2.08	0.0374

if(length(w$fg)) pr('Fine-Grey Cox model', w$fg)
hr <- w$hr

The reciprocal of the time to recovery HR is 0.715, indicating that treatment 1 has shorter time to recovery than treatment 0.

4.2 Partial PO Markov Model

Fit the Markov model, with and without assuming PO for treatment.

w <- revamp()
mmod()


Assuming PO for treatment, no treatment x time interaction

                beta     se      z
(Intercept):1  3.837 0.2162  17.75
(Intercept):2 -3.144 0.2968 -10.59
day:1         -0.096 0.0129  -7.46
day:2         -0.122 0.0401  -3.03
treat         -0.774 0.1737  -4.45

Transition odds ratio 

    treat 
0.4612666


Assuming PO for treatment, allowing treatment x time interaction

                 beta     se       z
(Intercept):1  3.6478 0.2242  16.269
(Intercept):2 -3.3251 0.3096 -10.741
day:1         -0.0815 0.0143  -5.699
day:2         -0.1065 0.0399  -2.666
treat         -0.1824 0.3066  -0.595
day:treat     -0.0633 0.0271  -2.333

Transition odds ratio at days 5 and 30 

[1] 0.6071849 0.1247370


Not assuming PO for treatment, no treatment x time interaction

                beta     se      z
(Intercept):1  4.043 0.2364 17.106
(Intercept):2 -3.637 0.3718 -9.780
day:1         -0.104 0.0134 -7.746
day:2         -0.106 0.0413 -2.577
treat:1       -1.009 0.1920 -5.255
treat:2        0.257 0.3786  0.678

Transition odds ratios for recovery and for death 

  treat:1   treat:2 
0.3646082 1.2926565


Test of PO for treatment without time x treatment 

 LR.chi.square d.f.      p
          8.86    1 0.0029

Not assuming PO for treatment, allowing treatment x time interaction

                 beta     se      z
(Intercept):1  3.8214 0.2563 14.910
(Intercept):2 -3.4905 0.4129 -8.453
day:1         -0.0890 0.0156 -5.709
day:2         -0.1313 0.0558 -2.352
treat:1       -0.4631 0.3639 -1.273
treat:2       -0.0868 0.6034 -0.144
day:treat:1   -0.0517 0.0296 -1.745
day:treat:2    0.0641 0.0856  0.748

Transition odds ratio for Y>=1 at days 5 and 30 

[1] 0.485938 0.133353

Transition odds ratio for Y>=2 at days 5 and 30 

[1] 1.263015 6.263780


Test of PO for treatment with time x treatment 

 LR.chi.square d.f.      p
          7.17    2 0.0278

We see less strong non-PO for time (day) in these models. Consider the models excluding treatment \(\times\) time interaction. The transition odds ratio in the model that assumes a common effect of treatment on recovery and on death indicates a strong net benefit for treatment 1. When the PO assumption is not made for treatment, the model estimates that treatment 1 improves recovery even more strongly, but it weakly hastens death (\(P=0.68\)). The statistical evidence for inconsistency of treatment effects from the above likelihood ratio \(\chi^2\) test comparing the two models is given by \(P=0.003\).

There is some evidence for time \(\times\) treatment interaction in the model that assumes PO for treatment. This is less pronounced in the more general model.

5 Bayesian Modeling Using Dataset 2

5.1 No Time \(\times\) Treatment Interaction

We repeat the last model fitted in the previous section that excluded time \(\times\) treatment iteraction, but now using a Bayesian MCMC procedure. This will allow us to easily get uncertainties about any derived quantities of interest, including the mean number of days unrecovered, i.e., the mean number of days over a 30-day observation period in which a patient is sick or dead. Note that the blrm function parameterizes partial PO effects differently than vglm. blrm terms for departures from PO are increments in the log odds over and above the main effect for the variable in question. In this setting the departure of interest represents the extra treatment effect on death.

stanSet()    # use maximum number of CPU cores

require(rmsb)

set.seed(1)
dd <- datadist(w); options(datadist='dd')
# The following took 30s
bmark2 <- blrm(y ~ day + treat, ~ day + treat, cppo = function(y) y, 
               data=w, file='bmark2.rds')
bmark2

Bayesian Constrained Partial Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.541 for Intercepts

 blrm(formula = y ~ day + treat, ppo = ~day + treat, cppo = function(y) y, 
     data = w, file = "bmark2.rds")

	Mixed Calibration/ Discrimination Indexes	Discrimination Indexes	Rank Discrim. Indexes
Obs 2102	B 0.066 [0.066, 0.067]	g 0.836 [0.659, 1.052]	C 0.663 [0.659, 0.664]
0 158		g_p 0.054 [0.042, 0.067]	D_xy 0.327 [0.319, 0.329]
1 1915		EV 0.037 [0.019, 0.053]
2 29		v 0.539 [0.287, 0.786]
Draws 4000		vp 0.003 [0.001, 0.004]
Chains 4
p 2

	Mode β	Mean β	Median β	S.E.	Lower	Upper	Pr(β>0)	Symmetry
y≥1	4.0440	4.0430	4.0395	0.2344	3.5996	4.5065	1.0000	1.05
y≥2	-3.6364	-3.6567	-3.6445	0.3831	-4.4401	-2.9684	0.0000	0.93
day	-0.1037	-0.1033	-0.1033	0.0145	-0.1322	-0.0757	0.0000	0.98
treat	-1.0871	-1.0836	-1.0817	0.2062	-1.4806	-0.6779	0.0000	1.00
day x f(y)	-0.0007	-0.0012	-0.0006	0.0129	-0.0242	0.0250	0.4805	0.87
treat x f(y)	0.3696	0.3673	0.3685	0.1253	0.1236	0.6141	0.9982	0.94

There is strong evidence for a different treatment effect on death than for recovery (posterior probability of this effect is 0.9982 under flat priors). There is little evidence for non-PO with respect to time (posterior probability 0.48). Compute the two odds ratios for treatment. The first, for recovery, using a cut of \(y \geq 1\) and the second, for mortality, uses a cut of \(y \geq 2\) which here is the same as \(y=2\). Here we are only looking at posterior mean parameter values.

summary(bmark2, ycut=1)

	Low	High	Δ	Effect	S.E.	Lower 0.95	Upper 0.95
Effects Response: `y`
day	4	13	9	-0.9322	0.1203	-1.1650	-0.7010
Odds Ratio	4	13	9	0.3937		0.3120	0.4961
treat	0	1	1	-1.0060	0.1936	-1.3740	-0.6224
Odds Ratio	0	1	1	0.3655		0.2531	0.5367

summary(bmark2, ycut=2)

	Low	High	Δ	Effect	S.E.	Lower 0.95	Upper 0.95
Effects Response: `y`
day	4	13	9	-0.9690	0.3742	-1.6980	-0.2599
Odds Ratio	4	13	9	0.3795		0.1831	0.7711
treat	0	1	1	0.2517	0.3890	-0.4620	1.0610
Odds Ratio	0	1	1	1.2860		0.6300	2.8890

The posterior mean odds ratios for the two events are 0.37 and 1.29, very close to the maximum likelihood estimates in the earlier frequentist analysis. Note the great uncertainty in the treatment effect on mortality (0.95 highest posterior density interval for the OR is 0.63 to 2.89).

5.2 With Time \(\times\) Treatment Interaction

Now the model is generalized to allow for time \(\times\) treatment interaction.

set.seed(2)
# The following took 33s
bmark2ia <- blrm(y ~ day * treat, ~ day * treat, cppo = function(y) y, 
                 data=w, file='bmark2ia.rds')
bmark2ia

Bayesian Constrained Partial Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.541 for Intercepts

 blrm(formula = y ~ day * treat, ppo = ~day * treat, cppo = function(y) y, 
     data = w, file = "bmark2ia.rds")

	Mixed Calibration/ Discrimination Indexes	Discrimination Indexes	Rank Discrim. Indexes
Obs 2102	B 0.066 [0.065, 0.066]	g 0.772 [0.552, 0.933]	C 0.659 [0.652, 0.665]
0 158		g_p 0.054 [0.041, 0.067]	D_xy 0.318 [0.304, 0.329]
1 1915		EV 0.042 [0.022, 0.063]
2 29		v 0.473 [0.269, 0.694]
Draws 4000		vp 0.003 [0.002, 0.004]
Chains 4
p 3

	Mode β	Mean β	Median β	S.E.	Lower	Upper	Pr(β>0)	Symmetry
y≥1	3.8189	3.8291	3.8210	0.2551	3.3493	4.3340	1.0000	1.10
y≥2	-3.4902	-3.5024	-3.4918	0.4094	-4.3694	-2.7430	0.0000	0.89
day	-0.0863	-0.0860	-0.0858	0.0168	-0.1179	-0.0522	0.0000	0.99
treat	-0.4849	-0.4893	-0.4844	0.3748	-1.2156	0.2421	0.0938	0.98
day × treat	-0.0587	-0.0585	-0.0587	0.0306	-0.1183	-0.0012	0.0262	1.01
day x f(y)	-0.0124	-0.0140	-0.0132	0.0170	-0.0494	0.0174	0.2095	0.84
treat x f(y)	0.1094	0.1097	0.1093	0.1999	-0.2676	0.5184	0.7090	1.04
day × treat x f(y)	0.0337	0.0334	0.0333	0.0261	-0.0176	0.0834	0.9002	1.02

Since the degree of non-PO for treatment was allowed to vary with time, there is no single posterior probability for the non-PO treatment effect. Let’s compute time-specific posterior probabilities of inconsistent recovery and death effects. This is done by computing the proportion of 4000 posterior draws for which the day-specific treatment effect on (non-)recovery (y=1) is less than the effect on death (y=2).

k <- contrast(bmark2ia, list(treat=1, day=1:30), list(treat=0, day=1:30), ycut=1)$cdraws -
     contrast(bmark2ia, list(treat=1, day=1:30), list(treat=0, day=1:30), ycut=2)$cdraws
p <- apply(k, 2, function(x) mean(x < 0))
ggplot(data.frame(day=1:30, p=p), aes(x=day, y=p)) + geom_line() +
  xlab('Day') + ylab('P(treatment effect greater for death than for recovery)')

5.3 Analysis of Mean Time Unrecovered Accounting for Uncertainties

A significant advantage of an integrated summary such as mean time unrecovered (time with Y > 0) is that such measures may be interpreted easily no matter how complex the model, and one does not need to choose a time horizon, assuming that all follow-up time intervals are equally important. So if a model contains differential (non-PO) treatment effects for different events and/or contains time \(\times\) treatment interactions as our model does, one can still compute the integrated summary. A choice of follow-up time does not need to be made, but one must specify the states whose occupancy probabilities are being summed. Under non-PO for treatment, patterns of mean time unrecovered may be dissimilar to patterms for mean time alive.

Compute for each treatment the posterior distribution of the mean time unrecovered. The mean time unrecovered is the sum over the 30 days of P(Y > 0) = 1 - P(Y = 0). Compute this sum for each of the 4000 posterior samples to reflect uncertainties in occupancy probabilities. Note that as long as time intervals are regular, mean time in state has no problem with discrete time, whereas the median because of poor handling of ties requires measurement resolution in hours in order to reliably estimate median in days.

mtu <- matrix(NA, nrow=4000, ncol=3, 
              dimnames=list(NULL, c('Treatment 0', 'Treatment 1', 'Delta')))
for(tx in 0:1) {
  # s is a 4000 (posterior draws) x 30 (days) x 3 (y levels) array
  s <- soprobMarkovOrdm(bmark2ia, data.frame(treat=tx, yprev=0), times=1:30, ylevels=0:2, absorb=c(0,2),
                        tvarname='day')
  s <- 1. - s[, , 1]    # P(Y > 0)  4000 x 30
  mtu[, tx + 1] <- rowSums(s)
}
mtu[, 3] <- mtu[, 2] - mtu[, 1]
g <- function(x) c(Mean=mean(x), HPDint(x))
apply(mtu, 2, g)

      Treatment 0 Treatment 1     Delta
Mean     16.61873    12.10765 -4.511077
Lower    14.71417    10.33961 -7.143029
Upper    18.60649    14.11551 -1.605320

delta <- mtu[, 3]
# Compute P(treatment 1 lowers number of unrecovered days)
mean(delta < 0)

[1] 0.99975

# Compute simulation error in this calculation
binconf(sum(delta < 0), length(delta))   # Wilson 0.95 confidence interval

 PointEst     Lower     Upper
  0.99975 0.9985852 0.9999872

# Prob at least two days lower
mean(delta < -2)

[1] 0.95725

The posterior mean of the difference in expected number of days unrecovered is -4.5 in favor of treatment 1, with 0.95 highest posterior density uncertainty interval of [-7.1, -1.6]. The posterior distribution of the differences is below. The posterior probability that treatment 1 lowers time unrecovered by some amount is 0.99975 with an MCMC simulation error of 0.001, given flat priors. The probability that it lowers time unrecovered by at least 2d is 0.96.

ggplot(data.frame(x=delta), aes(x)) + geom_density() +
  xlab('Treatment 1 - Treatment 0 Mean # Days Unrecovered') + ylab('')

6 Computing Environment

 R version 4.1.0 (2021-05-18)
 Platform: x86_64-pc-linux-gnu (64-bit)
 Running under: Pop!_OS 21.04
 
 Matrix products: default
 BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
 
 attached base packages:
 [1] splines   stats4    stats     graphics  grDevices utils     datasets 
 [8] methods   base     
 
 other attached packages:
  [1] rmsb_0.0.2           rstan_2.21.2         StanHeaders_2.21.0-7
  [4] plotly_4.9.3         VGAM_1.1-5           data.table_1.13.6   
  [7] rms_6.2-1            SparseM_1.81         Hmisc_4.6-0         
 [10] ggplot2_3.3.3        Formula_1.2-4        survival_3.2-11     
 [13] lattice_0.20-44

To cite R in publications use:

R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.

To cite the Hmisc package in publications use:

Harrell Jr F (2021). Hmisc: Harrell Miscellaneous. R package version 4.6-0, https://hbiostat.org/R/Hmisc/.

To cite the rms package in publications use:

Harrell Jr FE (2021). rms: Regression Modeling Strategies. https://hbiostat.org/R/rms/, https://github.com/harrelfe/rms.

To cite the rmsb package in publications use:

Harrell F (2021). rmsb: Bayesian Regression Modeling Strategies. R package version 0.0.2, https://hbiostat.org/R/rmsb/.

To cite the VGAM package in publications use:

Yee TW (2015). Vector Generalized Linear and Additive Models: With an Implementation in R. Springer, New York, USA.

To cite the data.table package in publications use:

Dowle M, Srinivasan A (2020). data.table: Extension of 'data.frame'. R package version 1.13.6, https://CRAN.R-project.org/package=data.table.

To cite the ggplot2 package in publications use:

Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.

To cite the rstan package in publications use:

Stan Development Team (2020). “RStan: the R interface to Stan.” R package version 2.21.2, http://mc-stan.org/.

To cite the survival package in publications use:

Therneau T (2021). A Package for Survival Analysis in R. R package version 3.2-11, https://CRAN.R-project.org/package=survival.