Longitudinal Ordinal Analysis for ORCHID
Introduction
ORCHID was a randomized clinical trial of Hydroxychloroquine vs. placebo for treatment of hospitalized COVID-19 patients. The main publication is here. The primary analysis for the trial used a covariate-adjusted proportional odds model for the 7-level WHO COVID ordinal scale measured at 14 days. There was little evidence for any effectiveness of the drug, despite the fact that the U.S. stockpiled 63,000,000 doses of it. The purpose of the reanalyses done here is to learn from a longitudinal analysis using all available ordinal outcomes. These are assessed at days 1, 2, 3, 4, 7, 14, 28 after randomization. It will be seen that even with this more powerful longitudinal analysis, there remains very little evidence for drug effectiveness.
After much experimentation with various longitudinal models (and also see here), a first-order Markov state transition model for ordinal outcome states provides the best fit to the correlation structure found in the data. The Markov model uses the previous ordinal state as a (categorical) predictor of the current state. It allows the correlation between states on two successive days to be arbitrarily high within patient, and it elegantly handles absorbing states such as death. Excellent handling of the correlation structure is done without the complexities of random effects. See below for justification of this statement. Go here for a list of papers about Markov longitudinal modeling of binary and ordinal outcomes.
Markov models in their usual form assume that the outcome variable is also assessed at baseline (time zero). Such is the case for ORCHID. The fact that some of the outcome levels (e.g., death) are not used at baseline is of no consequence.
The only obstacle to using transition models is that the primary model parameters (here, in a proportional odds model) relate to the probability of being in the current state or higher given the previous state. For day 1 post randomization, this is no problem. But we need to restate the model to obtain time-unconditional probabilities of being in state y or worse (“state occupancy probabilities”). This is done through recursive (through time) matrix multiplications to “uncondition” the transition probabilities to get probabilities of being in the current state regardless of previous states. This are simple to interpret quantities that are functions of all model parameters. Because of that it is difficult to get confidence intervals and statistical tests using frequentist methods. In examples below it is seen that the Bayesian approach provides exact inference, taking all uncertainties into account. Graphs will show the probability of the patient being at home as a function of time and treatment. Home is almost, but not quite, an absorbing state. So it is possible that such graphs will increase up to a point then slightly decrease (that didn’t happen in this study). If we consider home to be an absorbing state, then the graph is a cumulative incidence graph.
Various ways of restating the model results are also given below. One of these is the mean restricted time to the patient going home.
Note that the transition model needs to take absolute time (days from randomization) into account, as well as previous state. That is because the mix of outcomes changes over time in a way that simple transition odds ratios can’t capture.
To be added to this report is an example of using the fitted model to simulate longitudinal ordinal data, and using the simulation to estimate probabilities of more complex quantities than can easily be obtained by the analytic solutions shown here. For example, one can simulate data to compute the probability that patients on a given treatment with given baseline covariates will be alive on day 28 if they are on a ventilator on day 14.
require(rmsb)require(data.table)require(plotly)knitrSet(lang='markdown', w=7, h=7, dev='png', fig.path='png/a-')
options(prType='html')
# Get around a bug in tooltip construction with ggplotly
# See https://stackoverflow.com/questions/66316337
ggp <- function(obj) {
  g <- ggplotly(obj, tooltip='label')
    d <- g$x$data
    for(i in 1 : length(d)) {
      w <- d[[i]]$text
    if(length(w)) d[[i]]$text <- gsub('txt: ', '', w)
        }
    g$x$data <- d
    g
    }
    
d <- readRDS('source/orchid.rds')
d[, gap := day - shift(day), by=id]
d[is.na(gap), gap := 1]
label(d$gap) <- 'Days since last measurement'
d[, table(day, gap)]##     gap
## day    1   3   7  14
##   1  479   0   0   0
##   2  479   0   0   0
##   3  478   0   0   0
##   4  475   0   0   0
##   7    0 474   0   0
##   14   0   0 461   0
##   28   0   0   0 447# Also fetch ORCHID dataset for all days, carrying death forward
dall <- readRDS('source/orchidAllDays.rds')
ordLevels <-
    c('death',
    'hospitalized on invasive mechanical ventilation/ECMO',
    'hospitalized on non-invasive ventilation or high flow nasal cannula',
    'hospitalized on supplemental oxygen',
    'hospitalized not on supplemental oxygen',
    'not hospitalized with limitation of activity',
    'not hospitalized without limitation in activity')Descriptive Statistics
html(describe(d))18 Variables 3293 Observations
id: Subject ID
| n | missing | distinct | 
|---|---|---|
| 3293 | 0 | 479 | 
| lowest : | ORCH-0001 | ORCH-0002 | ORCH-0003 | ORCH-0004 | ORCH-0005 | 
| highest: | ORCH-0475 | ORCH-0476 | ORCH-0477 | ORCH-0478 | ORCH-0479 | 
age
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3293 | 0 | 70 | 1 | 56.51 | 18.67 | 29 | 35 | 44 | 57 | 68 | 78 | 85 | 
icu
| n | missing | distinct | Info | Sum | Mean | Gmd | 
|---|---|---|---|---|---|---|
| 3293 | 0 | 2 | 0.456 | 616 | 0.1871 | 0.3042 | 
bmi: Body Mass Index
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3060 | 233 | 396 | 1 | 32.8 | 10.01 | 20.80 | 22.68 | 26.84 | 31.14 | 36.93 | 44.46 | 50.22 | 
| lowest : | 7.098765 | 8.650519 | 11.428571 | 17.689789 | 17.968750 | 
| highest: | 67.520776 | 67.827191 | 75.054815 | 78.694069 | 96.559285 | 
y0: Baseline status
| n | missing | distinct | Info | Mean | Gmd | 
|---|---|---|---|---|---|
| 3293 | 0 | 4 | 0.85 | 4.114 | 0.8686 | 
Value 2 3 4 5 Frequency 214 364 1549 1166 Proportion 0.065 0.111 0.470 0.354
deathday: Days from randomization to death
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 318 | 2975 | 25 | 0.996 | 15.38 | 9.578 | 5 | 6 | 8 | 15 | 20 | 28 | 30 | 
pfratio: PaO2 / FiO2
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 403 | 2890 | 53 | 0.999 | 143.8 | 94 | 45.00 | 53.33 | 80.00 | 127.50 | 220.00 | 267.86 | 295.24 | 
| lowest : | 27.36842 | 30.00000 | 44.00000 | 45.00000 | 50.00000 | 
| highest: | 290.47619 | 295.23810 | 322.50000 | 343.33333 | 438.09524 | 
race
| n | missing | distinct | 
|---|---|---|
| 3293 | 0 | 3 | 
Value black white other Frequency 789 1451 1053 Proportion 0.240 0.441 0.320
sfratio
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3125 | 168 | 161 | 0.999 | 348.6 | 118.6 | 124.3 | 178.0 | 283.3 | 383.3 | 438.1 | 452.4 | 461.9 | 
| lowest : | 0.9666667 | 30.0000000 | 69.0000000 | 78.0000000 | 79.0000000 | 
| highest: | 457.1428571 | 461.9047619 | 466.6666667 | 471.4285714 | 476.1904762 | 
sofa_gcs
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3293 | 0 | 17 | 0.963 | 3.088 | 2.658 | 0 | 1 | 1 | 2 | 4 | 6 | 9 | 
 Value          0     1     2     3     4     5     6     7     8     9    10    11
 Frequency    189   741   891   503   390   182    84    67    33    61    25    14
 Proportion 0.057 0.225 0.271 0.153 0.118 0.055 0.026 0.020 0.010 0.019 0.008 0.004
                                         
 Value         12    13    14    16    17
 Frequency     54    33    12     7     7
 Proportion 0.016 0.010 0.004 0.002 0.002
 
 sofa_nogcs
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3293 | 0 | 13 | 0.959 | 2.773 | 2.188 | 0 | 1 | 1 | 2 | 4 | 6 | 7 | 
 Value          0     1     2     3     4     5     6     7     8     9    11    13
 Frequency    189   783   918   539   359   165    82   110    47    75    12     7
 Proportion 0.057 0.238 0.279 0.164 0.109 0.050 0.025 0.033 0.014 0.023 0.004 0.002
                 
 Value         14
 Frequency      7
 Proportion 0.002
 
 time_to_recovery
| n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3293 | 0 | 29 | 0.98 | 9.132 | 10.45 | 0 | 0 | 0 | 5 | 15 | 28 | 28 | 
sex
| n | missing | distinct | 
|---|---|---|
| 3293 | 0 | 2 | 
Value female male Frequency 1448 1845 Proportion 0.44 0.56
trt: Treatment
| n | missing | distinct | 
|---|---|---|
| 3293 | 0 | 2 | 
Value placebo hydroxychloroquine Frequency 1631 1662 Proportion 0.495 0.505
day: Days since randomization
| n | missing | distinct | Info | Mean | Gmd | 
|---|---|---|---|---|---|
| 3293 | 0 | 7 | 0.98 | 8.217 | 8.698 | 
Value 1 2 3 4 7 14 28 Frequency 479 479 478 475 474 461 447 Proportion 0.145 0.145 0.145 0.144 0.144 0.140 0.136
y: Ordinal Outcome
| n | missing | distinct | Info | Mean | Gmd | 
|---|---|---|---|---|---|
| 3293 | 0 | 7 | 0.963 | 4.712 | 1.804 | 
Value 1 2 3 4 5 6 7 Frequency 50 357 297 817 526 757 489 Proportion 0.015 0.108 0.090 0.248 0.160 0.230 0.148
yprev: Previous ordinal outcome
| n | missing | distinct | Info | Mean | Gmd | 
|---|---|---|---|---|---|
| 3293 | 0 | 6 | 0.952 | 4.486 | 1.572 | 
Value 2 3 4 5 6 7 Frequency 366 352 1022 683 610 260 Proportion 0.111 0.107 0.310 0.207 0.185 0.079
gap: Days since last measurement
| n | missing | distinct | Info | Mean | Gmd | 
|---|---|---|---|---|---|
| 3293 | 0 | 4 | 0.796 | 3.892 | 4.216 | 
Value 1 3 7 14 Frequency 1911 474 461 447 Proportion 0.580 0.144 0.140 0.136
Event Chart for Random Sample of Patients
A status timeline is shown for each of 65 randomly chosen patients. The patients are sorted so that the worst outcomes at the last follow-up day are at the top. One day is subtract from the time variable to make follow-up start at day 0.
ssamp <- sample(unique(d$id), 65, FALSE)
dr        <- subset(d, id %in% ssamp)
dr$id     <- as.integer(as.factor(dr$id))
dr$status <- factor(dr$y, levels=7:1)
dr$day    <- dr$day - 1
multEventChart(status ~ day + id, data=dr,
               absorb='1', sortbylast = TRUE) +
  theme_classic() +
  theme(legend.position='bottom')Summary of Successive State Transitions
gg <- propsTrans(y ~ day + id, data=dall, maxsize=6, labels=ordLevels) +
        theme(legend.position='bottom',
        axis.text.x=element_text(angle=90, hjust=1))
ggp(gg)# For a nice event chart for this type of data see https://livefreeordichotomize.com/2020/05/21/survival-model-detective-1Category Proportions Over Time
The following plot shows how the mix of outcomes changes over time stratified also by treatment.
gg <- propsPO(y ~ trt + day, nrow=1, data=dall) +
        theme(legend.position='bottom',
                      axis.text.x=element_text(angle=90, hjust=1))
ggp(gg)Single-Day Analysis
To provide a result similar to the ORCHID’s pre-specified primary analysis, we use a proportional odds model for day 14 status in isolation.
d14 <- subset(dall, day == 14)
d14$y0 <- as.factor(d14$y0)
f <- lrm(y ~ trt + y0 + rcs(age, 3) + race + sofa_nogcs + sex, data=d14)
fLogistic Regression Model
 lrm(formula = y ~ trt + y0 + rcs(age, 3) + race + sofa_nogcs + 
     sex, data = d14)
 
Frequencies of Responses
1 2 3 4 5 6 7 32 42 12 40 37 169 147
| Model Likelihood Ratio Test | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 479 | LR χ2 176.28 | R2 0.320 | C 0.731 | 
| max |∂log L/∂β| 2×10-12 | d.f. 10 | g 1.360 | Dxy 0.461 | 
| Pr(>χ2) <0.0001 | gr 3.895 | γ 0.462 | |
| gp 0.180 | τa 0.349 | ||
| Brier 0.106 | 
| β | S.E. | Wald Z | Pr(>|Z|) | |
|---|---|---|---|---|
| y≥2 | 4.2312 | 0.8255 | 5.13 | <0.0001 | 
| y≥3 | 3.0594 | 0.8146 | 3.76 | 0.0002 | 
| y≥4 | 2.8138 | 0.8141 | 3.46 | 0.0005 | 
| y≥5 | 2.1481 | 0.8141 | 2.64 | 0.0083 | 
| y≥6 | 1.6364 | 0.8134 | 2.01 | 0.0442 | 
| y≥7 | -0.2346 | 0.8054 | -0.29 | 0.7708 | 
| trt=hydroxychloroquine | 0.0437 | 0.1701 | 0.26 | 0.7973 | 
| y0=3 | -0.1120 | 0.4477 | -0.25 | 0.8025 | 
| y0=4 | 0.9315 | 0.4550 | 2.05 | 0.0406 | 
| y0=5 | 1.5653 | 0.4925 | 3.18 | 0.0015 | 
| age | -0.0098 | 0.0127 | -0.77 | 0.4395 | 
| age’ | -0.0222 | 0.0153 | -1.45 | 0.1475 | 
| race=white | -0.4846 | 0.2244 | -2.16 | 0.0308 | 
| race=other | 0.0313 | 0.2452 | 0.13 | 0.8985 | 
| sofa_nogcs | -0.2699 | 0.0588 | -4.59 | <0.0001 | 
| sex=male | -0.1971 | 0.1820 | -1.08 | 0.2788 | 
Serial Conditional Markov Ordinal Model
Here we use a first order Markov process where the PO model on a given day is conditional on the ordinal outcome on the previous day as well as on the other variables. The 1-day lagged outcome status is variable yprev. Note that the gaps (1 1 1 3 7 14) between measurement times (1 2 3 4 7 14 28) has a linear correlation with the measurement times of 0.998, so including day as an effect modifier of previous status is similar to allowing the influence of the previous state to be a function of the gap duration.
First do a consolidated analysis in which all observed gap-previous state combinations are modeled as indicator variables. The reference cell here is gap=1 previous state 2. Then look at effects of previous state stratified by distinct levels of gap. Interpretation is clouded by all effects being relative to previous state 2.
d$yprev <- factor(d$yprev)
dd <- datadist(d); options(datadist='dd')
ugaps <- sort(unique(d$gap))
z <- with(d, interaction(gap, yprev, drop=TRUE))
f <- lrm(y ~ z + day + rcs(age, 3) + race + sofa_nogcs + sex, data=d)
co <- coef(f)[grep('z=', names(coef(f)))]
names(co) <- sub('z=', '', names(co))
n  <- names(co)
x  <- strsplit(n, '\\.')
g  <- sapply(x, function(u) as.numeric(u[1]))    # gaps
yp <- sapply(x, function(u) as.numeric(u[2]))    # yprev
ef  <- data.frame(gap=g, yp=yp, effect=unname(co))
# Reference cell in model was gap=1 yprev=2 so add a zero for this
ef <- rbind(data.frame(gap=1, yp=2, effect=0), ef)
print(ef, digits=3)   gap yp effect
1    1  2  0.000
2    3  2 -0.796
3    7  2 -1.626
4   14  2 -5.056
5    1  3  2.365
6    3  3  2.057
7    7  3  1.790
8   14  3  1.104
9    1  4  4.959
10   3  4  6.356
11   7  4  5.427
12  14  4  4.052
13   1  5  7.388
14   3  5  8.255
15   7  5  7.345
16  14  5  5.390
17   1  6  9.311
18   3  6  9.715
19   7  6  8.069
20  14  6  5.656
21   7  7  9.385
22  14  7  7.396# Fit gap trend separately by yprev
ggplot(ef, aes(x=gap, y=effect, col=factor(yp))) + geom_line() + geom_point()
n <- c('yprev=3', 'yprev=4', 'yprev=5', 'yprev=6', 'yprev=7')
k <- matrix(NA, nrow=length(ugaps), ncol=length(n), dimnames=list(ugaps, n))
i <- 0
for(gp in ugaps) {
  i <- i + 1
  s <- subset(d, gap == gp)
  s$yprev <- s$yprev[drop=TRUE]
  form <- if(gp == 1) y ~ day + yprev + rcs(age, 3) + race + sofa_nogcs + sex else
    y ~ yprev + rcs(age, 3) + race + sofa_nogcs + sex
  form <- y ~ yprev
  if(gp == 1) form <- y ~ day + yprev
  f <- lrm(form,  data=s)
  m <- intersect(n, names(coef(f)))
  k[i, m] <- coef(f)[m]
}round(k, 2)   yprev=3 yprev=4 yprev=5 yprev=6 yprev=7
1     4.28    7.99   11.34   23.54      NA
3     2.43    4.80    5.92    6.78      NA
7     1.71    3.19    4.71    5.45    6.72
14    1.68    3.32    4.47    4.91    6.64z <- zp <- NULL
xx <- seq(1, 14, by=.1)
w <- data.frame(ugaps=xx)
for(j in 1 : 5) {
  y <- k[, j]
  mods <- if(j == 5) 'power fixed' else  c('linear spline', 'exponential', 'power', 'power fixed')
  for(model in mods) {
  form <- switch(model,
                 'linear spline' = y ~ lsp(ugaps, 3),
                 'exponential'   = log(y) ~ ugaps,
                 'power'         = log(y) ~ log(ugaps),
                 'power fixed'   = NA)
  if(model != 'power fixed') {
    f <- ols(form)
    if(model == 'power') print(coef(f))
    p <- predict(f, w)
    if(model != 'linear spline') p <- exp(p)
  } else {
    const <- exp(mean(log(y / (ugaps ^ -0.32)), na.rm=TRUE))
    p <- const * (xx ^ -0.32)
  }
  prev <- paste('Previous state:', j + 2)
  z  <- rbind(z,  data.frame(model=model, gap=xx,    y=p, yprev=prev))
  zp <- rbind(zp, data.frame(model=model, gap=ugaps, y=y, yprev=prev)) 
  }
} Intercept      ugaps 
 1.3760590 -0.3708071 
 Intercept      ugaps 
 2.0091496 -0.3570896 
Intercept     ugaps 
 2.320715 -0.356863 
 Intercept      ugaps 
 2.9231801 -0.5866748 ggplot(z, aes(x=gap, y=y, color=model)) + geom_line() +
  geom_point(data=zp, aes(x=gap, y=y)) +
  facet_wrap(~ yprev)By combining the patterns of both approaches, time gap is effectively modeled as a linear spline with knot at 3, which allows for a very different effect at the most command gap of 1d. But the use of two terms for the gap effect will result in a singularity when interacting gap with previous state, because some previous states do not occur at some gaps. The power function specifying a gap effect of a constant times \(d^{-0.32}\) where \(d\) is the time gap fits very well except for previous state 6, which is the only state in the above graph for which you can see a difference between the fitted power function (which estimated an exponent of -0.58) and this function. By using a single parameter (the exponent) to capture the gap effect for all possible gaps one can reduce the number of parameters to estimate by omitting previous state main effects from the model (see below). But again the fact that we are subtracting effects for being in previous state 2 makes this confusing. This will be explored in a better way in the VIOLET dataset where artifical gaps can be created in a way that is independent of absolute time, since in VIOLET all consecutive days had outcome assessments.
In formulating our model we need the following considerations.
- transitions depend on a flexible function of time
- the effect of the previous state may change with time or gap time
- the treatment effect may change with time
- there may be a weird effect where the effect of treatment may differ according to the previous outcome state
Due to a singularity (caused by collinearity), a model could not be fitted that allowed previous state to interact with both day and gap. Also, a singularity resulted when the linear spline is used for gap. There are no records with yprev=7 for gap < 7. So force gap to be linear for now. Let’s fit two models to see which one more strongly interacts, then use that one by itself.
f1 <- lrm(y ~ gap * yprev + pol(day,2) * (trt + rcs(age, 3) +
                 race + sofa_nogcs + sex) +
         yprev * trt,
         data=d)
f2 <- lrm(y ~ gap  + day * (yprev + trt + rcs(age, 3) +
                 race + sofa_nogcs + sex) +
         yprev * trt,
         data=d)
AIC(f1); AIC(f2)   # first model way better[1] 7004.827 [1] 7060.269
f <- f1
anova(f)| Wald Statistics for y | |||
| χ2 | d.f. | P | |
|---|---|---|---|
| gap (Factor+Higher Order Factors) | 110.32 | 6 | <0.0001 | 
| All Interactions | 108.29 | 5 | <0.0001 | 
| yprev (Factor+Higher Order Factors) | 1878.87 | 15 | <0.0001 | 
| All Interactions | 114.56 | 10 | <0.0001 | 
| day (Factor+Higher Order Factors) | 129.38 | 16 | <0.0001 | 
| All Interactions | 45.62 | 14 | <0.0001 | 
| Nonlinear (Factor+Higher Order Factors) | 79.28 | 8 | <0.0001 | 
| trt (Factor+Higher Order Factors) | 7.15 | 8 | 0.5204 | 
| All Interactions | 7.11 | 7 | 0.4177 | 
| age (Factor+Higher Order Factors) | 45.95 | 6 | <0.0001 | 
| All Interactions | 18.47 | 4 | 0.0010 | 
| Nonlinear (Factor+Higher Order Factors) | 3.94 | 3 | 0.2683 | 
| race (Factor+Higher Order Factors) | 12.28 | 6 | 0.0560 | 
| All Interactions | 8.56 | 4 | 0.0732 | 
| sofa_nogcs (Factor+Higher Order Factors) | 29.11 | 3 | <0.0001 | 
| All Interactions | 9.20 | 2 | 0.0100 | 
| sex (Factor+Higher Order Factors) | 1.19 | 3 | 0.7560 | 
| All Interactions | 0.41 | 2 | 0.8151 | 
| gap × yprev (Factor+Higher Order Factors) | 108.29 | 5 | <0.0001 | 
| day × trt (Factor+Higher Order Factors) | 2.45 | 2 | 0.2932 | 
| Nonlinear | 1.24 | 1 | 0.2663 | 
| Nonlinear Interaction : f(A,B) vs. AB | 1.24 | 1 | 0.2663 | 
| day × age (Factor+Higher Order Factors) | 18.47 | 4 | 0.0010 | 
| Nonlinear | 10.50 | 3 | 0.0148 | 
| Nonlinear Interaction : f(A,B) vs. AB | 10.50 | 3 | 0.0148 | 
| f(A,B) vs. Af(B) + Bg(A) | 2.82 | 1 | 0.0933 | 
| Nonlinear Interaction in day vs. Af(B) | 9.85 | 2 | 0.0073 | 
| Nonlinear Interaction in age vs. Bg(A) | 3.43 | 2 | 0.1799 | 
| day × race (Factor+Higher Order Factors) | 8.56 | 4 | 0.0732 | 
| Nonlinear | 7.10 | 2 | 0.0287 | 
| Nonlinear Interaction : f(A,B) vs. AB | 7.10 | 2 | 0.0287 | 
| day × sofa_nogcs (Factor+Higher Order Factors) | 9.20 | 2 | 0.0100 | 
| Nonlinear | 8.86 | 1 | 0.0029 | 
| Nonlinear Interaction : f(A,B) vs. AB | 8.86 | 1 | 0.0029 | 
| day × sex (Factor+Higher Order Factors) | 0.41 | 2 | 0.8151 | 
| Nonlinear | 0.00 | 1 | 0.9892 | 
| Nonlinear Interaction : f(A,B) vs. AB | 0.00 | 1 | 0.9892 | 
| yprev × trt (Factor+Higher Order Factors) | 6.07 | 5 | 0.2990 | 
| TOTAL NONLINEAR | 80.29 | 10 | <0.0001 | 
| TOTAL INTERACTION | 153.84 | 24 | <0.0001 | 
| TOTAL NONLINEAR + INTERACTION | 234.98 | 26 | <0.0001 | 
| TOTAL | 2512.52 | 39 | <0.0001 | 
ggplot(Predict(f, gap, yprev))There was little evidence for a treatment effect at any time, and little evidence that the effect of treatment on state transitions varied with time. There is also little evidence that the treatment effect differed for different previous states.
The graph above shows that being at home without disability is almost an absorbing state, and that the effect of previous state varies with time gap.
For now let’s simplify the model by having previous state only interact with gap (despite some evidence for the age effect varying over time), and removing the interactions with treatment. Now allow the effect of absolute time to be quadratic1. Compare the variances of parameter estimates with cluster sandwich robust estimates.
f <- lrm(y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 3) + race + sofa_nogcs + sex,
         data=d, x=TRUE, y=TRUE)
fLogistic Regression Model
 lrm(formula = y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 
     3) + race + sofa_nogcs + sex, data = d, x = TRUE, y = TRUE)
 
Frequencies of Responses
1 2 3 4 5 6 7 50 357 297 817 526 757 489
| Model Likelihood Ratio Test | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 3293 | LR χ2 4766.76 | R2 0.787 | C 0.908 | 
| max |∂log L/∂β| 2×10-9 | d.f. 20 | g 4.186 | Dxy 0.816 | 
| Pr(>χ2) <0.0001 | gr 65.746 | γ 0.821 | |
| gp 0.306 | τa 0.667 | ||
| Brier 0.058 | 
| β | S.E. | Wald Z | Pr(>|Z|) | |
|---|---|---|---|---|
| y≥2 | 2.4712 | 0.3281 | 7.53 | <0.0001 | 
| y≥3 | -1.0620 | 0.3185 | -3.33 | 0.0009 | 
| y≥4 | -2.9688 | 0.3342 | -8.88 | <0.0001 | 
| y≥5 | -6.0554 | 0.3481 | -17.40 | <0.0001 | 
| y≥6 | -7.8957 | 0.3556 | -22.20 | <0.0001 | 
| y≥7 | -10.8037 | 0.3689 | -29.28 | <0.0001 | 
| gap | -0.2184 | 0.0839 | -2.60 | 0.0093 | 
| yprev=3 | 2.2491 | 0.2230 | 10.09 | <0.0001 | 
| yprev=4 | 4.8417 | 0.2303 | 21.02 | <0.0001 | 
| yprev=5 | 7.2783 | 0.2560 | 28.43 | <0.0001 | 
| yprev=6 | 9.3438 | 0.2776 | 33.66 | <0.0001 | 
| yprev=7 | 9.2858 | 0.5779 | 16.07 | <0.0001 | 
| day | 0.2792 | 0.0392 | 7.12 | <0.0001 | 
| day2 | -0.0052 | 0.0007 | -7.01 | <0.0001 | 
| trt=hydroxychloroquine | 0.0134 | 0.0683 | 0.20 | 0.8444 | 
| age | -0.0086 | 0.0049 | -1.75 | 0.0800 | 
| age’ | -0.0034 | 0.0058 | -0.60 | 0.5515 | 
| race=white | -0.0641 | 0.0877 | -0.73 | 0.4650 | 
| race=other | 0.0784 | 0.0951 | 0.82 | 0.4098 | 
| sofa_nogcs | -0.0886 | 0.0205 | -4.31 | <0.0001 | 
| sex=male | -0.0612 | 0.0719 | -0.85 | 0.3948 | 
| gap × yprev=3 | 0.2064 | 0.0613 | 3.37 | 0.0008 | 
| gap × yprev=4 | 0.3304 | 0.0385 | 8.59 | <0.0001 | 
| gap × yprev=5 | 0.2404 | 0.0392 | 6.13 | <0.0001 | 
| gap × yprev=6 | 0.0953 | 0.0332 | 2.87 | 0.0041 | 
| gap × yprev=7 | 0.2302 | 0.0569 | 4.04 | <0.0001 | 
anova(f)| Wald Statistics for y | |||
| χ2 | d.f. | P | |
|---|---|---|---|
| gap (Factor+Higher Order Factors) | 106.77 | 6 | <0.0001 | 
| All Interactions | 104.49 | 5 | <0.0001 | 
| yprev (Factor+Higher Order Factors) | 1891.62 | 10 | <0.0001 | 
| All Interactions | 104.49 | 5 | <0.0001 | 
| day | 84.64 | 2 | <0.0001 | 
| Nonlinear | 49.14 | 1 | <0.0001 | 
| trt | 0.04 | 1 | 0.8444 | 
| age | 26.83 | 2 | <0.0001 | 
| Nonlinear | 0.35 | 1 | 0.5515 | 
| race | 3.14 | 2 | 0.2078 | 
| sofa_nogcs | 18.61 | 1 | <0.0001 | 
| sex | 0.72 | 1 | 0.3948 | 
| gap × yprev (Factor+Higher Order Factors) | 104.49 | 5 | <0.0001 | 
| TOTAL NONLINEAR | 49.55 | 2 | <0.0001 | 
| TOTAL NONLINEAR + INTERACTION | 186.88 | 7 | <0.0001 | 
| TOTAL | 2507.19 | 20 | <0.0001 | 
anova(f, age, race, sex)   # combined effects of 3 variables| Wald Statistics for y | |||
| χ2 | d.f. | P | |
|---|---|---|---|
| age | 26.83 | 2 | <0.0001 | 
| Nonlinear | 0.35 | 1 | 0.5515 | 
| race | 3.14 | 2 | 0.2078 | 
| sex | 0.72 | 1 | 0.3948 | 
| TOTAL | 33.50 | 5 | <0.0001 | 
g <- robcov(f, d$id)
htmlVerbatim(round(diag(vcov(f)) / diag(vcov(g)), 2))
                   y>=2                   y>=3                   y>=4 
                   1.40                   1.31                   1.11 
                   y>=5                   y>=6                   y>=7 
                   0.97                   0.96                   0.91 
                    gap                yprev=3                yprev=4 
                   1.20                   0.85                   0.79 
                yprev=5                yprev=6                yprev=7 
                   0.75                   0.79                   0.84 
                    day                  day^2 trt=hydroxychloroquine 
                   1.52                   0.72                   1.22 
                    age                   age'             race=white 
                   1.53                   1.26                   1.24 
             race=other             sofa_nogcs               sex=male 
                   1.22                   1.16                   1.24 
          gap * yprev=3          gap * yprev=4          gap * yprev=5 
                   0.21                   0.52                   0.52 
          gap * yprev=6          gap * yprev=7 
                   0.57                   0.72 
 
The two variance estimates are similar, giving an indirect indication that the assumption of conditional independence (conditioning on previous state) is reasonable. The variance ratio is 1.2 for treatment (ordinary variance estimate is larger than the intra-cluster correlation-robust estimate).
Proportional Odds Assumption
The treatment is too weak to detect any non-proportional odds. Let’s check the PO assumption on the two strongest variables: gap and yprev. We omit important interactions between them and make day linear to be able to limit the number of regression coefficient estimates to examine as a function of y-cutoff. Expect results for the first cutoff, which corresponds to predicting deaths, to be unstable due to there being only 50 deaths.
n <- c('yprev=3', 'yprev=4', 'yprev=5', 'yprev=6', 'yprev=7', 'day', 'gap')
z <- se <- matrix(NA, nrow=6, ncol=length(n))
dimnames(z) <- dimnames(se) <- list(paste0('y>=', 2:7), n)
for(ycut in 2 : 7) {
    g <- lrm(y >= ycut ~ gap + yprev + day + trt + rcs(age, 3) + race +
                     sofa_nogcs + sex,
                     data=d, tol=1e-14, maxit=30)
    z [ycut - 1, n] <- coef(g)[n]
    se[ycut - 1, n] <- sqrt(diag(vcov(g)[n, n])) 
}
round(z, 2)     yprev=3 yprev=4 yprev=5 yprev=6 yprev=7   day   gap
y>=2    0.41    1.70    2.76    2.71   10.61 -0.30  0.35
y>=3    3.29    5.09    6.43    6.20   13.45 -0.05  0.11
y>=4    1.20    4.82    6.59    6.72   12.62  0.24 -0.32
y>=5    1.94    3.35    6.70    8.10    8.02  0.41 -0.53
y>=6    2.58    3.82    5.01    8.42    7.42  0.64 -0.93
y>=7    2.32    3.06    3.75    4.50    6.08  0.28 -0.35round(se, 2)     yprev=3 yprev=4 yprev=5 yprev=6 yprev=7  day  gap
y>=2    0.47    0.45    0.69    0.64   27.72 0.24 0.47
y>=3    0.24    0.28    0.54    0.54   30.24 0.10 0.20
y>=4    0.26    0.27    0.41    0.55   20.70 0.08 0.16
y>=5    0.42    0.39    0.42    0.54    1.07 0.07 0.13
y>=6    0.55    0.51    0.52    0.56    0.64 0.07 0.14
y>=7    1.08    1.02    1.02    1.02    1.03 0.11 0.20The effect of day and gap are roughly linear in the y-cutoff.
yprev may be predicting death differently from how it predicts the nonfatal outcomes.
The VGAM package vglm function would not converge for a partial PO model that allowed for non-PO in yprev.
Computing Unconditional Estimates
One can uncondition the previous model to obtain unconditional treatment effects, although these will be covariate- and time-dependent. We specify covariate settings (here we use representative values for each covariate, just for convenience, using a method described below) and vary the treatment assignment. Note the following.
- The initial state (baseline ordinal status) is always conditioned on, so is not marginalized out. Calculations can be done separately for each initial state.
- Given this initial state, use the fitted model to get the conditional distribution of Y at (post-randomization) day 1 given the initial state, covariate settings, and treatment.
- Likewise for later days.
- Death is an absorbing state so once death occurs all future status values are also set to death.
- Were time to be continuous one can find the first time for which the probability of being in state y or worse exceeds 0.5, which would define the median time until reaching that state. For the ORCHID study, gaps between measurements are too long for this unless time-interpolation is used. It may be better to focus on estimating the probability of being in state y or worse as a function of days since randomization, treatment, and covariates (covariates include the initial state).
- This approach does not factor in uncertainties in parameter estimates, so provides only a point estimate. The simple Bayesian solution to this is provided later.
Start with computation of a set of representative covariate values that will be used when getting estimates related to transformed parameters. This is done by using the global transition model fitted above, evaluating it at day 14 for a previous state of y=4, and finding covariate settings that give predicted values that are close to the unadjusted estimates for placebo. Of the qualifying settings, one that is subjectively closer to the center of the data distribution is chosen. For simplicity we drop weak categorical predictors sex and race. To better fit we allow time to have a nonlinear effect. First let’s see the distribution of day by yprev.
ggplot(d, aes(x=day)) + geom_histogram() + facet_wrap(~ yprev)# For yprev=7 there are only 2 distinct assessment days (14, 28)
# So need interaction to be restricted to be linear
f <- lrm(y ~ gap * yprev + rcs(day, 4) + trt + rcs(age, 3) + sofa_nogcs,
         data=d)
newdat <- subset(d, day == 1)
newdat <- transform(newdat, day=14, yprev='4', gap=1)
xb1 <- predict(f, newdat)  # uses first intercept by default
# Do the same when covariates are omitted
fu <- lrm(y ~ gap * yprev + rcs(day, 4), data=d)
xb2 <- predict(fu, newdat[1, ])
diff <- abs(xb1 - xb2)
w <- newdat[, c('age', 'sofa_nogcs', 'trt')]
w$diff <- abs(xb1 - xb2)
setDT(w)
wi <- w[order(diff), ]
# Compute mean and median baseline covariates
co <- all.vars(~ age + sofa_nogcs)
setDT(newdat)
h <- function(x) c(mean=mean(x), median=median(x))
u <- function(x) { s <- sapply(x, h); w <- as.list(s)
                   names(w) <- outer(rownames(s), colnames(s), paste, sep='.'); w}
newdat[, u(.SD), .SDcols=co]   mean.age median.age mean.sofa_nogcs median.sofa_nogcs
1: 56.85386         57        2.814196                 2# Seek age=57, sofa 2 or 3
wi[1:30, ]    age sofa_nogcs                trt        diff
 1:  47          7 hydroxychloroquine 0.001365465
 2:  63          5 hydroxychloroquine 0.001857581
 3:  76          3 hydroxychloroquine 0.002198086
 4:  75          3            placebo 0.002726674
 5:  88          1 hydroxychloroquine 0.005283669
 6:  70          4 hydroxychloroquine 0.006873129
 7:  82          2            placebo 0.007587458
 8:  48          7 hydroxychloroquine 0.008507628
 9:  63          5            placebo 0.010995304
10:  45          7            placebo 0.011599411
11:  75          3 hydroxychloroquine 0.011864397
12:  75          3 hydroxychloroquine 0.011864397
13:  83          2 hydroxychloroquine 0.012547716
14:  56          6            placebo 0.014215440
15:  64          5 hydroxychloroquine 0.014776646
16:  64          5 hydroxychloroquine 0.014776646
17:  77          3 hydroxychloroquine 0.016282989
18:  77          3 hydroxychloroquine 0.016282989
19:  48          7            placebo 0.017645351
20:  89          1            placebo 0.017952035
21:  89          1            placebo 0.017952035
22:  89          1            placebo 0.017952035
23:  87          1 hydroxychloroquine 0.019381651
24:  68          4 hydroxychloroquine 0.020411089
25:  71          4 hydroxychloroquine 0.020655362
26:  61          5 hydroxychloroquine 0.023464890
27:  61          5 hydroxychloroquine 0.023464890
28:  80          2 hydroxychloroquine 0.029746227
29:  67          4 hydroxychloroquine 0.033890653
30:  72          4 hydroxychloroquine 0.034516065
    age sofa_nogcs                trt        diff# There are no good compromises so use marginal representative values
repcov <- list(age=57, sofa_nogcs=3)There were no representative covariate combinations that were sufficiently close to unadjusted estimates and close to the center of covariate distributions, so we chose marginally central values.
Compute \(\Pr(Y_s \geq y | X, Y_0)\) analytically for a single value of \((y, Y_0, X, s)\), using this recursive formula where \(k\) is the number of outcome states and \(t\) is time:
\[\Pr(Y_{t} = y | X) = \sum_{j=1}^{k} \Pr(Y_{t} = y | X, Y_{t-1} = j)  \Pr(Y_{t-1} = j | X)\] Note that recent updates to the R Hmisc package make the uprob function below obsolete. The soprobMarkovOrd function will compute state occupancy probabilities for given covariate settings. For Bayesian models Hmisc has soprobMarkovOrdm.
# Compute unconditional probabilities
# Output is a matrix with s rows corresponding to t=times(1), times(2), ...
# Each row has columns corresponding to the probabilities of being in each of the states
# times(1) must be 1
# 
# fit is the model fit which must contain the variable named day and the
# previous state, which must be named by the value of pvarname
# data is a data frame or data table that contains covariate settings
# other than day and what's in pvarname.  data has one row.
# times is the vector of days for which to evaluate state occupancy probabilities
# ylevels are the ordered levels of the outcome variable
# Set gap=TRUE if predictor gap is in the model and gap needs to be computed
# At t=times[1] gap is taken as times[1]
uprob <- function(fit, data, times, ylevels, pvarname='yprev', gap=FALSE) {
  k <- length(ylevels)
  s <- length(times)
  P <- matrix(NA, nrow=s, ncol=k)
  colnames(P) <- as.character(ylevels)
  rownames(P) <- as.character(times)
  # Never uncondition on initial state
  data$day <- times[1]
  if(gap) data$gap <- times[1]
  p <- predict(fit, data, type='fitted.ind')
  P[1, ] <- p
  # Matrix of conditional probabilities of Y conditioning on previous time Y
  # Columns = k conditional probabilities conditional on a single previous state
  # Rows    = all possible previous states
  cp <- matrix(NA, nrow=k, ncol=k, 
               dimnames=list(paste('t-1', ylevels), paste('t', ylevels)))
  data <- as.list(data)
  data[[pvarname]] <- as.character(ylevels[-1])
  # Above: don't request estimates after absorbing state
  edata <- expand.grid(data)
  for(it in 2 : s) {
    edata$day <- times[it]
    if(gap) edata$gap <- times[it] - times[it - 1]
    # edata$gap <- c(1,1,1,1,3,3,3,7,7,7,7,7,7,7,rep(14, 14))[times[it]]
    # y=1 = death = absorbing state
    cp <- rbind(c(1., rep(0., k - 1)), predict(fit, edata, type='fitted.ind'))
    # Compute unconditional probabilities of being in all possible states
    # at current time t
    P[it, ] <- t(cp) %*% P[it - 1, ]
    }
  P
}
dat <- data.frame(yprev='4', age=repcov$age, sofa_nogcs=repcov$sofa_nogcs)
  
times <- c(1, 2, 3, 4, 7, 14, 28)
tlev  <- levels(d$trt)
U <- NULL
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    u <- uprob(f, dat, times, 1:7, gap=TRUE)
  cat('\n', trt, '\n\n')
  print(round(u, 3))
  U <- rbind(U, data.frame(trt, day=times, p=u, phome=u[, 6] + u[, 7]))
}
 placebo 
       1     2     3     4     5     6     7
1  0.001 0.037 0.172 0.646 0.118 0.024 0.001
2  0.007 0.085 0.157 0.491 0.173 0.077 0.008
3  0.016 0.100 0.130 0.395 0.194 0.141 0.024
4  0.028 0.107 0.117 0.339 0.189 0.184 0.036
7  0.033 0.076 0.077 0.223 0.209 0.282 0.100
14 0.038 0.051 0.038 0.104 0.150 0.394 0.225
28 0.042 0.035 0.014 0.036 0.056 0.374 0.444
 hydroxychloroquine 
       1     2     3     4     5     6     7
1  0.001 0.036 0.171 0.646 0.119 0.025 0.001
2  0.007 0.084 0.156 0.491 0.174 0.078 0.009
3  0.016 0.098 0.129 0.394 0.195 0.142 0.024
4  0.027 0.106 0.116 0.338 0.190 0.186 0.036
7  0.033 0.075 0.076 0.222 0.209 0.284 0.101
14 0.037 0.050 0.038 0.103 0.149 0.395 0.229
28 0.041 0.034 0.013 0.035 0.055 0.373 0.448Rorchid <- UPlot the unconditional probabilities of being at home at each time, for representative age and sofa score.
xl <- xlab('Day'); yl <- ylab('Probability of Being at Home')
U$txt <- with(U, paste0(trt, '<br>Day:', day, '<br>', round(phome, 3)))
gg <- ggplot(U, aes(x=day, y=phome, color=trt, label=txt)) +
      geom_line() +
      xl + yl + labs(color='')
ggp(gg)Repeat but estimate probabilities at days not observed in the data.
U$type <- 'Actual Times'
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    u <- uprob(f, dat, 1:28, 1:7, gap=TRUE)
  cat('\n', trt, '\n\n')
  print(round(u, 3))
  lab <- paste(trt, 'day 1-28')
  U <- rbind(U, data.frame(trt, day=1:28, p=u, phome=u[, 6] + u[, 7],
                                                 type='Days 1-28', txt=''))
}
 placebo 
       1     2     3     4     5     6     7
1  0.001 0.037 0.172 0.646 0.118 0.024 0.001
2  0.007 0.085 0.157 0.491 0.173 0.077 0.008
3  0.016 0.100 0.130 0.395 0.194 0.141 0.024
4  0.028 0.107 0.117 0.339 0.189 0.184 0.036
5  0.046 0.126 0.119 0.317 0.179 0.184 0.030
6  0.084 0.162 0.138 0.309 0.162 0.129 0.014
7  0.170 0.216 0.168 0.275 0.111 0.056 0.004
8  0.342 0.264 0.164 0.173 0.042 0.013 0.001
9  0.609 0.231 0.091 0.058 0.008 0.002 0.000
10 0.867 0.102 0.022 0.009 0.001 0.000 0.000
11 0.980 0.017 0.002 0.001 0.000 0.000 0.000
12 0.999 0.001 0.000 0.000 0.000 0.000 0.000
13 1.000 0.000 0.000 0.000 0.000 0.000 0.000
14 1.000 0.000 0.000 0.000 0.000 0.000 0.000
15 1.000 0.000 0.000 0.000 0.000 0.000 0.000
16 1.000 0.000 0.000 0.000 0.000 0.000 0.000
17 1.000 0.000 0.000 0.000 0.000 0.000 0.000
18 1.000 0.000 0.000 0.000 0.000 0.000 0.000
19 1.000 0.000 0.000 0.000 0.000 0.000 0.000
20 1.000 0.000 0.000 0.000 0.000 0.000 0.000
21 1.000 0.000 0.000 0.000 0.000 0.000 0.000
22 1.000 0.000 0.000 0.000 0.000 0.000 0.000
23 1.000 0.000 0.000 0.000 0.000 0.000 0.000
24 1.000 0.000 0.000 0.000 0.000 0.000 0.000
25 1.000 0.000 0.000 0.000 0.000 0.000 0.000
26 1.000 0.000 0.000 0.000 0.000 0.000 0.000
27 1.000 0.000 0.000 0.000 0.000 0.000 0.000
28 1.000 0.000 0.000 0.000 0.000 0.000 0.000
 hydroxychloroquine 
       1     2     3     4     5     6     7
1  0.001 0.036 0.171 0.646 0.119 0.025 0.001
2  0.007 0.084 0.156 0.491 0.174 0.078 0.009
3  0.016 0.098 0.129 0.394 0.195 0.142 0.024
4  0.027 0.106 0.116 0.338 0.190 0.186 0.036
5  0.045 0.124 0.118 0.316 0.180 0.186 0.030
6  0.083 0.161 0.137 0.309 0.164 0.132 0.015
7  0.167 0.214 0.168 0.277 0.113 0.057 0.005
8  0.337 0.264 0.165 0.176 0.043 0.014 0.001
9  0.604 0.233 0.093 0.060 0.008 0.002 0.000
10 0.863 0.104 0.023 0.009 0.001 0.000 0.000
11 0.979 0.018 0.002 0.001 0.000 0.000 0.000
12 0.999 0.001 0.000 0.000 0.000 0.000 0.000
13 1.000 0.000 0.000 0.000 0.000 0.000 0.000
14 1.000 0.000 0.000 0.000 0.000 0.000 0.000
15 1.000 0.000 0.000 0.000 0.000 0.000 0.000
16 1.000 0.000 0.000 0.000 0.000 0.000 0.000
17 1.000 0.000 0.000 0.000 0.000 0.000 0.000
18 1.000 0.000 0.000 0.000 0.000 0.000 0.000
19 1.000 0.000 0.000 0.000 0.000 0.000 0.000
20 1.000 0.000 0.000 0.000 0.000 0.000 0.000
21 1.000 0.000 0.000 0.000 0.000 0.000 0.000
22 1.000 0.000 0.000 0.000 0.000 0.000 0.000
23 1.000 0.000 0.000 0.000 0.000 0.000 0.000
24 1.000 0.000 0.000 0.000 0.000 0.000 0.000
25 1.000 0.000 0.000 0.000 0.000 0.000 0.000
26 1.000 0.000 0.000 0.000 0.000 0.000 0.000
27 1.000 0.000 0.000 0.000 0.000 0.000 0.000
28 1.000 0.000 0.000 0.000 0.000 0.000 0.000# Plotly doesn't work will with both color and linetype
ggplot(U, aes(x=day, y=phome, color=trt, linetype=type)) + geom_line() +
    xl + yl + labs(color='', linetype='')Everything has gone haywire. This is caused by the fact that both gap and day needed to be in the model to have the right main effects and interactions, but these two variables are collinear. We are requesting predictions above for combinations of day and gap that did not occur in the data, i.e., gap=1 when day> 4. Collinearities cause difficulties when geting predictions for out-of-sample data in which the collinearities disagree with those in the sample used in fitting the model.
So when the design is such that one needs to interact gap with previous state and gap is not randomly assigned or is otherwise tangled with absolute time, it is not possible to get estimates for unobserved times.
Before giving up, we’ll try a model below that does not contain gap, and that flexibly models the day main effect using a spline,
Markov Model Without Covariate Adjustment
Let’s repeat the exercise with no covariate adjustment to check how we are estimating the proportion of patients at home at each observed day. We relax the model to allow it to fit any time trend except that the interaction between day and previous state is restricted to be linear in day. We are making proportional odds assumptions for day, previous state, and treatment. The PO assumption will be less consequential for treatment because of its weak effect.
dfull <- dall[y0 == 4, .(day, y, trt, id)]
# Compute days since previous measurements
s <- dfull[, .(phome = mean(y >= 6)), by=.(trt, day)]
s[, type := 'Empirical']
empirical <- s
# It would be nice to just have day in the model and not gap (which can be
# derived from day), but the collinearities between yprev and day prevent
# this, and we must interact day with yprev - see the crosstab below
d[, table(day, yprev)]    yprev
day    2   3   4   5   6   7
  1   32  55 224 168   0   0
  2   48  70 209 142  10   0
  3   52  73 189 113  51   0
  4   63  57 157  95 103   0
  7   69  50 132  75 148   0
  14  60  35  71  53 129 113
  28  42  12  40  37 169 147# Treat day as numeric but with all needed variables added (less one)
# to make it be modeled as categorical except for the interaction term
# Using records truncated at death
fplain  <- lrm(y ~ gap * yprev + pol(day, 2) + trt, data=d)
# The following has a slightly greater LR chi-square:
fplain2 <- lrm(y ~ rcs(day, 4) + yprev + day %ia% yprev + trt, data=d)
# lrm(y ~ sqrt(day) * yprev + trt, data=d) also fits well
fplainLogistic Regression Model
lrm(formula = y ~ gap * yprev + pol(day, 2) + trt, data = d)
Frequencies of Responses
1 2 3 4 5 6 7 50 357 297 817 526 757 489
| Model Likelihood Ratio Test | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 3293 | LR χ2 4705.63 | R2 0.783 | C 0.905 | 
| max |∂log L/∂β| 1×10-9 | d.f. 14 | g 4.121 | Dxy 0.810 | 
| Pr(>χ2) <0.0001 | gr 61.595 | γ 0.825 | |
| gp 0.305 | τa 0.662 | ||
| Brier 0.058 | 
| β | S.E. | Wald Z | Pr(>|Z|) | |
|---|---|---|---|---|
| y≥2 | 1.2998 | 0.1895 | 6.86 | <0.0001 | 
| y≥3 | -2.1762 | 0.1821 | -11.95 | <0.0001 | 
| y≥4 | -4.0680 | 0.2103 | -19.34 | <0.0001 | 
| y≥5 | -7.1255 | 0.2356 | -30.25 | <0.0001 | 
| y≥6 | -8.9419 | 0.2482 | -36.03 | <0.0001 | 
| y≥7 | -11.8166 | 0.2703 | -43.71 | <0.0001 | 
| gap | -0.2125 | 0.0837 | -2.54 | 0.0111 | 
| yprev=3 | 2.4481 | 0.2163 | 11.32 | <0.0001 | 
| yprev=4 | 5.1622 | 0.2192 | 23.56 | <0.0001 | 
| yprev=5 | 7.6626 | 0.2407 | 31.83 | <0.0001 | 
| yprev=6 | 9.7256 | 0.2657 | 36.60 | <0.0001 | 
| yprev=7 | 9.6987 | 0.5701 | 17.01 | <0.0001 | 
| day | 0.2786 | 0.0391 | 7.13 | <0.0001 | 
| day2 | -0.0051 | 0.0007 | -6.94 | <0.0001 | 
| trt=hydroxychloroquine | -0.0131 | 0.0679 | -0.19 | 0.8470 | 
| gap × yprev=3 | 0.2093 | 0.0605 | 3.46 | 0.0005 | 
| gap × yprev=4 | 0.3010 | 0.0381 | 7.91 | <0.0001 | 
| gap × yprev=5 | 0.2109 | 0.0388 | 5.44 | <0.0001 | 
| gap × yprev=6 | 0.0820 | 0.0330 | 2.48 | 0.0130 | 
| gap × yprev=7 | 0.2155 | 0.0566 | 3.80 | 0.0001 | 
anova(fplain)| Wald Statistics for y | |||
| χ2 | d.f. | P | |
|---|---|---|---|
| gap (Factor+Higher Order Factors) | 93.36 | 6 | <0.0001 | 
| All Interactions | 90.84 | 5 | <0.0001 | 
| yprev (Factor+Higher Order Factors) | 2192.50 | 10 | <0.0001 | 
| All Interactions | 90.84 | 5 | <0.0001 | 
| day | 83.98 | 2 | <0.0001 | 
| Nonlinear | 48.22 | 1 | <0.0001 | 
| trt | 0.04 | 1 | 0.8470 | 
| gap × yprev (Factor+Higher Order Factors) | 90.84 | 5 | <0.0001 | 
| TOTAL NONLINEAR + INTERACTION | 170.18 | 6 | <0.0001 | 
| TOTAL | 2506.66 | 14 | <0.0001 | 
dat <- data.frame(yprev='4')
  
times <- c(1, 2, 3, 4, 7, 14, 28)
# gaps  <- c(1, diff(times))
tlev  <- levels(d$trt)
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    dat$gap <- 1
    u <- uprob(fplain2, dat, times, 1:7, gap=TRUE)
  cat('\n', trt, '\n\n')
  print(round(u, 3))
  s <- rbind(s, data.frame(trt, day=times, p=u, phome=u[, 6] + u[, 7],
                                                 type='Model'), fill=TRUE)
}
 placebo 
       1     2     3     4     5     6     7
1  0.001 0.036 0.168 0.643 0.124 0.027 0.002
2  0.010 0.092 0.161 0.485 0.164 0.078 0.009
3  0.024 0.113 0.132 0.387 0.184 0.137 0.024
4  0.038 0.110 0.104 0.309 0.192 0.200 0.048
7  0.048 0.087 0.071 0.210 0.194 0.284 0.104
14 0.054 0.059 0.039 0.096 0.135 0.365 0.253
28 0.061 0.042 0.014 0.037 0.053 0.339 0.454
 hydroxychloroquine 
       1     2     3     4     5     6     7
1  0.001 0.037 0.169 0.642 0.123 0.026 0.002
2  0.010 0.093 0.163 0.485 0.163 0.077 0.009
3  0.025 0.114 0.133 0.387 0.182 0.135 0.024
4  0.039 0.112 0.105 0.310 0.191 0.197 0.047
7  0.050 0.089 0.072 0.212 0.194 0.282 0.102
14 0.056 0.060 0.039 0.097 0.136 0.364 0.248
28 0.063 0.043 0.015 0.037 0.054 0.341 0.448ggplot(s, aes(x=day, y=phome, color=trt, linetype=type)) + geom_line() + xl + yl +
    labs(color='', linetype='')Now add estimates for all days 1-28.
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    u <- uprob(fplain, dat, 1:28, 1:7, gap=TRUE)
  cat('\n', trt, '\n\n')
  print(round(u, 3))
  s <- rbind(s, data.frame(trt=paste(trt, 'days 1-28'), day=1:28, p=u,
                                                 phome=u[, 6] + u[, 7],
                                                 type='Model', times='Days 1-28'), fill=TRUE)
}
 placebo 
       1     2     3     4     5     6     7
1  0.001 0.033 0.155 0.643 0.136 0.030 0.002
2  0.009 0.086 0.154 0.483 0.173 0.085 0.010
3  0.022 0.108 0.131 0.386 0.185 0.142 0.026
4  0.035 0.107 0.108 0.314 0.187 0.200 0.049
5  0.046 0.094 0.088 0.254 0.181 0.255 0.082
6  0.053 0.076 0.071 0.204 0.169 0.303 0.124
7  0.058 0.056 0.055 0.160 0.152 0.341 0.177
8  0.060 0.039 0.042 0.122 0.132 0.366 0.238
9  0.062 0.025 0.029 0.091 0.111 0.377 0.305
10 0.063 0.015 0.020 0.065 0.090 0.374 0.373
11 0.064 0.008 0.012 0.045 0.071 0.361 0.439
12 0.064 0.004 0.007 0.030 0.055 0.340 0.500
13 0.064 0.002 0.004 0.019 0.043 0.315 0.552
14 0.064 0.001 0.002 0.013 0.034 0.290 0.597
15 0.064 0.000 0.001 0.009 0.027 0.266 0.633
16 0.064 0.000 0.001 0.006 0.022 0.244 0.663
17 0.064 0.000 0.000 0.005 0.019 0.224 0.688
18 0.064 0.000 0.000 0.004 0.016 0.208 0.708
19 0.064 0.000 0.000 0.003 0.015 0.194 0.724
20 0.064 0.000 0.000 0.003 0.013 0.182 0.738
21 0.064 0.000 0.000 0.003 0.012 0.172 0.749
22 0.064 0.000 0.000 0.002 0.011 0.164 0.758
23 0.064 0.000 0.000 0.002 0.011 0.158 0.765
24 0.064 0.000 0.000 0.002 0.010 0.153 0.770
25 0.064 0.000 0.000 0.002 0.010 0.150 0.774
26 0.064 0.000 0.000 0.002 0.010 0.148 0.776
27 0.064 0.000 0.000 0.002 0.010 0.147 0.777
28 0.064 0.000 0.000 0.002 0.010 0.147 0.777
 hydroxychloroquine 
       1     2     3     4     5     6     7
1  0.001 0.033 0.157 0.643 0.135 0.030 0.002
2  0.009 0.087 0.156 0.483 0.171 0.083 0.010
3  0.023 0.110 0.132 0.386 0.183 0.140 0.025
4  0.036 0.109 0.109 0.315 0.186 0.197 0.048
5  0.047 0.096 0.089 0.256 0.181 0.251 0.080
6  0.054 0.078 0.072 0.206 0.169 0.300 0.121
7  0.059 0.058 0.057 0.162 0.153 0.338 0.173
8  0.062 0.040 0.043 0.124 0.133 0.364 0.234
9  0.064 0.026 0.030 0.093 0.112 0.376 0.300
10 0.065 0.016 0.020 0.066 0.091 0.374 0.368
11 0.065 0.009 0.013 0.046 0.072 0.361 0.433
12 0.066 0.005 0.007 0.031 0.056 0.341 0.494
13 0.066 0.002 0.004 0.020 0.044 0.317 0.547
14 0.066 0.001 0.002 0.013 0.034 0.292 0.592
15 0.066 0.000 0.001 0.009 0.027 0.268 0.629
16 0.066 0.000 0.001 0.006 0.022 0.246 0.659
17 0.066 0.000 0.000 0.005 0.019 0.226 0.684
18 0.066 0.000 0.000 0.004 0.017 0.210 0.704
19 0.066 0.000 0.000 0.003 0.015 0.195 0.720
20 0.066 0.000 0.000 0.003 0.013 0.184 0.734
21 0.066 0.000 0.000 0.003 0.012 0.174 0.745
22 0.066 0.000 0.000 0.002 0.012 0.166 0.754
23 0.066 0.000 0.000 0.002 0.011 0.159 0.761
24 0.066 0.000 0.000 0.002 0.010 0.154 0.767
25 0.066 0.000 0.000 0.002 0.010 0.151 0.771
26 0.066 0.000 0.000 0.002 0.010 0.149 0.773
27 0.066 0.000 0.000 0.002 0.010 0.148 0.774
28 0.066 0.000 0.000 0.002 0.010 0.149 0.773ggplot(s, aes(x=day, y=phome, color=trt, linetype=type)) + geom_line() + xl + yl +
    labs(color='', linetype='')Again there is a problem when estimating days not occurring in the data. Try a model that models day flexibly as a main effect, but that interacts it only linearly with previous state.
dat <- data.frame(yprev='4')
  
U <- empirical
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    u <- uprob(fplain2, dat, 1:28, 1:7, gap=FALSE)
  U <- rbind(U, data.frame(trt, day=1:28, p=u, phome=u[, 6] + u[, 7],
                                                 type='Model days 1-28'), fill=TRUE)
}
ggplot(U, aes(x=day, y=phome, color=trt, linetype=type)) + geom_line() +
    xl + yl + labs(color='', linetype='')We see the same problem as before.
Checking Validity of Method for Unequal Time Spacing Using VIOLET 2
We saw above that it is not valid to use a simple model fitted on sparse times to estimate state occupancy probabilities at unobserved times. Let’s now check that estimates are valid when observation timings are irregular but estimates are requested at actually occurring times. For this purpose we use the VIOLET 2 trial, which had a 4-level ordinal outcome scale measured on days 1-28 (records terminate at death). We’ll fit a non-covariate-adjusted (except for treatment) model using only days 1 2 3 4 7 14 28. Then we will request state occupancy probabilities at only those days, then at all days. But first start with fitting using all days and estimating all days.
Something strange happened with regard to ventilator use on day 28, so restrict analysis to first 27 days.
dcf <- readRDS('dcf.rds')
setDT(dcf)
dcf <- dcf[day < 28]
# Compute empirical proportions of being at home for those starting In Hospital, not on vent
dfull <- dcf[, .(id, day, bstatus, status, treatment, ddeath)]
emp <- dfull[bstatus == 'In Hospital',
             .(phome = mean(status == 'Home', na.rm=TRUE)),
             by=.(treatment, day)]
emp[, type := 'Empirical']
emp[, day  := unclass(day)]
dv <- dfull[day <= ddeath, ]   # don't carry death forward
s <- dv$bstatus
levels(s) <- list('Vent/ARDS'='ARDS', 'Vent/ARDS'='On Vent',
                                    'In Hospital/Facility'='In Hospital')
dv[, table(bstatus, s)]             s
bstatus       Vent/ARDS In Hospital/Facility
  ARDS             2062                    0
  On Vent          8828                    0
  In Hospital         0                22229dv[, bstatus := s]
setkey(dv, id, day)
dv[, pstatus := shift(status), by=id]
dv[day == 1, pstatus := bstatus]
dv[, pstatus := pstatus[drop=TRUE]]   # drop unused previous status Dead
ggplot(dv, aes(x=day)) + geom_histogram(binwidth=1) + facet_wrap(~ pstatus)dd <- datadist(dv); options(datadist='dd')
# d2 <- dv[bstatus == 'In Hospital/Facility', ]
# Find model with best AIC
for(k in 3 : 10) {
  h <- lrm(status ~ rcs(day, k) * pstatus + treatment, data=dv, maxit=40, tol=1e-14)
  if(k == 7) f27 <- h
  cat(k, 'knots\n')
  print(AIC(h))
}3 knots
[1] 12877.46
4 knots
[1] 12842.47
5 knots
[1] 12832.9
6 knots
[1] 12826.01
7 knots
[1] 12819.5
8 knots
[1] 12823.38
9 knots
[1] 12822.33
10 knots
[1] 12823.41# Use 7 knots
dat <- data.frame(pstatus='In Hospital/Facility')
  
#times <- c(1, 2, 3, 4, 7, 14, 28)
tlev  <- unique(dv$treatment)
ylev  <- levels(dv$status)
times <- 1:28
R <- emp
for(trt in tlev) {   # separately for treatments
    dat$treatment <- trt
    u <- uprob(f27, dat, times, ylev, pvarname='pstatus')
  cat('\n', trt, '\n\n')
  print(round(u, 3))
  w <- data.table(treatment=trt, day=times, p=u, phome=u[, 4], type='Model')
  R <- rbind(R, w, fill=TRUE)
}
 Placebo 
    Dead Vent/ARDS In Hospital/Facility  Home
1  0.000     0.018                0.925 0.056
2  0.001     0.027                0.846 0.127
3  0.002     0.030                0.757 0.211
4  0.003     0.031                0.663 0.304
5  0.004     0.030                0.574 0.392
6  0.005     0.029                0.499 0.467
7  0.007     0.029                0.440 0.525
8  0.008     0.029                0.394 0.569
9  0.010     0.029                0.359 0.603
10 0.012     0.029                0.328 0.631
11 0.014     0.029                0.303 0.655
12 0.016     0.029                0.280 0.676
13 0.018     0.028                0.259 0.695
14 0.020     0.028                0.241 0.711
15 0.022     0.028                0.225 0.725
16 0.024     0.028                0.212 0.737
17 0.026     0.028                0.199 0.747
18 0.028     0.029                0.188 0.755
19 0.030     0.029                0.178 0.763
20 0.032     0.029                0.169 0.770
21 0.034     0.028                0.161 0.776
22 0.036     0.028                0.154 0.782
23 0.038     0.027                0.147 0.787
24 0.040     0.027                0.141 0.792
25 0.042     0.026                0.135 0.797
26 0.045     0.025                0.129 0.801
27 0.047     0.025                0.123 0.805
28 0.049     0.024                0.117 0.809
 Vitamin D 
    Dead Vent/ARDS In Hospital/Facility  Home
1  0.000     0.018                0.925 0.056
2  0.001     0.027                0.845 0.127
3  0.002     0.030                0.756 0.212
4  0.003     0.030                0.662 0.305
5  0.004     0.030                0.573 0.394
6  0.005     0.029                0.497 0.468
7  0.007     0.028                0.438 0.527
8  0.008     0.028                0.393 0.570
9  0.010     0.029                0.357 0.604
10 0.012     0.029                0.327 0.632
11 0.014     0.029                0.301 0.657
12 0.016     0.028                0.278 0.678
13 0.018     0.028                0.258 0.697
14 0.019     0.028                0.240 0.713
15 0.021     0.027                0.224 0.727
16 0.023     0.028                0.210 0.739
17 0.025     0.028                0.198 0.749
18 0.028     0.028                0.187 0.757
19 0.030     0.029                0.177 0.765
20 0.032     0.028                0.168 0.772
21 0.034     0.028                0.160 0.778
22 0.036     0.028                0.153 0.784
23 0.038     0.027                0.146 0.789
24 0.040     0.026                0.140 0.794
25 0.042     0.026                0.134 0.798
26 0.044     0.025                0.128 0.803
27 0.046     0.025                0.122 0.807
28 0.048     0.024                0.116 0.811ggplot(R, aes(x=day, y=phome, color=treatment, linetype=type)) + geom_line() +
    xl + yl + labs(color='', linetype='')
# Fit is barely better beyond day 17 if ran lrm on data table d2The model would probably fit better had a partial proportional odds model been used to relax the proportional odds assumption for day.
Now fit a model as if data collected was restricted to certain days. Need to recompute lags to not let a lag be based on a data that’s no longer sampled.
times <- c(1, 2, 3, 4, 7, 14, 27)
# Compute empirical proportions of being at home for those starting In Hospital, not on vent
dfull <- dcf[day %in% times, .(id, day, bstatus, status, treatment, ddeath)]
emps <- dfull[bstatus == 'In Hospital',
                            .(phome = mean(status == 'Home', na.rm=TRUE)),
                 by=.(treatment, day)]
emps[, type := 'Empirical on selected days']
emps[, day  := unclass(day)]
dv <- dfull[day <= ddeath, ]   # don't carry death forward
s  <- dv[, bstatus]
levels(s) <- list('Vent/ARDS'='ARDS', 'Vent/ARDS'='On Vent',
                                    'In Hospital/Facility'='In Hospital')
dv[, table(bstatus, s)]             s
bstatus       Vent/ARDS In Hospital/Facility
  ARDS              611                    0
  On Vent          2408                    0
  In Hospital         0                 5925dv[, bstatus := s]
setkey(dv, id, day)
dv[, pstatus := shift(status), by=id]
dv[day == 1, pstatus := bstatus]
dv[, pstatus := pstatus[drop=TRUE]]   # drop unused previous status Dead
dv[, gap := ifelse(day == 1, 1, day - shift(day)), by=id]
dv[, table(day, gap)]    gap
day     1    3    7   13
  1  1352    0    0    0
  2  1336    0    0    0
  3  1321    0    0    0
  4  1303    0    0    0
  7     0 1268    0    0
  14    0    0 1212    0
  27    0    0    0 1159ggplot(dv, aes(x=day)) + geom_histogram(binwidth=1) + facet_wrap(~ pstatus)dd <- datadist(dv); options(datadist='dd')
# d2 <- dv[bstatus == 'In Hospital/Facility', ]
# Find transformation giving the best AIC
# Can't use a spline because of too many tied times
for(k in c(0 : 3)) {
    cat('Transformation', k, '\n')
    form <- if(k == 0) status ~ (day + gap) * pstatus + treatment
    else
        if(k == 1) status ~ (sqrt(day) + sqrt(gap)) * pstatus + treatment
    else
        if(k == 2) status ~ (pol(day, 2) + pol(gap, 2)) * pstatus + treatment
    else
        status ~ (pol(sqrt(day), 2) + pol(sqrt(day), 2)) * pstatus + treatment
    h <- lrm(form,  data=dv, maxit=50, tol=1e-14)
    if(k == 2) fsel <- h   # choose quadratic
    print(AIC(h))
}Transformation 0 
[1] 8098.67
Transformation 1 
[1] 8008.668
Transformation 2 
[1] 7864.54
Transformation 3 
[1] 7896.565dat <- data.frame(pstatus='In Hospital/Facility')
  
tlev  <- unique(dv$treatment)
ylev  <- levels(dv$status)
R <- rbind(R, emps, fill=TRUE)
for(trt in tlev) {   # separately for treatments
    dat$treatment <- trt
    u <- uprob(fsel, dat, times, ylev, pvarname='pstatus', gap=TRUE)
  cat('\n', trt, '\n\n')
  print(round(u, 3))
  w <- data.table(treatment=trt, day=times, p=u, phome=u[, 4], type='Model selected days')
  R <- rbind(R, w, fill=TRUE)
}
 Placebo 
    Dead Vent/ARDS In Hospital/Facility  Home
1  0.000     0.039                0.884 0.077
2  0.003     0.059                0.781 0.157
3  0.006     0.066                0.684 0.245
4  0.009     0.064                0.585 0.342
7  0.011     0.051                0.412 0.526
14 0.012     0.029                0.253 0.705
27 0.013     0.025                0.172 0.790
 Vitamin D 
    Dead Vent/ARDS In Hospital/Facility  Home
1  0.000     0.038                0.883 0.079
2  0.002     0.057                0.779 0.161
3  0.005     0.064                0.680 0.251
4  0.009     0.062                0.580 0.349
7  0.011     0.049                0.405 0.536
14 0.011     0.028                0.246 0.714
27 0.013     0.024                0.166 0.798times <- 1:28
dat <- data.frame(pstatus='In Hospital/Facility')
for(trt in tlev) {   # separately for treatments
    dat$treatment <- trt
    u <- uprob(fsel, dat, times, ylev, pvarname='pstatus', gap=TRUE)
  cat('\n', trt, '\n\n')
  print(round(u, 3))
  w <- data.table(treatment=trt, day=times, p=u, phome=u[, 4], type='Model selected days, predict all')
  R <- rbind(R, w, fill=TRUE)
}
 Placebo 
    Dead Vent/ARDS In Hospital/Facility  Home
1  0.000     0.039                0.884 0.077
2  0.003     0.059                0.781 0.157
3  0.006     0.066                0.684 0.245
4  0.009     0.064                0.585 0.342
5  0.013     0.069                0.774 0.145
6  0.051     0.168                0.585 0.195
7  0.258     0.141                0.402 0.199
8  0.471     0.116                0.231 0.181
9  0.668     0.093                0.104 0.134
10 0.819     0.072                0.035 0.074
11 0.911     0.052                0.009 0.028
12 0.958     0.032                0.002 0.008
13 0.981     0.017                0.000 0.002
14 0.993     0.006                0.000 0.000
15 0.998     0.002                0.000 0.000
16 1.000     0.000                0.000 0.000
17 1.000     0.000                0.000 0.000
18 1.000     0.000                0.000 0.000
19 1.000     0.000                0.000 0.000
20 1.000     0.000                0.000 0.000
21 1.000     0.000                0.000 0.000
22 1.000     0.000                0.000 0.000
23 1.000     0.000                0.000 0.000
24 1.000     0.000                0.000 0.000
25 1.000     0.000                0.000 0.000
26 1.000     0.000                0.000 0.000
27 1.000     0.000                0.000 0.000
28 1.000     0.000                0.000 0.000
 Vitamin D 
    Dead Vent/ARDS In Hospital/Facility  Home
1  0.000     0.038                0.883 0.079
2  0.002     0.057                0.779 0.161
3  0.005     0.064                0.680 0.251
4  0.009     0.062                0.580 0.349
5  0.012     0.067                0.773 0.148
6  0.050     0.169                0.581 0.199
7  0.261     0.142                0.396 0.201
8  0.476     0.116                0.226 0.182
9  0.673     0.094                0.101 0.133
10 0.822     0.073                0.033 0.072
11 0.912     0.052                0.008 0.027
12 0.957     0.033                0.002 0.007
13 0.981     0.017                0.001 0.002
14 0.993     0.007                0.000 0.000
15 0.998     0.002                0.000 0.000
16 1.000     0.000                0.000 0.000
17 1.000     0.000                0.000 0.000
18 1.000     0.000                0.000 0.000
19 1.000     0.000                0.000 0.000
20 1.000     0.000                0.000 0.000
21 1.000     0.000                0.000 0.000
22 1.000     0.000                0.000 0.000
23 1.000     0.000                0.000 0.000
24 1.000     0.000                0.000 0.000
25 1.000     0.000                0.000 0.000
26 1.000     0.000                0.000 0.000
27 1.000     0.000                0.000 0.000
28 1.000     0.000                0.000 0.000ggplot(R, aes(x=day, y=phome, color=treatment, linetype=type)) + geom_line() +
    xl + yl + labs(color='', linetype='')This confirms that the Markov model likely handles uneven time spacings, but predictions should be limited to days appearing in the data. As before the problem is due to our attempt to make completely out-of-sample predictions on days not in the data when (1) both day and gap are important but (2) day and gap are collinear.
Better Modeling of Gap Effect in VIOLET 2
The completeness of VIOLET daily follow-up allows us to do another natural experiment where we create a full distribution of gaps by randomly sampling days within patients. For each patient randomly select a day t between 2 and L = last day of data - 6 days. Generate a random gap time r between 1 and L - 2. Remove days t, t+1, … t+r-1.
w <- dcf
w$gap <- NULLsetkey(w, id, day)
w <- w[!is.na(status), ]
# Recode baseline status like follow-up status
levels(w$bstatus) <- list('Vent/ARDS'='ARDS', 'Vent/ARDS'='On Vent',
                                            'In Hospital/Facility'='In Hospital')
w[, .(died=any(status=='Dead')), by=id][, table(died)]died
FALSE  TRUE 
 1156   195 u <- data.table(id=unique(w$id), key='id')
n <- nrow(u)
set.seed(1)
u[, start := sample(2 : 20, n, replace=TRUE)]
u[, width := sample(1 : 15, n, replace=TRUE)]
u[, end   := start + width - 1]
u[end < 27, table(start, width)]      width
start  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   2   4  8  3  1  5  5  6  3  5  6  3  7  6  5  2
   3   6  4  4  7  6  5  4  3  6  2  6  6 13  3  2
   4   5  6  6  3  9  3  3  5  4  3  6  8  4  2  2
   5   6  4  5  4  4  6  5  5  7  4  1  3  8  3  3
   6   3  6  3  0  4  2  3  9  7  7  8  8  9  2  4
   7   1  3  5  2  3  6  7  5  6  7  5  4  7  3  4
   8   2  9  4  8  3  3  3  5  6  1  2  5  6  9  5
   9   3  3  4  5  8  3  5  4  8  5  7  3  7  4  5
   10 10  6  5  4  4  6  2  4  5  7  5  5  4  7  8
   11  4  7  4  5  6  7 10  6  6  9  3  3  4  2  1
   12  5  2  6  2  4  3  8  1  3  6  4  8  4  6  4
   13  6  3  6  4  6  4  2  4  3  8  2  7  7  2  0
   14  3  7  7  7  4  7  3  4  5  5  5  4  4  0  0
   15  8  4  8  8  7  2  7  4  4  5  4  4  0  0  0
   16  5  3  3  5  6  2  2  6  4  5  2  0  0  0  0
   17  3  9  4  6  7  1  5  4  2 10  0  0  0  0  0
   18  1  4  7  5  4  6  5  2  5  0  0  0  0  0  0
   19  3  6  4  2  0  3  4  3  0  0  0  0  0  0  0
   20  6  4  6  4  3  4  9  0  0  0  0  0  0  0  0z <- w[u]
w <- z[end >= 27 | (day < start | day > end),]
# Recreate previous state and gap with new gaps
w[, pstatus := ifelse(day == 1, as.character(bstatus), shift(as.character(status))), by=id]
w[, gap     := ifelse(day == 1, 1L, day - shift(day)), by=id]
# Don't carry death forward 
w[,  nd := cumsum(status == 'Dead'), by=id]
w <- w[nd < 2, ]
w[, sum(nd > 0)]    # original number of deaths[1] 195w[day > 1, table(gap)]gap
    1     2     3     4     5     6     7     8     9    10    11    12    13 
22788    76    92    84    77    89    73    85    73    78    84    61    71 
   14    15    16 
   78    45    36 with(w, ggfreqScatter(day, gap))
dd <- datadist(w); options(datadist='dd')
f0 <- lrm(status ~ pstatus + rcs(day, 7) + treatment + rcs(age, 4) + sofa + lips,
         data=w, tol=1e-13)
f1 <- lrm(status ~ rcs(gap, 4) + pstatus + rcs(day, 4) + treatment + rcs(age, 4) + sofa + lips,
         data=w, tol=1e-13)
f2 <- lrm(status ~ gap * pstatus + rcs(day, 4) + treatment + rcs(age, 4) + sofa + lips,
         data=w, tol=1e-13)
f3 <- lrm(status ~ lsp(gap, 2) * pstatus + rcs(day, 4) + treatment + rcs(age, 4) + sofa + lips,
         data=w, tol=1e-13)
f4 <- lrm(status ~ lsp(gap, c(2,4)) * pstatus + rcs(day, 4) + treatment + rcs(age, 4) + sofa + lips,
         data=w, tol=1e-13)
f5 <- lrm(status ~ rcs(gap, 4) * pstatus + rcs(day, 4) + treatment + rcs(age, 4) + sofa + lips,
         data=w, tol=1e-13)
c(f0=AIC(f0), f1=AIC(f1), f2=AIC(f2), f3=AIC(f3), f4=AIC(f4), f5=AIC(f5))      f0       f1       f2       f3       f4       f5 
12012.40 11652.23 11625.38 11597.44 11601.45 11600.24 ggplot(Predict(f1, gap, conf.int=FALSE))
ggplot(Predict(f5, gap, pstatus, conf.int=FALSE))anova(f5)| Wald Statistics for status | |||
| χ2 | d.f. | P | |
|---|---|---|---|
| gap (Factor+Higher Order Factors) | 504.72 | 9 | <0.0001 | 
| All Interactions | 129.33 | 6 | <0.0001 | 
| Nonlinear (Factor+Higher Order Factors) | 38.42 | 6 | <0.0001 | 
| pstatus (Factor+Higher Order Factors) | 5172.52 | 8 | <0.0001 | 
| All Interactions | 129.33 | 6 | <0.0001 | 
| day | 112.31 | 3 | <0.0001 | 
| Nonlinear | 19.27 | 2 | <0.0001 | 
| treatment | 0.40 | 1 | 0.5253 | 
| age | 47.59 | 3 | <0.0001 | 
| Nonlinear | 2.34 | 2 | 0.3098 | 
| sofa | 29.49 | 1 | <0.0001 | 
| lips | 7.27 | 1 | 0.0070 | 
| gap × pstatus (Factor+Higher Order Factors) | 129.33 | 6 | <0.0001 | 
| Nonlinear | 15.44 | 4 | 0.0039 | 
| Nonlinear Interaction : f(A,B) vs. AB | 15.44 | 4 | 0.0039 | 
| TOTAL NONLINEAR | 64.92 | 10 | <0.0001 | 
| TOTAL NONLINEAR + INTERACTION | 179.57 | 12 | <0.0001 | 
| TOTAL | 5683.23 | 20 | <0.0001 | 
Before drawing conclusions let’s handle non-PO in time.
require(VGAM)fpol <- vglm(ordered(status) ~ pstatus + day + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=TRUE), data=w)
fpo3 <- vglm(ordered(status) ~ pstatus + rcs(day, 3) + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=TRUE), data=w)
fpo5 <- vglm(ordered(status) ~ pstatus + rcs(day, 5) + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=TRUE), data=w)
f0l <- vglm(ordered(status) ~ pstatus + day + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=FALSE ~ day), data=w)
f03 <- vglm(ordered(status) ~ pstatus + rcs(day, 3) + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=FALSE ~ rcs(day, 3)), data=w) 
f05 <- vglm(ordered(status) ~ pstatus + rcs(day, 5) + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=FALSE ~ rcs(day, 5)), data=w)
f1 <- vglm(ordered(status) ~ pstatus + rcs(gap, 4) + rcs(day, 3) + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=FALSE ~ rcs(day, 3)), data=w)
f2 <- vglm(ordered(status) ~ gap * pstatus + rcs(day, 3) + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=FALSE ~ rcs(day, 3)), data=w)
f3 <- vglm(ordered(status) ~ rcs(gap, 3) * pstatus + rcs(day, 3) + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=FALSE ~ rcs(day, 3)), data=w)
f4 <- vglm(ordered(status) ~ rcs(gap, 4) * pstatus + rcs(day, 3) + treatment + rcs(age, 4) + sofa + lips,
          cumulative(reverse=TRUE, parallel=FALSE ~ rcs(day, 3)), data=w)
c(fpol=AIC(fpol), fpo3=AIC(fpo3), fpo5=AIC(fpo5), f0l=AIC(f0l), f03=AIC(f03), f05=AIC(f05), f1=AIC(f1), f2=AIC(f2), f3=AIC(f3), f4=AIC(f4))    fpol     fpo3     fpo5      f0l      f03      f05       f1       f2 
12044.31 12023.26 12023.96 12039.35 11988.62 11992.68 11639.42 11612.38 
      f3       f4 
11590.84 11592.26 Comparing AICs indicates the following:
- fpolvs.- fpo3: effect of- dayis nonlinear in a PO model
- fpo3vs.- fpo5: 3 knots are adequate for- dayin PO model
- fpo5vs.- f05: moderate non-PO for- day
- f05vs.- f0l: non-PO effect of- dayor main effect of- dayor both are nonlinear
- f05vs.- f03: 3 knots works better for- dayfor combined main and non-PO effects
- f1vs.- f03: main effect is needed for- gap
- f2vs.- f1: previous state interacts weekly with- gap
- f2,- f3,- f4: previous state interaction with- gapis better represented by 3 knots
Odds Ratios for ORCHID
Now get treatment odds ratios for state occupancy probabilities, as a function of time, as opposed to odds of transitions over consecutive time periods.
odds <- function(x) x / (1. - x)
setDT(Rorchid, key=c('trt', 'day'))
b    <- Rorchid[trt == 'hydroxychloroquine', phome]
a    <- Rorchid[trt == 'placebo',            phome]
days <- Rorchid[trt == 'placebo',            day]
or   <- odds(b) / odds(a)
plot(days, or, type='b', xlab='Day', ylab='hydroxy : placebo OR', ylim=c(1, 1.05))
abline(h=exp(coef(f)['trt=hydroxychloroquine']), col=gray(0.85))
text(27, 1.007,
         'Horizontal line is conditional\ntransition OR from unified PO model', adj=1)Bayesian Markov Model for ORCHID
We parallel the regular proportional odds model with a Bayesian Markov proportional hazards model.
stanSet()bmarkov <- blrm(y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 3) +
                sofa_nogcs,
                data=d, refresh=100, file='bmarkov.rds')
htmlVerbatim(stanDx(bmarkov))Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1)
                        n_eff  Rhat
 y>=2                    2662 1.001
 y>=3                    2219 1.001
 y>=4                    2691 1.000
 y>=5                    3632 1.001
 y>=6                    3459 1.001
 y>=7                    2280 1.001
 gap                     2625 1.000
 yprev=3                 3087 1.000
 yprev=4                 3970 1.000
 yprev=5                 3637 1.001
 yprev=6                 2769 1.001
 yprev=7                 2578 1.000
 day                     5192 1.000
 day^2                   5053 1.000
 trt=hydroxychloroquine  5442 1.000
 age                     5505 1.000
 age'                    5513 1.000
 sofa_nogcs              4865 1.000
 gap * yprev=3           4791 1.000
 gap * yprev=4           4223 1.000
 gap * yprev=5           5252 0.999
 gap * yprev=6           5020 0.999
 gap * yprev=7           5482 0.999
 
stanDxplot(bmarkov)
bmarkovBayesian Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.308 for Intercepts
 blrm(formula = y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 
     3) + sofa_nogcs, data = d, refresh = 100, file = "bmarkov.rds")
 
Frequencies of Responses
1 2 3 4 5 6 7 50 357 297 817 526 757 489
| Mixed Calibration/ Discrimination Indexes | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 3293 | B 0.084 [0.082, 0.085] | g 4.179 [4.02, 4.356] | C 0.908 [0.907, 0.908] | 
| Draws 4000 | gp 0.442 [0.437, 0.447] | Dxy 0.816 [0.814, 0.817] | |
| Chains 4 | EV 0.625 [0.604, 0.644] | ||
| p 17 | v 13.45 [12.436, 14.582] | ||
| vp 0.155 [0.151, 0.161] | 
| Mode β | Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
|---|---|---|---|---|---|---|---|---|
| y≥2 | 2.4622 | 2.4679 | 2.4675 | 0.3170 | 1.8609 | 3.0937 | 1.0000 | 1.03 | 
| y≥3 | -1.0702 | -1.0803 | -1.0814 | 0.3078 | -1.6511 | -0.4478 | 0.0000 | 0.96 | 
| y≥4 | -2.9740 | -2.9829 | -2.9839 | 0.3222 | -3.6155 | -2.3538 | 0.0000 | 1.00 | 
| y≥5 | -6.0546 | -6.0639 | -6.0613 | 0.3389 | -6.7592 | -5.4398 | 0.0000 | 0.98 | 
| y≥6 | -7.8915 | -7.8991 | -7.8943 | 0.3454 | -8.5890 | -7.2479 | 0.0000 | 0.98 | 
| y≥7 | -10.7955 | -10.8085 | -10.7999 | 0.3591 | -11.5214 | -10.1165 | 0.0000 | 0.97 | 
| gap | -0.2174 | -0.2170 | -0.2165 | 0.0872 | -0.3858 | -0.0496 | 0.0058 | 0.99 | 
| yprev=3 | 2.2423 | 2.2524 | 2.2523 | 0.2273 | 1.8178 | 2.6953 | 1.0000 | 1.03 | 
| yprev=4 | 4.8495 | 4.8570 | 4.8511 | 0.2362 | 4.4060 | 5.3202 | 1.0000 | 1.01 | 
| yprev=5 | 7.2669 | 7.2735 | 7.2694 | 0.2573 | 6.7586 | 7.7535 | 1.0000 | 1.01 | 
| yprev=6 | 9.3374 | 9.3438 | 9.3395 | 0.2811 | 8.8091 | 9.9018 | 1.0000 | 1.01 | 
| yprev=7 | 9.3061 | 9.3066 | 9.3182 | 0.5846 | 8.1136 | 10.4645 | 1.0000 | 0.99 | 
| day | 0.2781 | 0.2788 | 0.2787 | 0.0400 | 0.2036 | 0.3577 | 1.0000 | 1.01 | 
| day2 | -0.0052 | -0.0052 | -0.0052 | 0.0008 | -0.0068 | -0.0038 | 0.0000 | 1.00 | 
| trt=hydroxychloroquine | 0.0113 | 0.0126 | 0.0116 | 0.0694 | -0.1308 | 0.1394 | 0.5685 | 0.98 | 
| age | -0.0092 | -0.0092 | -0.0093 | 0.0050 | -0.0185 | 0.0007 | 0.0330 | 1.01 | 
| age’ | -0.0031 | -0.0031 | -0.0031 | 0.0058 | -0.0147 | 0.0078 | 0.2962 | 1.02 | 
| sofa_nogcs | -0.0901 | -0.0900 | -0.0903 | 0.0204 | -0.1294 | -0.0495 | 0.0000 | 1.02 | 
| gap × yprev=3 | 0.2080 | 0.2063 | 0.2079 | 0.0618 | 0.0791 | 0.3202 | 0.9998 | 0.96 | 
| gap × yprev=4 | 0.3269 | 0.3258 | 0.3255 | 0.0390 | 0.2493 | 0.3995 | 1.0000 | 0.98 | 
| gap × yprev=5 | 0.2403 | 0.2396 | 0.2397 | 0.0394 | 0.1638 | 0.3170 | 1.0000 | 1.00 | 
| gap × yprev=6 | 0.0943 | 0.0940 | 0.0942 | 0.0329 | 0.0327 | 0.1580 | 0.9975 | 0.99 | 
| gap × yprev=7 | 0.2276 | 0.2288 | 0.2284 | 0.0570 | 0.1243 | 0.3456 | 1.0000 | 1.02 | 
We need to be able to compute posterior distributions on state occupancy probabilities to take into account all sources of uncertainty. The follow function computes all needed derived quantities, separately for each posterior draw.
# Compute unconditional probabilities
# Output is an array of dimension number of draws by s rows corresponding to
# t=times(1), times(2), ...
# times(s), and final dimension is the number of Y categories
# times(1) must be 1
# 
# fit is the Bayesian model fit from blrm which must contain the variable named day and the
# previous state, which must be named by the value of pvarname
# data is a data frame or data table that contains covariate settings
# other than day and what's in pvarname.  data has one row.
# times is the vector of days for which to evaluate state occupancy probabilities
# ylevels are the ordered levels of the outcome variable
# Set gap=TRUE to compute time gaps when predictor gap is in the model
uprobB <- function(fit, data, times, ylevels, pvarname='yprev', gap=FALSE) {
    if(! length(fit$draws)) stop('fit does not have posterior draws')
    nd <- nrow(fit$draws)
  k  <- length(ylevels)
  s  <- length(times)
  P  <- array(NA, c(nd, s, k),
                        dimnames=list(paste('draw', 1 : nd), as.character(times), 
                                                    as.character(ylevels)))
  # Never uncondition on initial state
  data$day <- times[1]
  if(gap) data$gap <- times[1]
  p <- predict(fit, data, type='fitted.ind', posterior.summary='all')
  P[, 1, ] <- p
  # cp: matrix of conditional probabilities of Y conditioning on previous time Y
  # Columns = k conditional probabilities conditional on a single previous state
  # Rows    = all possible previous states
  # This is for a single posterior draw
  cp <- matrix(NA, nrow=k, ncol=k, 
               dimnames=list(paste('t-1', ylevels), paste('t', ylevels)))
  data <- as.list(data)
  data[[pvarname]] <- as.character(ylevels[-1])
  # don't request estimates after absorbing state
  edata <- expand.grid(data)
  for(it in 2 : s) {
    edata$day <- times[it]
    if(gap) edata$gap <- times[it] - times[it - 1]
    # y=first level = absorbing state
    pp <- predict(fit, edata, type='fitted.ind', posterior.summary='all')
    for(idraw in 1 : nd) {
        cp <- rbind(c(1., rep(0., k - 1)), pp[idraw, ,])
      # Compute unconditional probabilities of being in all possible states
      # at current time t
      P[idraw, it, ] <- t(cp) %*% P[idraw, it - 1, ]
    }
  }
  P
}
times <- c(1,2,3,4,7,14,28)
ylev <- 1 : 7
tlev <- levels(d$trt)
dat <- data.frame(yprev='4', age=repcov$age, sofa_nogcs=repcov$sofa_nogcs)
R   <- list()
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    u <- uprobB(bmarkov, dat, times, ylev, pvarname='yprev', gap=TRUE)
    R[[trt]] <- u
}
# Compute summary measure per treatment, separately for each time and posterior draw
# P(home) = P(Y=6 or Y=7)
s <- function(u) u[, 6] + u[, 7]
# w <- apply(R$placebo, 1, s)    # to test
r <- lapply(R, function(x) apply(x, 1, s))For each treatment and time compute the posterior mean P(home) and 0.95 highest posterior density intervals.
g <- function(x) {
    w <- c(mean(x), HPDint(x))
    names(w) <- c('mean', 'lower', 'upper')
    w
}
rs <- lapply(r, function(x) as.data.frame(t(apply(x, 1, g))))
for(trt in tlev) rs[[trt]]$trt <- trt
rs <- do.call(rbind, rs)
# Fetch day from end of row names (follows .)
rs$day <- as.numeric(sub('^.*\\.(.*?)', '\\2', rownames(rs), ))
xl <- xlab('Day'); yl <- ylab('P(being at home)')
rs$txt <- with(rs, paste0(trt, '<br>Day:', day, '<br>', round(mean, 3)))
gg <- ggplot(rs, aes(x=day, y=mean, color=trt, label=txt)) + geom_line() + 
    geom_ribbon(aes(ymin = lower, ymax = upper), alpha=0.2, linetype=0) +
    xl + yl + labs(color='')
ggp(gg)Instead of individual per-treatment estimates summarize the difference and odds ratio for state occupancy probabilities.
Dif <- r$hydroxychloroquine - r$placebo
dif <- as.data.frame(t(apply(Dif, 1, g)))
dif$day <- as.numeric(sub('^.*\\.(.*?)', '\\2', rownames(dif), ))
dif$txt <- with(dif, paste0('Day:', day, '<br>', round(mean, 3)))
gg <- ggplot(dif, aes(x=day, y=mean, label=txt)) + geom_line() +
    geom_ribbon(aes(ymin=lower, ymax=upper), alpha=0.2, linetype=0) +
    xl + ylab('Hydroxy - Placebo Difference in P(home)')
ggp(gg)or <- odds(r$hydroxychloroquine) / odds(r$placebo)
or <- as.data.frame(t(apply(or, 1, g)))
or$day <- as.numeric(sub('^.*\\.(.*?)', '\\2', rownames(or), ))
or$txt <- with(or, paste0('Day:', day, '<br>', round(mean, 3)))
gg <- ggplot(or, aes(x=day, y=mean, label=txt)) + geom_line() +
    geom_ribbon(aes(ymin=lower, ymax=upper), alpha=0.2, linetype=0) +
    xl + ylab('Hydroxy - Placebo OR for Being at Home')
ggp(gg)Evidence for Efficacy Over Time for One Covariate Combination
Now compute the Bayesian evidence measure for a positive difference in P(home). Note that in the two scatterplots relating posterior draws for differences in occupancy probabilities (P(home)), plotting one day vs. another, no points are on the upper left or lower right quadrants, so the evidence for a positive treatment effect does not vary over the days of follow-up.
plot(Dif[1, ], Dif[5, ], xlab='Posterior Draws for Day 1', ylab='Day 7', cex=.3)
abline(h=0, v=0, col=gray(0.8))
plot(Dif[3, ], Dif[6, ], xlab='Posterior Draws for Day 3', ylab='Day 14', cex=.3)
abline(h=0, v=0, col=gray(0.8))
pp <- apply(Dif, 1, function(x) mean(x > 0))
w <- data.frame(day = as.integer(names(pp)), pp)
w$txt <- with(w, paste0('Day:', day, '<br>', round(pp, 3)))
gg <- ggplot(w, aes(x=day, y=pp, label=txt)) + geom_line() +
    xl + ylab('P(increase in P(home) with hydroxychloroquine)')
ggp(gg)The evidence does vary when computing the posterior probability that treatment has a benefit exceeding an absolute risk difference of 0.02. This is primarily due to the near impossibility of being discharged on the first or second day.
pp <- apply(Dif, 1, function(x) mean(x > 0.02))
w <- data.frame(day = as.integer(names(pp)), pp)
w$txt <- with(w, paste0('Day:', day, '<br>', round(pp, 3)))
gg <- ggplot(w, aes(x=day, y=pp, label=txt)) + geom_line() +
    xl + ylab('P(increase in P(home) with hydroxychloroquine > 0.02)')
ggp(gg)Posterior Distributions of Median and Restricted Mean Time to Home
Even though the data collection times in ORCHID were not granular enough to compute median time until the patient returned home, we compute this anyway, and compute per-treatment posterior means and discrete posterior probability distributions for these medians. For a given posterior sample, the median is the first day for which P(home) >= 0.5 for a treatment. We arbitrary use 30d as the median when the probability never exceeds 0.5. Recall that all these calculations are conditional on the patient being in state 4 (hospitalized on supplemental oxygen) pre-randomization.
days <- c(1,2,3,4,7,14,28)
g <- function(p) if(all(p < 0.5)) 30 else min(days[p >= 0.5])
rmed <- lapply(r, function(x) data.frame(med = apply(x, 2, g)))
for(trt in tlev) rmed[[trt]]$trt <- trt
rmed <- do.call(rbind, rmed)
with(rmed, tapply(med, trt, mean))   # posterior mean for medianhydroxychloroquine            placebo 
                14                 14 with(rmed, table(med, trt))          # discrete distribution for median    trt
med  hydroxychloroquine placebo
  14               4000    4000The sharpness of the posterior distribution of the median time to home is just a reflection of what an easy question we are asking: what is the median of time to home after essentially rounding it to the nearest 7d.
Follow-up was not long enough to compute the mean time until returning home, but we can do inference for the mean restricted time until home (restricted to 28d) assuming that home is an absorbing state (which is approximately but not exactly true). Let \(t_{i}, i=1, 2, \dots, 7\) represent the assessment times (days 1, 2, 3, 4, 7, 14, 28) for time until home \(T\). Under the \(T \leq 28\) restriction, the mean restricted time to home is \(\sum_{i=1}^{7} t_{i} \Pr(T = t_{i}) = \sum_{i=1}^{7} t_{i} [\Pr(T \leq t_{i}) - [i > 1] \Pr(T \leq t_{i-1})]\).
g <- function(p) sum(days * (p - c(0., p[-length(p)])))
rmean <- lapply(r, function(x) data.frame(mean = apply(x, 2, g)))
for(trt in tlev) rmean[[trt]]$trt <- trt
rmean <- do.call(rbind, rmean)
with(rmean, tapply(mean, trt, mean))   # posterior means for mean restricted time to homehydroxychloroquine            placebo 
          10.55112           10.57485 ggplot(rmean, aes(x=mean, color=trt)) + geom_density() +
    labs(color='') + xlab('Restricted Mean Time to Home (days)') + ylab('')Compute the posterior probability that the mean restricted time to home is shorter for hydroxy.
a <- rmean[grep('placebo', rownames(rmean)), 'mean']
b <- rmean[grep('hydrox',  rownames(rmean)), 'mean']
plot(density(b - a), main='', xlab='Difference in Mean Restricted Time to Home', ylab='')
abline(v=0, col=gray(0.85))
mean(b < a)[1] 0.56825Markov Model with Random Effects
Let’s see what difference it makes to add patient-level random effects to the Markov state transition model.
bmarkovre <- blrm(y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 3) +
                sofa_nogcs + cluster(id),
                data=d, refresh=100, file='bmarkovre.rds')
htmlVerbatim(stanDx(bmarkovre))Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1)
                        n_eff  Rhat
 y>=2                    2058 1.000
 y>=3                    1266 1.002
 y>=4                    1485 1.001
 y>=5                    2911 1.000
 y>=6                    2458 1.000
 y>=7                    1585 1.001
 gap                     1950 1.001
 yprev=3                 1880 1.000
 yprev=4                 3299 1.000
 yprev=5                 2750 1.001
 yprev=6                 1627 1.002
 yprev=7                 2037 1.000
 day                     5880 0.999
 day^2                   5076 1.001
 trt=hydroxychloroquine  5040 0.999
 age                     4770 1.001
 age'                    5632 0.999
 sofa_nogcs              5487 0.999
 gap * yprev=3           4795 1.000
 gap * yprev=4           5105 0.999
 gap * yprev=5           4534 0.999
 gap * yprev=6           5255 1.000
 gap * yprev=7           5623 0.999
 sigmag                  1384 1.000
 
stanDxplot(bmarkovre)
bmarkovreBayesian Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.308 for Intercepts
 blrm(formula = y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 
     3) + sofa_nogcs + cluster(id), data = d, refresh = 100, file = "bmarkovre.rds")
 
Frequencies of Responses
1 2 3 4 5 6 7 50 357 297 817 526 757 489
| Mixed Calibration/ Discrimination Indexes | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 3293 | B 0.084 [0.082, 0.085] | g 4.184 [4.003, 4.327] | C 0.908 [0.907, 0.908] | 
| Draws 4000 | gp 0.442 [0.437, 0.447] | Dxy 0.816 [0.814, 0.817] | |
| Chains 4 | EV 0.626 [0.604, 0.644] | ||
| p 17 | v 13.469 [12.35, 14.435] | ||
| Cluster on id | vp 0.156 [0.15, 0.16] | ||
| Clusters 479 | |||
| σγ 0.0532 [1e-04, 0.1583] | 
| Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
|---|---|---|---|---|---|---|---|
| y≥2 | 2.4855 | 2.4782 | 0.3151 | 1.8645 | 3.0789 | 1.0000 | 1.00 | 
| y≥3 | -1.0601 | -1.0671 | 0.3088 | -1.6477 | -0.4493 | 0.0003 | 1.01 | 
| y≥4 | -2.9656 | -2.9664 | 0.3227 | -3.5882 | -2.3428 | 0.0000 | 1.03 | 
| y≥5 | -6.0504 | -6.0497 | 0.3357 | -6.7091 | -5.3733 | 0.0000 | 1.06 | 
| y≥6 | -7.8895 | -7.8913 | 0.3462 | -8.6008 | -7.2125 | 0.0000 | 1.03 | 
| y≥7 | -10.8014 | -10.8012 | 0.3590 | -11.4987 | -10.0604 | 0.0000 | 1.04 | 
| gap | -0.2173 | -0.2159 | 0.0838 | -0.3727 | -0.0480 | 0.0048 | 1.03 | 
| yprev=3 | 2.2465 | 2.2485 | 0.2255 | 1.8144 | 2.6793 | 1.0000 | 0.99 | 
| yprev=4 | 4.8489 | 4.8438 | 0.2330 | 4.3954 | 5.3076 | 1.0000 | 1.02 | 
| yprev=5 | 7.2664 | 7.2617 | 0.2616 | 6.7623 | 7.7755 | 1.0000 | 1.01 | 
| yprev=6 | 9.3376 | 9.3361 | 0.2844 | 8.7519 | 9.8522 | 1.0000 | 0.98 | 
| yprev=7 | 9.2789 | 9.2782 | 0.5826 | 8.0628 | 10.3473 | 1.0000 | 1.02 | 
| day | 0.2793 | 0.2797 | 0.0387 | 0.2083 | 0.3589 | 1.0000 | 1.04 | 
| day2 | -0.0052 | -0.0052 | 0.0007 | -0.0067 | -0.0038 | 0.0000 | 1.00 | 
| trt=hydroxychloroquine | 0.0118 | 0.0111 | 0.0684 | -0.1196 | 0.1475 | 0.5662 | 1.01 | 
| age | -0.0093 | -0.0092 | 0.0049 | -0.0183 | 0.0011 | 0.0310 | 1.00 | 
| age’ | -0.0031 | -0.0031 | 0.0057 | -0.0150 | 0.0076 | 0.2855 | 1.00 | 
| sofa_nogcs | -0.0913 | -0.0909 | 0.0207 | -0.1296 | -0.0490 | 0.0000 | 0.93 | 
| gap × yprev=3 | 0.2067 | 0.2060 | 0.0608 | 0.0911 | 0.3231 | 0.9995 | 1.01 | 
| gap × yprev=4 | 0.3279 | 0.3274 | 0.0386 | 0.2517 | 0.4014 | 1.0000 | 1.01 | 
| gap × yprev=5 | 0.2417 | 0.2421 | 0.0397 | 0.1665 | 0.3205 | 1.0000 | 0.98 | 
| gap × yprev=6 | 0.0954 | 0.0950 | 0.0338 | 0.0319 | 0.1614 | 0.9972 | 1.03 | 
| gap × yprev=7 | 0.2321 | 0.2324 | 0.0575 | 0.1205 | 0.3480 | 1.0000 | 1.03 | 
The very low standard deviation of the random effects (0.053) lends credence to the idea that the state transitions are conditionally independent and that random effects are not needed. Also, the posterior mean and median regression parameter values and standard deviations of their posterior distributions changed very little.
Serial Transition Model With Non-Proportional Odds
Return to the Markov model without random effects. Let’s use a partial proportional odds model in which we add departures from the proportional odds assumption with regard to two variables: previous state and time (but not their interaction). We are using the constrained partial proportional odds model in which the constraint imposes linearity in the outcome category (but not for the effect of the previous outcome level).
bmarkovppo <- blrm(y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 3) +
                sofa_nogcs, ~ day + yprev, cppo=function(y) y,
                data=d, refresh=100, file='bmarkovppo.rds')
htmlVerbatim(stanDx(bmarkovppo))Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1)
                        n_eff  Rhat
 y>=2                     524 1.009
 y>=3                     480 1.008
 y>=4                     441 1.009
 y>=5                     589 1.005
 y>=6                     795 1.001
 y>=7                     812 1.003
 gap                      487 1.010
 yprev=3                  978 1.004
 yprev=4                  595 1.006
 yprev=5                  761 1.004
 yprev=6                  902 1.005
 yprev=7                  631 1.009
 day                     2249 1.000
 day^2                   1787 1.001
 trt=hydroxychloroquine  2581 1.000
 age                     2304 1.002
 age'                    2419 1.000
 sofa_nogcs              1996 1.001
 gap * yprev=3           1883 1.003
 gap * yprev=4           2312 0.999
 gap * yprev=5           2078 1.000
 gap * yprev=6           1248 1.001
 gap * yprev=7           1702 1.001
 day x f(y)               485 1.008
 yprev=3 x f(y)           886 1.002
 yprev=4 x f(y)           519 1.008
 yprev=5 x f(y)          1056 1.003
 yprev=6 x f(y)           852 1.005
 yprev=7 x f(y)           549 1.008
 
stanDxplot(bmarkovppo)
bmarkovppoBayesian Constrained Partial Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.308 for Intercepts
 blrm(formula = y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 
     3) + sofa_nogcs, ppo = ~day + yprev, cppo = function(y) y, 
     data = d, refresh = 100, file = "bmarkovppo.rds")
 
Frequencies of Responses
1 2 3 4 5 6 7 50 357 297 817 526 757 489
| Mixed Calibration/ Discrimination Indexes | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 3293 | B 0.081 [0.08, 0.082] | g 4.556 [4.243, 4.85] | C 0.886 [0.88, 0.894] | 
| Draws 4000 | gp 0.447 [0.441, 0.452] | Dxy 0.773 [0.76, 0.788] | |
| Chains 4 | EV 0.645 [0.624, 0.667] | ||
| p 17 | v 15.982 [13.802, 18.078] | ||
| vp 0.16 [0.155, 0.165] | 
| Mode β | Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
|---|---|---|---|---|---|---|---|---|
| y≥2 | 2.8421 | 2.8598 | 2.8643 | 0.3338 | 2.2264 | 3.5288 | 1.0000 | 1.00 | 
| y≥3 | -1.4108 | -1.4103 | -1.4075 | 0.3250 | -2.0104 | -0.7244 | 0.0000 | 1.04 | 
| y≥4 | -3.5968 | -3.6227 | -3.6207 | 0.3719 | -4.3768 | -2.8930 | 0.0000 | 0.98 | 
| y≥5 | -7.0605 | -7.1393 | -7.1315 | 0.4372 | -8.0427 | -6.3284 | 0.0000 | 0.95 | 
| y≥6 | -9.2730 | -9.4218 | -9.4100 | 0.5459 | -10.5459 | -8.4240 | 0.0000 | 0.88 | 
| y≥7 | -12.3128 | -12.5330 | -12.4950 | 0.6901 | -13.9470 | -11.2632 | 0.0000 | 0.86 | 
| gap | 0.0851 | 0.0841 | 0.0851 | 0.0888 | -0.0953 | 0.2499 | 0.8238 | 1.00 | 
| yprev=3 | 2.8052 | 2.8386 | 2.8390 | 0.3517 | 2.1649 | 3.5282 | 1.0000 | 1.04 | 
| yprev=4 | 5.8843 | 5.9514 | 5.9393 | 0.3274 | 5.3706 | 6.6288 | 1.0000 | 1.08 | 
| yprev=5 | 9.0448 | 9.1057 | 9.0961 | 0.3553 | 8.3830 | 9.7680 | 1.0000 | 1.06 | 
| yprev=6 | 13.6666 | 13.6957 | 13.6902 | 0.5387 | 12.7090 | 14.8397 | 1.0000 | 1.06 | 
| yprev=7 | 12.0749 | 12.3731 | 12.3376 | 1.0384 | 10.3061 | 14.4209 | 1.0000 | 1.08 | 
| day | 0.3580 | 0.3579 | 0.3578 | 0.0420 | 0.2780 | 0.4427 | 1.0000 | 1.01 | 
| day2 | -0.0080 | -0.0080 | -0.0080 | 0.0007 | -0.0094 | -0.0066 | 0.0000 | 1.00 | 
| trt=hydroxychloroquine | 0.0171 | 0.0142 | 0.0139 | 0.0710 | -0.1138 | 0.1567 | 0.5768 | 1.00 | 
| age | -0.0108 | -0.0109 | -0.0109 | 0.0053 | -0.0217 | -0.0014 | 0.0192 | 0.99 | 
| age’ | -0.0022 | -0.0021 | -0.0020 | 0.0061 | -0.0141 | 0.0095 | 0.3665 | 1.04 | 
| sofa_nogcs | -0.0980 | -0.0981 | -0.0975 | 0.0203 | -0.1376 | -0.0598 | 0.0000 | 0.98 | 
| gap × yprev=3 | -0.0861 | -0.0817 | -0.0823 | 0.0499 | -0.1778 | 0.0124 | 0.0472 | 1.00 | 
| gap × yprev=4 | -0.1911 | -0.1875 | -0.1887 | 0.0377 | -0.2616 | -0.1127 | 0.0000 | 1.06 | 
| gap × yprev=5 | -0.2893 | -0.2861 | -0.2866 | 0.0471 | -0.3745 | -0.1887 | 0.0000 | 1.03 | 
| gap × yprev=6 | -0.4627 | -0.4601 | -0.4604 | 0.0441 | -0.5441 | -0.3748 | 0.0000 | 1.01 | 
| gap × yprev=7 | -0.3354 | -0.3330 | -0.3340 | 0.0635 | -0.4538 | -0.2100 | 0.0000 | 1.07 | 
| day x f(y) | 0.1202 | 0.1197 | 0.1198 | 0.0068 | 0.1074 | 0.1338 | 1.0000 | 1.02 | 
| yprev=3 x f(y) | -0.1045 | -0.0802 | -0.0785 | 0.2495 | -0.5887 | 0.3980 | 0.3650 | 1.03 | 
| yprev=4 x f(y) | 0.1093 | 0.2008 | 0.1768 | 0.1969 | -0.1366 | 0.6030 | 0.8655 | 1.50 | 
| yprev=5 x f(y) | -0.9969 | -0.8857 | -0.8974 | 0.2864 | -1.4536 | -0.3492 | 0.0035 | 1.12 | 
| yprev=6 x f(y) | -2.7648 | -2.6253 | -2.6180 | 0.3922 | -3.4177 | -1.8523 | 0.0000 | 0.94 | 
| yprev=7 x f(y) | -1.6179 | -1.6577 | -1.6263 | 0.6149 | -2.9022 | -0.5181 | 0.0018 | 0.87 | 
# htmlVerbatim(compareBmods(bmarkovppo, bmarkov))
# Note: bmarkovppo fit with loo=TRUE was not saved because it was 171MB for some reason
# Output:
# Method: stacking
# ------
#        weight
#  model1 0.957 
#  model2 0.043 
#  
#  Compute posterior means divided by SE
round(coef(bmarkovppo) / sqrt(diag(vcov(bmarkovppo))), 2)              y>=2                   y>=3                   y>=4 
              8.57                  -4.34                  -9.74 
              y>=5                   y>=6                   y>=7 
            -16.33                 -17.26                 -18.16 
               gap                yprev=3                yprev=4 
              0.95                   8.07                  18.18 
           yprev=5                yprev=6                yprev=7 
             25.63                  25.42                  11.92 
               day                  day^2 trt=hydroxychloroquine 
              8.53                 -10.99                   0.20 
               age                   age'             sofa_nogcs 
             -2.07                  -0.34                  -4.84 
     gap * yprev=3          gap * yprev=4          gap * yprev=5 
             -1.64                  -4.98                  -6.07 
     gap * yprev=6          gap * yprev=7             day x f(y) 
            -10.44                  -5.24                  17.72 
    yprev=3 x f(y)         yprev=4 x f(y)         yprev=5 x f(y) 
             -0.32                   1.02                  -3.09 
    yprev=6 x f(y)         yprev=7 x f(y) 
             -6.69                  -2.70 There were some minor sampling/convergence problems especially with respect to the parameters for departures from proportional odds.
The model comparison results listed in the code above puts a great likelihood on the partial proportional odds model fitting better than the proportional odds model. The result of compareBmods, not saved in the code because the loo=TRUE option makes the fit object much bigger, is here (but from a slightly different fit):
Method: stacking
------
       weight
 model1 0.957 
 model2 0.043 
Based on posterior mean parameter values and standard deviations of posterior distributions (standard errors), the most important departures from PO seem to be time-related and the effect of previously being in state 6.
Final Model With Non-PO With Respect to Time
Our final model will have as non-proportional odds only the effect of time, but we will now allow the time effect to be nonlinear (quadratic). As before the cppo function assumes that the PO departure is proportional to the numeric 1-7 outcome codes. A fully unconstrained partial proportional odds model would have too many parameters.
bmarkovppof <- blrm(y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 3) +
                sofa_nogcs, ~ pol(day, 2), cppo=function(y) y,
                data=d, refresh=100, file='bmarkovppof.rds')
htmlVerbatim(stanDx(bmarkovppof))Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1)
                        n_eff  Rhat
 y>=2                    3155 1.001
 y>=3                    2772 1.001
 y>=4                    3497 1.000
 y>=5                    4565 1.000
 y>=6                    3112 1.000
 y>=7                    2744 1.000
 gap                     3701 1.000
 yprev=3                 3236 1.001
 yprev=4                 4312 1.000
 yprev=5                 3439 1.001
 yprev=6                 2914 1.001
 yprev=7                 3443 1.000
 day                     5998 1.000
 day^2                   4539 1.000
 trt=hydroxychloroquine  7707 0.999
 age                     6236 1.000
 age'                    6518 1.000
 sofa_nogcs              5643 1.000
 gap * yprev=3           5416 0.999
 gap * yprev=4           7418 1.000
 gap * yprev=5           6227 0.999
 gap * yprev=6           4029 1.000
 gap * yprev=7           4523 1.000
 day x f(y)              2652 1.000
 day^2 x f(y)            4295 1.001
 
stanDxplot(bmarkovppof)
bmarkovppofBayesian Constrained Partial Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.308 for Intercepts
 blrm(formula = y ~ gap * yprev + pol(day, 2) + trt + rcs(age, 
     3) + sofa_nogcs, ppo = ~pol(day, 2), cppo = function(y) y, 
     data = d, refresh = 100, file = "bmarkovppof.rds")
 
Frequencies of Responses
1 2 3 4 5 6 7 50 357 297 817 526 757 489
| Mixed Calibration/ Discrimination Indexes | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 3293 | B 0.081 [0.08, 0.082] | g 3.805 [3.639, 3.941] | C 0.884 [0.877, 0.892] | 
| Draws 4000 | gp 0.438 [0.433, 0.443] | Dxy 0.768 [0.754, 0.784] | |
| Chains 4 | EV 0.615 [0.596, 0.635] | ||
| p 17 | v 11.362 [10.498, 12.293] | ||
| vp 0.153 [0.149, 0.159] | 
| Mode β | Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
|---|---|---|---|---|---|---|---|---|
| y≥2 | 4.4378 | 4.4540 | 4.4456 | 0.3640 | 3.7344 | 5.1346 | 1.0000 | 0.94 | 
| y≥3 | -0.4172 | -0.4190 | -0.4177 | 0.3262 | -1.1102 | 0.1631 | 0.0980 | 0.95 | 
| y≥4 | -3.2193 | -3.2292 | -3.2277 | 0.3334 | -3.8614 | -2.5610 | 0.0000 | 0.97 | 
| y≥5 | -7.1431 | -7.1566 | -7.1519 | 0.3534 | -7.8163 | -6.4290 | 0.0000 | 0.96 | 
| y≥6 | -10.1971 | -10.2173 | -10.2167 | 0.3874 | -10.9379 | -9.4161 | 0.0000 | 0.98 | 
| y≥7 | -15.0652 | -15.0933 | -15.0933 | 0.4446 | -15.9659 | -14.2205 | 0.0000 | 0.99 | 
| gap | -0.0655 | -0.0663 | -0.0672 | 0.0887 | -0.2413 | 0.1038 | 0.2285 | 0.98 | 
| yprev=3 | 2.7151 | 2.7239 | 2.7242 | 0.2236 | 2.2572 | 3.1256 | 1.0000 | 1.00 | 
| yprev=4 | 5.8959 | 5.9090 | 5.9062 | 0.2396 | 5.4725 | 6.3953 | 1.0000 | 0.99 | 
| yprev=5 | 8.9304 | 8.9504 | 8.9518 | 0.2800 | 8.4281 | 9.5012 | 1.0000 | 1.05 | 
| yprev=6 | 11.2825 | 11.3124 | 11.3110 | 0.3179 | 10.6828 | 11.9249 | 1.0000 | 1.06 | 
| yprev=7 | 9.9164 | 9.9414 | 9.9370 | 0.6189 | 8.7525 | 11.1980 | 1.0000 | 0.94 | 
| day | 0.3923 | 0.3917 | 0.3912 | 0.0425 | 0.3121 | 0.4794 | 1.0000 | 1.07 | 
| day2 | -0.0053 | -0.0053 | -0.0053 | 0.0007 | -0.0067 | -0.0038 | 0.0000 | 0.97 | 
| trt=hydroxychloroquine | 0.0257 | 0.0264 | 0.0260 | 0.0716 | -0.1048 | 0.1767 | 0.6408 | 1.03 | 
| age | -0.0105 | -0.0105 | -0.0104 | 0.0052 | -0.0214 | -0.0011 | 0.0205 | 0.96 | 
| age’ | -0.0027 | -0.0027 | -0.0027 | 0.0059 | -0.0136 | 0.0097 | 0.3157 | 1.01 | 
| sofa_nogcs | -0.1082 | -0.1084 | -0.1085 | 0.0201 | -0.1475 | -0.0689 | 0.0000 | 1.06 | 
| gap × yprev=3 | -0.0726 | -0.0704 | -0.0707 | 0.0460 | -0.1642 | 0.0143 | 0.0640 | 1.04 | 
| gap × yprev=4 | -0.2130 | -0.2101 | -0.2110 | 0.0366 | -0.2816 | -0.1406 | 0.0000 | 1.08 | 
| gap × yprev=5 | -0.3967 | -0.3938 | -0.3936 | 0.0411 | -0.4736 | -0.3125 | 0.0000 | 1.01 | 
| gap × yprev=6 | -0.5680 | -0.5657 | -0.5660 | 0.0383 | -0.6388 | -0.4888 | 0.0000 | 1.08 | 
| gap × yprev=7 | -0.3339 | -0.3299 | -0.3308 | 0.0601 | -0.4405 | -0.2046 | 0.0000 | 1.05 | 
| day x f(y) | 0.3454 | 0.3457 | 0.3457 | 0.0199 | 0.3057 | 0.3837 | 1.0000 | 0.99 | 
| day2 x f(y) | -0.0066 | -0.0066 | -0.0066 | 0.0006 | -0.0078 | -0.0054 | 0.0000 | 1.05 | 
There is strong evidence that the departure from PO with respect to day of follow-up is nonlinear.
Final Check
Let’s fit a Bayesian transition model without covariates (other than treatment and initial state) to once again check against empirical proportions of being at home.
bmarkovppon <- blrm(y ~ gap * yprev + pol(day, 2) + trt, ~ pol(day, 2),
                                        cppo=function(y) y,
                      data=d, refresh=100, file='bmarkovppon.rds')
htmlVerbatim(stanDx(bmarkovppon))Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1)
                        n_eff  Rhat
 y>=2                    3039 1.000
 y>=3                    2804 1.001
 y>=4                    3198 1.001
 y>=5                    4132 1.000
 y>=6                    3108 1.000
 y>=7                    2452 1.000
 gap                     3223 1.002
 yprev=3                 2929 1.000
 yprev=4                 3561 1.000
 yprev=5                 3773 1.000
 yprev=6                 2980 0.999
 yprev=7                 3157 1.000
 day                     5230 1.000
 day^2                   4511 0.999
 trt=hydroxychloroquine  5522 1.000
 gap * yprev=3           4918 0.999
 gap * yprev=4           5557 1.000
 gap * yprev=5           4704 1.000
 gap * yprev=6           3538 1.000
 gap * yprev=7           3973 1.001
 day x f(y)              2504 1.000
 day^2 x f(y)            4184 0.999
 
bmarkovpponBayesian Constrained Partial Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.308 for Intercepts
 blrm(formula = y ~ gap * yprev + pol(day, 2) + trt, ppo = ~pol(day, 
     2), cppo = function(y) y, data = d, refresh = 100, file = "bmarkovppon.rds")
 
Frequencies of Responses
1 2 3 4 5 6 7 50 357 297 817 526 757 489
| Mixed Calibration/ Discrimination Indexes | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 3293 | B 0.083 [0.082, 0.084] | g 3.723 [3.602, 3.865] | C 0.884 [0.875, 0.895] | 
| Draws 4000 | gp 0.436 [0.431, 0.442] | Dxy 0.768 [0.751, 0.79] | |
| Chains 4 | EV 0.611 [0.59, 0.632] | ||
| p 14 | v 10.918 [10.193, 11.744] | ||
| vp 0.152 [0.147, 0.157] | 
| Mode β | Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
|---|---|---|---|---|---|---|---|---|
| y≥2 | 3.0582 | 3.0613 | 3.0559 | 0.2542 | 2.5698 | 3.5447 | 1.0000 | 1.04 | 
| y≥3 | -1.7124 | -1.7184 | -1.7170 | 0.2084 | -2.1268 | -1.3098 | 0.0000 | 1.02 | 
| y≥4 | -4.4747 | -4.4817 | -4.4835 | 0.2230 | -4.9316 | -4.0574 | 0.0000 | 1.00 | 
| y≥5 | -8.3492 | -8.3536 | -8.3546 | 0.2506 | -8.8241 | -7.8498 | 0.0000 | 0.99 | 
| y≥6 | -11.3550 | -11.3588 | -11.3569 | 0.2932 | -11.9503 | -10.7758 | 0.0000 | 0.96 | 
| y≥7 | -16.1466 | -16.1465 | -16.1463 | 0.3684 | -16.9045 | -15.4516 | 0.0000 | 0.97 | 
| gap | -0.0601 | -0.0610 | -0.0611 | 0.0872 | -0.2403 | 0.1043 | 0.2380 | 0.98 | 
| yprev=3 | 2.9563 | 2.9623 | 2.9583 | 0.2233 | 2.5177 | 3.4066 | 1.0000 | 1.02 | 
| yprev=4 | 6.2331 | 6.2370 | 6.2370 | 0.2320 | 5.7860 | 6.6868 | 1.0000 | 1.00 | 
| yprev=5 | 9.3574 | 9.3595 | 9.3578 | 0.2661 | 8.8260 | 9.8676 | 1.0000 | 1.03 | 
| yprev=6 | 11.7001 | 11.6994 | 11.6996 | 0.3149 | 11.0668 | 12.2847 | 1.0000 | 0.98 | 
| yprev=7 | 10.3873 | 10.3768 | 10.3723 | 0.6030 | 9.1319 | 11.4835 | 1.0000 | 1.04 | 
| day | 0.3894 | 0.3892 | 0.3891 | 0.0416 | 0.3068 | 0.4689 | 1.0000 | 1.01 | 
| day2 | -0.0053 | -0.0053 | -0.0053 | 0.0008 | -0.0067 | -0.0037 | 0.0000 | 1.01 | 
| trt=hydroxychloroquine | 0.0029 | 0.0029 | 0.0016 | 0.0692 | -0.1220 | 0.1461 | 0.5115 | 1.03 | 
| gap × yprev=3 | -0.0700 | -0.0689 | -0.0696 | 0.0473 | -0.1603 | 0.0202 | 0.0735 | 1.00 | 
| gap × yprev=4 | -0.2336 | -0.2304 | -0.2308 | 0.0363 | -0.3013 | -0.1599 | 0.0000 | 1.00 | 
| gap × yprev=5 | -0.4206 | -0.4170 | -0.4168 | 0.0413 | -0.4960 | -0.3342 | 0.0000 | 1.00 | 
| gap × yprev=6 | -0.5733 | -0.5692 | -0.5690 | 0.0388 | -0.6436 | -0.4933 | 0.0000 | 1.00 | 
| gap × yprev=7 | -0.3432 | -0.3370 | -0.3363 | 0.0610 | -0.4566 | -0.2188 | 0.0000 | 0.99 | 
| day x f(y) | 0.3382 | 0.3374 | 0.3372 | 0.0190 | 0.3011 | 0.3736 | 1.0000 | 1.03 | 
| day2 x f(y) | -0.0064 | -0.0064 | -0.0064 | 0.0006 | -0.0076 | -0.0053 | 0.0000 | 0.99 | 
dat <- data.frame(yprev='4')
R   <- list()
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    u <- uprobB(bmarkovppon, dat, times, ylev, pvarname='yprev', gap=TRUE)
    R[[trt]] <- u
}
# Compute summary measure per treatment, separately for each time and posterior draw
# P(home) = P(Y=6 or Y=7)
s <- function(u) u[, 6] + u[, 7]
# w <- apply(R$placebo, 1, s)    # to test
r <- lapply(R, function(x) apply(x, 1, s))
# For each treatment and time compute the posterior mean P(home)
rs <- lapply(r, function(x) data.frame(phome = apply(x, 1, mean)))
for(trt in tlev) rs[[trt]]$trt <- trt
rs <- do.call(rbind, rs)
# Fetch day from end of row names (follows .)
rs$day <- as.numeric(sub('^.*\\.(.*?)', '\\2', rownames(rs), ))
rs$type <- 'Model'
rs <- rbind(rs, empirical)
ggplot(rs, aes(x=day, y=phome, color=trt, linetype=type)) + geom_line() + xl + yl +
    labs(color='', linetype='')The agreement isn’t perfect but it’s probably the case that the sample size is not large enough to trust the empirical proportions.
A Different Markov Ordinal Model for Unequal Gaps
Consider a different approach in which we model serial dependence by attenuating the previous state effect by multiplying it by the reciprocal of gap time raised to a positive power. This has the properties of (1) forcing Y to be the same as the previous Y as the gap time approaches zero and (2) forcing independence across outcomes within patient as the gap time goes to infinity. To induce very high correlations in Y within patient with very small gaps, effects must approach \(\pm \infty\) on the logic scale, and the reciprocal raised to a power allows this. It is important to note that the effects being modeled must account for strong non-proportional odds. It is easy to see why when considering continuous Y with Gaussian residuals. A continuous-time AR(1) process is the Ornstein-Uhlenbeck process, and this process specifies that the variance of Y changes with the time gap. When the gap is very small, the conditional distribution of Y given the previous Y has a very small variance because in that case Y is very predictable from the previous Y. The analog to changing variance is changing the spacings of intercepts in a PO model. We accomplish that with a constrained partial PO model, the constraint being, as before, that the departure from PO is linear in Y. The question is whether we also need non-PO in absolute follow-up time. Logically, we do, but collinearity with gap may get in the way. We model serial dependence as a reciprocal of the cube root of the gap. The exponent really needs to be estimated from the data, which will require a nonlinear model. Our formulation does not allow main effects of previous states separate from the time gap to be included, so we are forcing the exponential function of gap to always be multiplied by effects of previous states.
A variety of functions for gap discounting of previous states, but none of them resulted in reasonable estimates of occupancy probabilities beyond the first gap > 1d. The last attempt is below. All of this points to the need for a true continuous time model when there is a need to estimate occupancy probabilities at unobserved times.
d$ypn <- NULLrm(trt)
w <- d
ypn <- as.numeric(as.character(w$yprev))
w <- as.data.frame(w)
alpha <- 1.
lambda <- 0.15
gamma <- 2
X <- with(w, 1 * cbind(ypn == 2, ypn == 3, ypn == 4, ypn == 5, ypn == 6, ypn == 7))
# w$yp  <- X
w$ypr <- X / (w$gap ^ alpha)
w$ypr <- X * exp(-lambda * w$gap)
w$ypr <- X * (gamma + exp(-lambda * w$gap))
# model.matrix problem if yp is logical instead of numeric
bmarkovppor <- blrm(y ~ ypr + pol(day, 2) + trt, ~ pol(day, 2) + ypr,
                                        cppo=function(y) y,
                      data=w, refresh=100, file='bmarkovppor.rds')
b <- bmarkovppor
htmlVerbatim(stanDx(b))Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1)
                        n_eff  Rhat
 y>=2                    1225 0.999
 y>=3                    1046 1.000
 y>=4                     960 1.000
 y>=5                    1272 1.000
 y>=6                    2215 1.000
 y>=7                    2524 1.001
 ypr[1]                  2123 1.001
 ypr[2]                  1581 1.000
 ypr[3]                   963 1.000
 ypr[4]                  1009 1.001
 ypr[5]                  2231 1.002
 ypr[6]                  1613 1.001
 day                     5208 1.000
 day^2                   6050 0.999
 trt=hydroxychloroquine  5684 1.000
 day x f(y)              1075 1.002
 day^2 x f(y)            2040 1.000
 ypr[1] x f(y)           1789 1.002
 ypr[2] x f(y)           2679 1.000
 ypr[3] x f(y)           1364 1.000
 ypr[4] x f(y)           1313 1.000
 ypr[5] x f(y)           2125 1.002
 ypr[6] x f(y)           3964 1.000
 
bBayesian Constrained Partial Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.308 for Intercepts
 blrm(formula = y ~ ypr + pol(day, 2) + trt, ppo = ~pol(day, 2) + 
     ypr, cppo = function(y) y, data = w, refresh = 100, file = "bmarkovppor.rds")
 
Frequencies of Responses
1 2 3 4 5 6 7 50 357 297 817 526 757 489
| Mixed Calibration/ Discrimination Indexes | Discrimination Indexes | Rank Discrim. Indexes | |
|---|---|---|---|
| Obs 3293 | B 0.084 [0.083, 0.085] | g 4.511 [4.2, 4.819] | C 0.884 [0.879, 0.893] | 
| Draws 4000 | gp 0.449 [0.443, 0.453] | Dxy 0.769 [0.759, 0.785] | |
| Chains 4 | EV 0.651 [0.632, 0.671] | ||
| p 9 | v 15.665 [13.51, 17.901] | ||
| vp 0.162 [0.157, 0.167] | 
| Mode β | Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
|---|---|---|---|---|---|---|---|---|
| y≥2 | 9.0466 | 9.0329 | 9.0314 | 5.5668 | -1.1526 | 20.2638 | 0.9502 | 1.02 | 
| y≥3 | 1.9489 | 1.9434 | 1.9365 | 3.9782 | -5.9721 | 9.5441 | 0.6892 | 1.02 | 
| y≥4 | -3.1405 | -3.1685 | -3.1378 | 2.7501 | -8.7864 | 2.0215 | 0.1207 | 0.95 | 
| y≥5 | -9.5137 | -9.5646 | -9.5349 | 2.4891 | -14.7573 | -4.9412 | 0.0000 | 0.96 | 
| y≥6 | -14.3301 | -14.4224 | -14.4053 | 3.4199 | -20.9956 | -7.8009 | 0.0000 | 1.02 | 
| y≥7 | -20.1491 | -20.2917 | -20.3525 | 4.9035 | -29.9257 | -11.1951 | 0.0000 | 1.04 | 
| ypr[1] | 0.3317 | 0.3105 | 0.2920 | 0.8221 | -1.2516 | 1.9675 | 0.6440 | 1.03 | 
| ypr[2] | 1.1997 | 1.2103 | 1.1923 | 0.8127 | -0.2407 | 2.9797 | 0.9380 | 1.03 | 
| ypr[3] | 2.2590 | 2.2739 | 2.2575 | 0.8139 | 0.7930 | 4.0192 | 0.9970 | 1.02 | 
| ypr[4] | 3.4183 | 3.4317 | 3.4179 | 0.8209 | 1.9777 | 5.2350 | 1.0000 | 1.03 | 
| ypr[5] | 4.9959 | 4.9895 | 4.9714 | 0.8181 | 3.4314 | 6.6575 | 1.0000 | 1.03 | 
| ypr[6] | 4.5479 | 4.5939 | 4.5781 | 0.9122 | 2.8510 | 6.4027 | 1.0000 | 1.02 | 
| day | 0.3656 | 0.3669 | 0.3666 | 0.0471 | 0.2702 | 0.4574 | 1.0000 | 1.03 | 
| day2 | -0.0083 | -0.0083 | -0.0083 | 0.0009 | -0.0102 | -0.0065 | 0.0000 | 0.99 | 
| trt=hydroxychloroquine | 0.0146 | 0.0132 | 0.0140 | 0.0708 | -0.1190 | 0.1548 | 0.5748 | 0.98 | 
| day x f(y) | 0.3538 | 0.3560 | 0.3559 | 0.0626 | 0.2336 | 0.4760 | 1.0000 | 1.01 | 
| day2 x f(y) | -0.0073 | -0.0074 | -0.0074 | 0.0012 | -0.0096 | -0.0050 | 0.0000 | 0.95 | 
| ypr[1] x f(y) | 1.4566 | 1.4369 | 1.4428 | 0.9882 | -0.5276 | 3.2874 | 0.9285 | 0.99 | 
| ypr[2] x f(y) | 1.3604 | 1.3655 | 1.3598 | 0.9896 | -0.5180 | 3.3209 | 0.9132 | 1.00 | 
| ypr[3] x f(y) | 1.4934 | 1.5060 | 1.5046 | 0.9894 | -0.4427 | 3.3649 | 0.9382 | 0.98 | 
| ypr[4] x f(y) | 0.8827 | 0.9027 | 0.8946 | 0.9878 | -1.0460 | 2.7810 | 0.8170 | 0.98 | 
| ypr[5] x f(y) | 0.0410 | 0.0775 | 0.0692 | 1.0024 | -1.8831 | 1.9663 | 0.5275 | 1.00 | 
| ypr[6] x f(y) | 0.6509 | 0.6510 | 0.6322 | 1.0204 | -1.2392 | 2.6901 | 0.7375 | 0.99 | 
Now get state occupancy probabilities for observed days, then for all days. We must modify the uprobB function to handle the special nature of gap dependencies.
uprobB <- function(fit, data, times, ylevels, pvarname='yprev', gap=FALSE) {
    if(! length(fit$draws)) stop('fit does not have posterior draws')
    nd <- nrow(fit$draws)
  k  <- length(ylevels)
  s  <- length(times)
  P  <- array(NA, c(nd, s, k),
                        dimnames=list(paste('draw', 1 : nd), as.character(times), 
                                                    as.character(ylevels)))
  # Never uncondition on initial state
  wdata <- data
  wdata$day <- times[1]
  if(gap) {
    gp <- times[1]
    wdata$gap <- gp
    ypn <- as.numeric(as.character(wdata$yprev))
    X <- 1. * cbind(ypn == 2, ypn == 3, ypn == 4, ypn == 5, ypn == 6, ypn == 7)
    # wdata$yp  <- X
    wdata$ypr <- X * (gamma + exp(-lambda * gp))     # X / (gp ^ alpha)
  }
  p <- predict(fit, wdata, type='fitted.ind', posterior.summary='all')
  P[, 1, ] <- p
  # cp: matrix of conditional probabilities of Y conditioning on previous time Y
  # Columns = k conditional probabilities conditional on a single previous state
  # Rows    = all possible previous states
  # This is for a single posterior draw
  cp <- matrix(NA, nrow=k, ncol=k, 
               dimnames=list(paste('t-1', ylevels), paste('t', ylevels)))
  data <- as.list(data)
  data[[pvarname]] <- as.character(ylevels[-1])
  # don't request estimates after absorbing state
  # When data contains a matrix with one row, expand.grid will treat each column of that matrix
  # as a new row in the output.   We don't want that.  Instead, create data frame without
  # expand.grid and add ypr as a matrix to it later
  # edata <- expand.grid(data)
  edata <- as.data.frame(data)  
  
  for(it in 2 : s) {
    edata$day <- times[it]
    if(gap) {
      gp        <- times[it] - times[it - 1]
      edata$gap <- gp
      ypn       <- as.numeric(as.character(edata[[pvarname]]))
      X         <- 1. * matrix(c(ypn == 2, ypn == 3, ypn == 4, ypn == 5, ypn == 6, ypn == 7), ncol=6)
      # edata$yp  <- X
      edata$ypr <- X * (gamma + exp(-lambda * gp))   # X / (gp ^ alpha)
    }
    # y=first level = absorbing state
    pp <- predict(fit, edata, type='fitted.ind', posterior.summary='all')
    for(idraw in 1 : nd) {
        cp <- rbind(c(1., rep(0., k - 1)), pp[idraw, ,])
      # Compute unconditional probabilities of being in all possible states
      # at current time t
      P[idraw, it, ] <- t(cp) %*% P[idraw, it - 1, ]
    }
  }
  P
}
times <- c(1,2,3,4,7,14,28)
ylev <- 1 : 7
tlev <- levels(d$trt)
dat <- data.frame(yprev='4', age=repcov$age, sofa_nogcs=repcov$sofa_nogcs)
R   <- list()
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    u <- uprobB(b, dat, times, ylev, pvarname='yprev', gap=TRUE)
    R[[trt]] <- u
}Look at state occupancy probabilities for placebo, summarized by the posterior mean probabilities.
sop <- round(apply(R[[1]], 2:3, mean), 2)
sop      1    2    3    4    5    6    7
1  0.00 0.02 0.14 0.70 0.11 0.02 0.00
2  0.00 0.06 0.16 0.53 0.18 0.07 0.00
3  0.01 0.09 0.13 0.40 0.21 0.15 0.01
4  0.02 0.11 0.10 0.29 0.19 0.26 0.02
7  0.04 0.10 0.07 0.20 0.14 0.37 0.07
14 0.06 0.09 0.03 0.11 0.05 0.31 0.36
28 0.08 0.06 0.02 0.06 0.03 0.25 0.50Now repeat, requesting probabilities for all consecutive days.
times <- 1 : 28
ylev <- 1 : 7
tlev <- levels(d$trt)
dat <- data.frame(yprev='4', age=repcov$age, sofa_nogcs=repcov$sofa_nogcs)
R   <- list()
for(trt in tlev) {   # separately for treatments
    dat$trt <- trt
    u <- uprobB(b, dat, times, ylev, pvarname='yprev', gap=TRUE)
    R[[trt]] <- u
}
round(apply(R[[1]], 2:3, mean), 2)      1    2    3    4    5    6    7
1  0.00 0.02 0.14 0.70 0.11 0.02 0.00
2  0.00 0.06 0.16 0.53 0.18 0.07 0.00
3  0.01 0.09 0.13 0.40 0.21 0.15 0.01
4  0.02 0.11 0.10 0.29 0.19 0.26 0.02
5  0.04 0.11 0.07 0.20 0.15 0.37 0.06
6  0.06 0.09 0.05 0.13 0.10 0.43 0.15
7  0.07 0.08 0.03 0.07 0.06 0.40 0.29
8  0.09 0.06 0.02 0.04 0.03 0.30 0.46
9  0.11 0.04 0.01 0.02 0.02 0.20 0.61
10 0.12 0.02 0.01 0.01 0.01 0.12 0.71
11 0.12 0.02 0.00 0.01 0.00 0.08 0.77
12 0.13 0.01 0.00 0.00 0.00 0.05 0.81
13 0.13 0.01 0.00 0.00 0.00 0.03 0.83
14 0.13 0.00 0.00 0.00 0.00 0.02 0.84
15 0.13 0.00 0.00 0.00 0.00 0.02 0.85
16 0.13 0.00 0.00 0.00 0.00 0.01 0.85
17 0.13 0.00 0.00 0.00 0.00 0.01 0.85
18 0.13 0.00 0.00 0.00 0.00 0.01 0.86
19 0.14 0.00 0.00 0.00 0.00 0.01 0.86
20 0.14 0.00 0.00 0.00 0.00 0.01 0.86
21 0.14 0.00 0.00 0.00 0.00 0.01 0.86
22 0.14 0.00 0.00 0.00 0.00 0.01 0.86
23 0.14 0.00 0.00 0.00 0.00 0.01 0.86
24 0.14 0.00 0.00 0.00 0.00 0.01 0.86
25 0.14 0.00 0.00 0.00 0.00 0.01 0.86
26 0.14 0.00 0.00 0.00 0.00 0.01 0.86
27 0.14 0.00 0.00 0.00 0.00 0.01 0.86
28 0.14 0.00 0.00 0.00 0.00 0.01 0.85Computing Environment
R version 4.1.0 (2021-05-18) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Pop!_OS 20.10 Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0 attached base packages: [1] splines stats4 stats graphics grDevices utils datasets [8] methods base other attached packages: [1] rstan_2.19.3 StanHeaders_2.21.0-1 VGAM_1.1-3 [4] plotly_4.9.3 data.table_1.13.0 rmsb_0.0.2 [7] rms_6.2-1 SparseM_1.78 Hmisc_4.6-0 [10] ggplot2_3.3.2 Formula_1.2-3 survival_3.2-11 [13] lattice_0.20-44To cite R in publications use:
R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
To cite the Hmisc package in publications use:Harrell Jr F (2021). Hmisc: Harrell Miscellaneous. R package version 4.6-0, https://hbiostat.org/R/Hmisc/.
To cite the rms package in publications use:Harrell Jr FE (2021). rms: Regression Modeling Strategies. https://hbiostat.org/R/rms/, https://github.com/harrelfe/rms.
To cite the rmsb package in publications use:Harrell F (2021). rmsb: Bayesian Regression Modeling Strategies. R package version 0.0.2, https://hbiostat.org/R/rmsb/.
To cite the VGAM package in publications use:Yee TW (2015). Vector Generalized Linear and Additive Models: With an Implementation in R. Springer, New York, USA.
To cite the data.table package in publications use:Dowle M, Srinivasan A (2020). data.table: Extension of 'data.frame'. R package version 1.13.0, https://CRAN.R-project.org/package=data.table.
To cite the ggplot2 package in publications use:Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
To cite the rstan package in publications use:Stan Development Team (2020). “RStan: the R interface to Stan.” R package version 2.19.3, http://mc-stan.org/.
To cite the survival package in publications use:Therneau T (2021). A Package for Survival Analysis in R. R package version 3.2-11, https://CRAN.R-project.org/package=survival.
- A singularity occurred when trying to allow - gapto be a linear spline.↩︎