Longitudinal Ordinal Analysis for Violet2

Introduction

VIOLET 2 was a randomized clinical trial of seriously ill adults in ICUs to study the effectiveness of vitamin D vs. placebo. It has the advantage of having daily ordinal outcomes assessed for 28 days. The original NEJM paper focused on 1078 patients with confirmed baseline vitamin D deficiency. The analyses presented here use 1352 of the original 1360 randomized patients. Of the 1078 patients, 72 of 539 placebo patients died by day 28, and 92 of 531 vitamin D patients died (proportions of 0.13 and 0.17). Our starting dataset has 1359 patients, with death counts of 89 and 112 (proportions 0.13 and 0.16).

See this and related analyses of the ORCHID study and some additional analysis of VIOLET 2. See this for a study of modeling irregular time spacings with Markov models. Go here for a list of papers about Markov longitudinal modeling of binary and ordinal outcomes.

require(rmsb)
knitrSet(lang='markdown', w=7, h=7, dev='png', fig.path='png/stat-')
alrsimdone <- TRUE
Load(violet2)
d <- violet2
omit <- c('Home', 'Dead', 'In Post Hospital facility')
k <- with(d, sum(day == 1 & bstatus %in% omit))
d <- subset(d, bstatus %nin% omit)
oldstatus <- d$status
levels(d$status) <-
  list(Dead='Dead',
       'Vent/ARDS' = c('On Vent', 'ARDS'),
       'In Hospital/Facility' = c('In Hospital',
                                  'In Post Hospital Facility'),
       Home = 'Home')
table(oldstatus, d$status)
##                            
## oldstatus                    Dead Vent/ARDS In Hospital/Facility  Home
##   Dead                       3749         0                    0     0
##   ARDS                          0       377                    0     0
##   On Vent                       0      2966                    0     0
##   In Hospital                   0         0                 8191     0
##   In Post Hospital Facility     0         0                 2937     0
##   Home                          0         0                    0 19586
d$bstatus        <- droplevels(d$bstatus)
label(d$day)     <- 'Day'
label(d$bstatus) <- 'Baseline Status'
dcf <- d
saveRDS(dcf, 'dcf.rds')
dd <- datadist(d); options(datadist='dd')
options(prType='html')

Descriptive Statistics

7 patients with day 0 status of dead, at home, or in post hospital facility were dropped.

html(describe(d))
d

12 Variables   37856 Observations

id: Reference Id
nmissingdistinct
3785601352
lowest :A01-00208A01-00210A01-00261A01-00337A01-00354
highest:W05-00581W05-00589W05-00590W05-00591W05-00592

bstatus: Baseline Status
image
nmissingdistinct
37828283
 Value             ARDS     On Vent In Hospital
 Frequency         2828       10444       24556
 Proportion       0.075       0.276       0.649
 

day: Day
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
378560280.99914.59.322 2.00 3.00 7.7514.5021.2526.0027.00
lowest : 1 2 3 4 5 , highest: 24 25 26 27 28
status
image
nmissingdistinct
37806504
 Value                      Dead            Vent/ARDS In Hospital/Facility
 Frequency                  3749                 3343                11128
 Proportion                0.099                0.088                0.294
                                
 Value                      Home
 Frequency                 19586
 Proportion                0.518
 

age: SCR age enrolled
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
37856078156.3118.1427.0033.0046.0058.0067.2576.0082.00
lowest : 18 19 20 21 22 , highest: 91 92 93 94 96
treatment
nmissingdistinct
3785602
 Value        Placebo Vitamin D
 Frequency      18676     19180
 Proportion     0.493     0.507
 

lips: Baseline LIPS Score
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
378560280.9965.1753.399 0 0 3 5 7 910
lowest : 0.0 1.0 1.5 2.0 2.5 , highest: 12.0 12.5 13.0 13.5 14.5
charlson: Baseline Charlson Score
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
366801176170.9873.8243.2420023589
lowest : 0 1 2 3 4 , highest: 12 13 14 15 18
 Value          0     1     2     3     4     5     6     7     8     9    10    11
 Frequency   4816  4312  4648  4704  4760  4312  2800  2072  1708   896   644   280
 Proportion 0.131 0.118 0.127 0.128 0.130 0.118 0.076 0.056 0.047 0.024 0.018 0.008
                                         
 Value         12    13    14    15    18
 Frequency    392   168   112    28    28
 Proportion 0.011 0.005 0.003 0.001 0.001
 

sofa: SOFA Total Score
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
378560190.9935.3824.028 0 1 3 5 81012
lowest : 0 1 2 3 4 , highest: 14 15 16 17 18
 Value          0     1     2     3     4     5     6     7     8     9    10    11
 Frequency   2436  3444  3136  3808  4424  3752  3780  2968  2492  2128  1820  1540
 Proportion 0.064 0.091 0.083 0.101 0.117 0.099 0.100 0.078 0.066 0.056 0.048 0.041
                                                     
 Value         12    13    14    15    16    17    18
 Frequency    896   476   308   168   140    84    56
 Proportion 0.024 0.013 0.008 0.004 0.004 0.002 0.001
 

vitD: LCMS Day 0
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
364001456554113.748.279 3.898 5.163 8.04712.60017.80023.31027.300
lowest : 0.0 0.6 0.8 1.3 1.5 , highest: 48.3 56.2 57.0 67.6 68.0
ddeath: Day of Death (29 if alive)
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
378560280.37526.234.918 5132929292929
lowest : 1 2 3 4 5 , highest: 25 26 27 28 29
dhome: Day Discharged to Home (NA if never)
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
2679611060280.9948.1996.851 2 2 4 6111825
lowest : 1 2 3 4 5 , highest: 24 25 26 27 28

Summarize deaths occurring after a patient arrives at home (there were no other events).

u <- subset(d, ! is.na(dhome) & day >= dhome)
np <- length(unique(u$id))
ud <- subset(u, status == 'Dead')
nd <- length(unique(ud$id))
cat(np, 'patients discharged to home\n',
    nd, 'died post discharge\n')
957 patients discharged to home
 29 died post discharge

Event Chart for Random Sample of Patients

A status timeline is shown for each of 65 randomly chosen patients. The patients are sorted so that the worst outcomes at the last follow-up day are at the top. One day is subtract from the time variable to make follow-up start at day 0.

ssamp <- sample(unique(d$id), 65, FALSE)
dr        <- subset(d, id %in% ssamp)
dr        <- subset(dr, day <= ddeath)
dr$id     <- as.integer(as.factor(dr$id))
dr$status <- factor(dr$status, levels=rev(levels(dr$status)))
dr$day    <- dr$day - 1
multEventChart(status ~ day + id, data=dr,
               absorb='Dead', sortbylast = TRUE) +
  theme_classic() +
  theme(legend.position='bottom')

Summary of Successive State Transitions

propsTrans(status ~ day + id, data=d, maxsize=6) +
  theme(legend.position='bottom', axis.text.x=element_text(angle=90, hjust=1))
# For a nice event chart for this type of data see https://livefreeordichotomize.com/2020/05/21/survival-model-detective-1

Category Proportions Over Time and PO Assumption

The following plot shows how the mix of outcomes changes over time stratified also by treatment.

propsPO(status ~ treatment + day, data=subset(d, day < 15), nrow=1) +
  theme(legend.position='bottom', axis.text.x=element_text(angle=90, hjust=1))

propsPO(status ~ treatment + day, data=subset(d, day > 14), nrow=1) +
  theme(legend.position='bottom', axis.text.x=element_text(angle=90, hjust=1))

Ignoring treatment, compare the observed patterns with what would be expected had the proportional odds assumption been satisfied, fit a PO model containing only day as a predictor, and allow the day effect to be a flexible spline function. Then compute odds ratios against day 1 and apply those odds ratios to the observed proportions from day 1. This is for the original dataset with death carried forward.

f <- lrm(status ~ rcs(day, 5), data=d)
g <- Function(f)   # derive predicted log odds as a function of day
# Make another function that gets the OR vs. day=1
dor <- function(day) exp(g(day) - g(1))
propsPO(status ~ day, odds.ratio=dor, data=d) +
  theme(legend.position='bottom')

Carrying death forward results in a devastating violation of the PO assumption with respect to time. For example, the incidence of death keeps increasing while the incidence of vent/ARDS is decreasing. Now we repeat the process but terminating follow-up at death.

u <- subset(d, day <= ddeath)
f <- lrm(status ~ rcs(day, 5), data=u)
g <- Function(f)   # derive predicted log odds as a function of day
dor <- function(day) exp(g(day) - g(1))
propsPO(status ~ day, odds.ratio=dor, data=u) +
  theme(legend.position='bottom')

The non-PO with respect to time is less problematic when follow-up stops at death but continues for nonfatal events. Note that in the limiting case where death is the only event and the time intervals are small, such truncation results in a binary logistic odds ratio for treatment being almost exactly the hazard ratio in a Cox model analysis when interval risks are small. [However the absolute risk estimates from such a model as a function of time are hard to interpret.].

Note: The lower panel of the above graph is dependent on the choice of reference day. The proportions were estimated from day 1 alone. Instead let’s estimate them from day 4 and repeat.

dor <- function(day) exp(g(day) - g(4))
propsPO(status ~ day, odds.ratio=dor, ref=4, data=u) +
  theme(legend.position='bottom')

Now proportional odds holds with respect to time.

Now do a similar analysis for the second most important predictor: baseline status, still truncating at death. ARDS is taken as the reference group for computing proportions in the follow-up status categories for the purpose of applying the PO assumption.

f <- lrm(status ~ bstatus, data=u)
g <- Function(f)   # derive predicted log odds as a function of bstatus
dor <- function(bstatus) exp(g(bstatus) - g('ARDS'))
propsPO(status ~ bstatus, odds.ratio=dor, ref='ARDS', data=u) 

Proportional odds is reasonably well satisfied with regard to baseline status except that for patients in hospital at baseline, without ARDS or vent, there is an overprediction of the proportion in the vent/ARDS category. Because these summaries are confounded with time, let’s repeat for day 3 only.

u3 <- subset(u, day == 3)
f <- lrm(status ~ bstatus, data=u3)
g <- Function(f)   # derive predicted log odds as a function of bstatus
dor <- function(bstatus) exp(g(bstatus) - g('ARDS'))
propsPO(status ~ bstatus, odds.ratio=dor, ref='ARDS', data=u3) 

See here for assessment of PO for other variables.

GEE-Type Proportional Odds Modeling

This analysis uses a working independence proportional odds model, then the variance-covariance matrix is corrected for within-patient correlation using the robust cluster sandwhich covariance estimator. Covariates are allowed to have nonlinear effects through the use of restricted cubic spline functions with 4 knots. A dot charts shows the relative importance of the predictors. Note that the lrm function in the R rms package frames the model in terms of \(P(Y \geq y | X)\) whereas most software uses \(P(Y \leq y | X)\).

There are several patients excluded in the analysis below due to missing baseline vitamin D levels.

f <- lrm(status ~ bstatus + rcs(day, 5) + rcs(age, 5) + rcs(lips, 5) +
             rcs(charlson, 4) + rcs(sofa, 5) + rcs(vitD, 5) + treatment,
         data=d, tol=1e-13, x=TRUE, y=TRUE)
g <- robcov(f, d$id)
plot(anova(g), trans=sqrt)

We reallocate degrees of freedom for the terms in the model so that the predictors with the most potential from the dot chart are given more parameters. This is because complex relationships that are week will not suffer from underfitting. The likelihood ratio and score \(\chi^2\) statistics in the output below should be ignored because they are not adjusted for intra-cluster correlation.

f <- lrm(status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) +
             rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 3) + treatment,
         data=d, x=TRUE, y=TRUE)
g <- robcov(f, d$id)
g

Logistic Regression Model

 lrm(formula = status ~ bstatus + rcs(day, 5) + rcs(age, 3) + 
     rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 
     3) + treatment, data = d, x = TRUE, y = TRUE)
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                 3423                 3043                10467 
                 Home 
                18353 
 

Frequencies of Missing Values Due to Each Variable

Model Likelihood
Ratio Test
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 35286 LR χ2 9851.12 R2 0.271 C 0.749
Cluster on d$id d.f. 19 g 1.208 Dxy 0.498
Clusters 1261 Pr(>χ2) <0.0001 gr 3.348 γ 0.498
max |∂log L/∂β| 1×10-10 gp 0.158 τa 0.311
Brier 0.122
β S.E. Wald Z Pr(>|Z|)
y≥Vent/ARDS   0.5629  0.4329 1.30 0.1935
y≥In Hospital/Facility  -0.3008  0.4376 -0.69 0.4918
y≥Home  -2.1195  0.4404 -4.81 <0.0001
bstatus=On Vent   0.4439  0.2116 2.10 0.0359
bstatus=In Hospital   1.4338  0.2179 6.58 <0.0001
day   0.1833  0.0100 18.39 <0.0001
day’  -0.3621  0.0446 -8.12 <0.0001
day’’   0.5972  0.1172 5.10 <0.0001
day’’’  -0.1138  0.1323 -0.86 0.3897
age   0.0013  0.0081 0.16 0.8703
age’  -0.0153  0.0075 -2.04 0.0414
lips  -0.0397  0.0456 -0.87 0.3837
lips’  -0.0151  0.0469 -0.32 0.7483
charlson  -0.1797  0.1106 -1.63 0.1041
charlson’   0.3888  0.5018 0.77 0.4384
charlson’’  -0.7933  0.9951 -0.80 0.4253
sofa   0.0212  0.0756 0.28 0.7791
sofa’  -0.1943  0.2564 -0.76 0.4485
sofa’’   0.4016  1.0363 0.39 0.6984
vitD   0.0595  0.0175 3.41 0.0007
vitD’  -0.0584  0.0201 -2.91 0.0037
treatment=Vitamin D   0.1169  0.0963 1.21 0.2249
anova(g)
Wald Statistics for status
χ2 d.f. P
bstatus 79.00 2 <0.0001
day 482.46 4 <0.0001
Nonlinear 282.09 3 <0.0001
age 12.77 2 0.0017
Nonlinear 4.16 1 0.0414
lips 9.91 2 0.0071
Nonlinear 0.10 1 0.7483
charlson 26.63 3 <0.0001
Nonlinear 0.66 2 0.7175
sofa 27.70 3 <0.0001
Nonlinear 4.75 2 0.0932
vitD 12.27 2 0.0022
Nonlinear 8.44 1 0.0037
treatment 1.47 1 0.2249
TOTAL NONLINEAR 293.51 10 <0.0001
TOTAL 786.13 19 <0.0001

Partial effect plots below show the relationship between each predictor and log odds, holding other predictors to median or mode.

ggplot(Predict(g))

A nomogram representation of the model is below. The nomogram can be enhanced to show the estimated probability or outcome better than any given level.

n <- nomogram(g, vnames='names')
plot(n)

Fit With Only Linear Effects and Only for Day 14

d14 <- subset(d, day == 14)
f <- lrm(status ~ bstatus + age + lips +
             charlson + sofa + treatment, data=d14, x=TRUE, y=TRUE)
f

Logistic Regression Model

 lrm(formula = status ~ bstatus + age + lips + charlson + sofa + 
     treatment, data = d14, x = TRUE, y = TRUE)
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  142                   77                  303 
                 Home 
                  786 
 
Frequencies of Missing Values Due to Each Variable
    status   bstatus       age      lips  charlson      sofa treatment 
         2         1         0         0        42         0         0 
 
Model Likelihood
Ratio Test
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 1308 LR χ2 252.94 R2 0.200 C 0.712
max |∂log L/∂β| 9×10-7 d.f. 7 g 1.050 Dxy 0.424
Pr(>χ2) <0.0001 gr 2.858 γ 0.424
gp 0.139 τa 0.242
Brier 0.116
β S.E. Wald Z Pr(>|Z|)
y≥Vent/ARDS   3.1451  0.3644 8.63 <0.0001
y≥In Hospital/Facility   2.5659  0.3605 7.12 <0.0001
y≥Home   1.1570  0.3551 3.26 0.0011
bstatus=On Vent   0.4889  0.2205 2.22 0.0266
bstatus=In Hospital   1.3809  0.2197 6.28 <0.0001
age  -0.0115  0.0045 -2.57 0.0103
lips  -0.0487  0.0196 -2.49 0.0129
charlson  -0.1248  0.0241 -5.19 <0.0001
sofa  -0.0947  0.0180 -5.28 <0.0001
treatment=Vitamin D   0.2483  0.1156 2.15 0.0317
anova(f)
Wald Statistics for status
χ2 d.f. P
bstatus 65.58 2 <0.0001
age 6.59 1 0.0103
lips 6.19 1 0.0129
charlson 26.89 1 <0.0001
sofa 27.85 1 <0.0001
treatment 4.61 1 0.0317
TOTAL 225.54 7 <0.0001

Evaluation of Baseline Measurement \(\times\) Time Interactions

In the following, allow time (simplified to 3 knots in the spline) to interact with each of the baseline variables, using fewer knots than before because of the much larger number of parameters being fitted.

f <- lrm(status ~ rcs(day, 3) * (bstatus + rcs(age, 3) + rcs(lips, 3) +
             rcs(charlson, 3) + rcs(sofa, 3) + treatment),
                 data=d, tol=1e-13, x=TRUE, y=TRUE)
g <- robcov(f, d$id)
anova(g)
Wald Statistics for status
χ2 d.f. P
day (Factor+Higher Order Factors) 560.38 24 <0.0001
All Interactions 124.64 22 <0.0001
Nonlinear (Factor+Higher Order Factors) 295.79 12 <0.0001
bstatus (Factor+Higher Order Factors) 238.83 6 <0.0001
All Interactions 4.41 4 0.3529
age (Factor+Higher Order Factors) 11.07 6 0.0861
All Interactions 8.80 4 0.0663
Nonlinear (Factor+Higher Order Factors) 4.21 3 0.2400
lips (Factor+Higher Order Factors) 22.96 6 0.0008
All Interactions 2.65 4 0.6174
Nonlinear (Factor+Higher Order Factors) 1.40 3 0.7059
charlson (Factor+Higher Order Factors) 35.46 6 <0.0001
All Interactions 25.45 4 <0.0001
Nonlinear (Factor+Higher Order Factors) 4.35 3 0.2265
sofa (Factor+Higher Order Factors) 53.37 6 <0.0001
All Interactions 16.99 4 0.0019
Nonlinear (Factor+Higher Order Factors) 7.09 3 0.0692
treatment (Factor+Higher Order Factors) 5.64 3 0.1303
All Interactions 4.60 2 0.1001
day × bstatus (Factor+Higher Order Factors) 4.41 4 0.3529
Nonlinear 1.71 2 0.4261
Nonlinear Interaction : f(A,B) vs. AB 1.71 2 0.4261
day × age (Factor+Higher Order Factors) 8.80 4 0.0663
Nonlinear 7.32 3 0.0624
Nonlinear Interaction : f(A,B) vs. AB 7.32 3 0.0624
f(A,B) vs. Af(B) + Bg(A) 3.49 1 0.0618
Nonlinear Interaction in day vs. Af(B) 6.75 2 0.0343
Nonlinear Interaction in age vs. Bg(A) 4.12 2 0.1277
day × lips (Factor+Higher Order Factors) 2.65 4 0.6174
Nonlinear 1.72 3 0.6329
Nonlinear Interaction : f(A,B) vs. AB 1.72 3 0.6329
f(A,B) vs. Af(B) + Bg(A) 0.86 1 0.3528
Nonlinear Interaction in day vs. Af(B) 1.27 2 0.5309
Nonlinear Interaction in lips vs. Bg(A) 1.30 2 0.5230
day × charlson (Factor+Higher Order Factors) 25.45 4 <0.0001
Nonlinear 5.33 3 0.1495
Nonlinear Interaction : f(A,B) vs. AB 5.33 3 0.1495
f(A,B) vs. Af(B) + Bg(A) 0.49 1 0.4833
Nonlinear Interaction in day vs. Af(B) 5.18 2 0.0752
Nonlinear Interaction in charlson vs. Bg(A) 0.70 2 0.7042
day × sofa (Factor+Higher Order Factors) 16.99 4 0.0019
Nonlinear 14.80 3 0.0020
Nonlinear Interaction : f(A,B) vs. AB 14.80 3 0.0020
f(A,B) vs. Af(B) + Bg(A) 1.38 1 0.2400
Nonlinear Interaction in day vs. Af(B) 12.02 2 0.0025
Nonlinear Interaction in sofa vs. Bg(A) 6.19 2 0.0453
day × treatment (Factor+Higher Order Factors) 4.60 2 0.1001
Nonlinear 4.01 1 0.0451
Nonlinear Interaction : f(A,B) vs. AB 4.01 1 0.0451
TOTAL NONLINEAR 305.41 20 <0.0001
TOTAL INTERACTION 124.64 22 <0.0001
TOTAL NONLINEAR + INTERACTION 366.25 27 <0.0001
TOTAL 927.04 35 <0.0001

From the ANOVA above you can see strong evidence for some form of interactions. Interactions between time and baseline SOFA score and time and the Charlson index stand out. The interactions indicate that the effect of these two baseline variables dissipates at later times. Let’s estimate the baseline covariate effects for days 1, 8, 15, 22 from the interaction model.

days <- c(1, 8, 15, 22)
ggplot(Predict(g, bstatus, day=days))
ggplot(Predict(g, age, day=days))
ggplot(Predict(g, lips, day=days))
ggplot(Predict(g, charlson, day=days))
ggplot(Predict(g, sofa, day=days))
ggplot(Predict(g, treatment, day=days))
ggplot(Predict(g, day, treatment))

Checking Proportional Odds Assumption

Before getting into a more detailed analysis concerning the proportional odds assumption involving multiple predictors, let’s take a different look at time alone (the approach described earlier is more intuitive). For each day, compute all cumulative proportions, take their logit, and see how the spacing between successive logit cumulative proportions varies across days. Plotting symbols are defined in the code. This may be easier to see in a table of simple cell frequencies that is also shown.

d$yn <- as.integer(d$status)   # numeric Y for easy dichotomization
table(d$status, d$yn)
                      
                           1     2     3     4
  Dead                  3749     0     0     0
  Vent/ARDS                0  3343     0     0
  In Hospital/Facility     0     0 11128     0
  Home                     0     0     0 19586
ql <- function(y) {
  p <- min(0.9999, max(0.0001, mean(y, na.rm=TRUE)))
  qlogis(p)
}
sf <- function(y) c(
  'y>=2'=ql(y >= 2),  # box      (alive)
  'y>=3'=ql(y >= 3),  # circle   (not dead or Vent/ARDS)
  'y>=4'=ql(y == 4))  # triangle (discharged)
s <- summary(yn ~ day, fun=sf, continuous=99, data=d)
plot(s, which=1:3, pch=0:4, xlab='logit Cumulative Prob', main='')
# g <- function(y)
#   apply(y, 2, function(u) qlogis(mean(u)))
# s <- summary(cumcategory(yn) ~ day, continuous=99, data=d)
# plot(s, which=2:3, pch=1:5, xlab='logit Cumulative Prob.', main='')
with(d, print(table(day, status)))
    status
day  Dead Vent/ARDS In Hospital/Facility Home
  1    16       421                  877   37
  2    31       319                  860  141
  3    49       258                  809  235
  4    63       226                  727  335
  5    76       202                  634  439
  6    84       169                  586  512
  7    96       153                  521  580
  8   107       130                  469  644
  9   113       123                  423  691
  10  121       115                  405  709
  11  125       104                  387  734
  12  135        93                  357  765
  13  140        87                  334  789
  14  147        82                  311  810
  15  151        79                  297  823
  16  157        78                  280  835
  17  162        72                  275  841
  18  166        69                  266  849
  19  167        67                  258  858
  20  170        66                  251  863
  21  170        66                  247  867
  22  173        63                  241  873
  23  178        62                  232  878
  24  182        62                  223  883
  25  186        61                  217  886
  26  193        59                  211  887
  27  195        57                  204  894
  28  196         0                  226  928

In the following we ignore intra-patient correlation and use an all-linear (except for time) PO model for purposes of assessing the PO assumption on all covariates including time. The assessment is done by comparing \(\beta\)s with those from the PO model overall fit and those from all dichotomizations of Y. The analysis is done twice. First all patient records are considered. Then days after the day of death are removed. We pay more attention to the more important variables: day, baseline status, SOFA, and Charlson (in descending order of outcome variation explained). The day effect is extremely nonlinear in the PO model, and day is modeled as a quadratic effect here so it has two \(\beta\)s.

for(inclall in c(TRUE, FALSE)) {
  cat(if(inclall) '\nAll records\n\n' else '\nRecords until death\n\n')
  u <- if(inclall) d else subset(d, day <= ddeath)
  f <- lrm(status ~ bstatus + pol(day, 2) + age + lips + charlson + sofa +
             treatment, data=u)
  k <- matrix(NA, nrow=4, ncol=9)
  poc <- coef(f)[-(1:3)]
  colnames(k) <- names(poc)
  rownames(k) <- c('PO', paste0('>=', 2:4))
  k[1, ] <- poc
  for(cut in c(2 : 4)) {
    g <- lrm(yn >= cut ~ bstatus + pol(day, 2) + age + lips + charlson +
               sofa + treatment, data=u)
    k[cut, ] <- coef(g)[-1]
  }
  print(round(k, 3))
}

All records

    bstatus=On Vent bstatus=In Hospital    day  day^2    age   lips charlson
PO            0.464               1.465  0.171 -0.004 -0.009 -0.051   -0.113
>=2           0.582               1.053 -0.183  0.004 -0.004 -0.106   -0.207
>=3           0.410               2.334  0.096 -0.003 -0.001 -0.047   -0.160
>=4           0.364               1.170  0.322 -0.008 -0.014 -0.042   -0.105
      sofa treatment=Vitamin D
PO  -0.089               0.154
>=2 -0.123               0.022
>=3 -0.076               0.046
>=4 -0.098               0.205

Records until death

    bstatus=On Vent bstatus=In Hospital   day  day^2    age   lips charlson
PO            0.326               1.600 0.284 -0.006 -0.015 -0.021   -0.045
>=2           0.490               0.772 0.080 -0.001 -0.006 -0.098   -0.177
>=3           0.271               3.868 0.237 -0.005 -0.003 -0.003   -0.044
>=4           0.193               1.001 0.345 -0.008 -0.018 -0.021   -0.049
      sofa treatment=Vitamin D
PO  -0.075               0.217
>=2 -0.100              -0.035
>=3 -0.034               0.102
>=4 -0.085               0.255

Constancy of the \(\beta\)s vertically in the above tables, and hence equality with the PO \(\beta\)s in the first row, means the PO assumption is satisfied. Not that PO is not strictly satisfied for any dataset we encounter, and even when PO is not satisfied, the PO model can still work better than alternative approaches.

When all records are analyzed (even those with duplicated death status post death), the coefficients are slightly more constant, except for the coefficient of day. When death-carried-forward is not used, the coefficients are slightly less constant except for the effect of day. To help explore that, here is an analysis of when deaths are occuring (patients not dying are removed):

w <- subset(d, day == 1 & ddeath < 29)
hist(w$ddeath, breaks=0:28, main='', xlab='Day of Death')

GEE-Like Analysis Repeated Without Carrying Death Forward

The last analysis with cluster correlation correction is repeated after deletion of patient records beyond the day of death. In other words, the death status is not repeated up to day 28 but the patient data are truncated at death.

d <- dnc <- subset(d, day <= ddeath)
f <- lrm(status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) +
             rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 3) + treatment,
         data=d, x=TRUE, y=TRUE)
g <- robcov(f, d$id)
g

Logistic Regression Model

 lrm(formula = status ~ bstatus + rcs(day, 5) + rcs(age, 3) + 
     rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 
     3) + treatment, data = d, x = TRUE, y = TRUE)
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  180                 3043                10467 
                 Home 
                18353 
 

Frequencies of Missing Values Due to Each Variable

Model Likelihood
Ratio Test
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 32043 LR χ2 11418.02 R2 0.354 C 0.787
Cluster on d$id d.f. 19 g 1.506 Dxy 0.574
Clusters 1261 Pr(>χ2) <0.0001 gr 4.506 γ 0.575
max |∂log L/∂β| 1×10-7 gp 0.119 τa 0.319
Brier 0.069
β S.E. Wald Z Pr(>|Z|)
y≥Vent/ARDS   2.7293  0.4924 5.54 <0.0001
y≥In Hospital/Facility  -0.6462  0.5038 -1.28 0.1996
y≥Home  -3.2094  0.4970 -6.46 <0.0001
bstatus=On Vent   0.3845  0.2626 1.46 0.1430
bstatus=In Hospital   1.6675  0.2571 6.49 <0.0001
day   0.3020  0.0127 23.72 <0.0001
day’  -0.5754  0.0647 -8.89 <0.0001
day’’   0.9879  0.1814 5.45 <0.0001
day’’’  -0.2407  0.2103 -1.14 0.2523
age   0.0018  0.0094 0.19 0.8477
age’  -0.0187  0.0083 -2.26 0.0240
lips  -0.0468  0.0519 -0.90 0.3676
lips’   0.0213  0.0469 0.45 0.6492
charlson  -0.1732  0.1384 -1.25 0.2106
charlson’   0.6995  0.7050 0.99 0.3211
charlson’’  -1.6179  1.6182 -1.00 0.3174
sofa   0.0392  0.1066 0.37 0.7129
sofa’  -0.3083  0.3890 -0.79 0.4280
sofa’’   0.6236  0.9470 0.66 0.5102
vitD   0.0413  0.0193 2.14 0.0322
vitD’  -0.0494  0.0229 -2.16 0.0309
treatment=Vitamin D   0.1920  0.1063 1.81 0.0708
anova(g)
Wald Statistics for status
χ2 d.f. P
bstatus 108.29 2 <0.0001
day 1401.68 4 <0.0001
Nonlinear 391.54 3 <0.0001
age 17.32 2 0.0002
Nonlinear 5.09 1 0.0240
lips 1.86 2 0.3949
Nonlinear 0.21 1 0.6492
charlson 6.47 3 0.0908
Nonlinear 1.00 2 0.6063
sofa 19.39 3 0.0002
Nonlinear 1.91 2 0.3856
vitD 4.79 2 0.0913
Nonlinear 4.66 1 0.0309
treatment 3.26 1 0.0708
TOTAL NONLINEAR 403.82 10 <0.0001
TOTAL 1621.75 19 <0.0001
ggplot(Predict(g))

Let’s also look at the proportional odds assumption using these truncated records (death not carried forward).

s <- summary(yn ~ day, fun=sf, continuous=99, data=d)
plot(s, which=1:3, pch=0:4, xlab='logit Cumulative Prob', main='')

When the absorbing state is not carried forward, the problem with non-proportional odds almost vanishes.

Finally, look at interactions with time in this model.

f <- lrm(status ~ rcs(day, 3) * (bstatus + rcs(age, 3) + rcs(lips, 3) +
             rcs(charlson, 3) + rcs(sofa, 3) + treatment),
                 data=d, tol=1e-13, x=TRUE, y=TRUE)
g <- robcov(f, d$id)
anova(g)
Wald Statistics for status
χ2 d.f. P
day (Factor+Higher Order Factors) 1195.35 24 <0.0001
All Interactions 87.91 22 <0.0001
Nonlinear (Factor+Higher Order Factors) 400.49 12 <0.0001
bstatus (Factor+Higher Order Factors) 365.19 6 <0.0001
All Interactions 19.95 4 0.0005
age (Factor+Higher Order Factors) 23.69 6 0.0006
All Interactions 20.58 4 0.0004
Nonlinear (Factor+Higher Order Factors) 5.92 3 0.1154
lips (Factor+Higher Order Factors) 26.36 6 0.0002
All Interactions 10.98 4 0.0268
Nonlinear (Factor+Higher Order Factors) 2.62 3 0.4542
charlson (Factor+Higher Order Factors) 15.46 6 0.0170
All Interactions 6.11 4 0.1914
Nonlinear (Factor+Higher Order Factors) 2.17 3 0.5381
sofa (Factor+Higher Order Factors) 46.36 6 <0.0001
All Interactions 1.64 4 0.8011
Nonlinear (Factor+Higher Order Factors) 4.72 3 0.1931
treatment (Factor+Higher Order Factors) 4.66 3 0.1984
All Interactions 1.91 2 0.3852
day × bstatus (Factor+Higher Order Factors) 19.95 4 0.0005
Nonlinear 12.96 2 0.0015
Nonlinear Interaction : f(A,B) vs. AB 12.96 2 0.0015
day × age (Factor+Higher Order Factors) 20.58 4 0.0004
Nonlinear 8.33 3 0.0397
Nonlinear Interaction : f(A,B) vs. AB 8.33 3 0.0397
f(A,B) vs. Af(B) + Bg(A) 4.22 1 0.0399
Nonlinear Interaction in day vs. Af(B) 6.98 2 0.0305
Nonlinear Interaction in age vs. Bg(A) 5.87 2 0.0532
day × lips (Factor+Higher Order Factors) 10.98 4 0.0268
Nonlinear 2.81 3 0.4222
Nonlinear Interaction : f(A,B) vs. AB 2.81 3 0.4222
f(A,B) vs. Af(B) + Bg(A) 0.01 1 0.9409
Nonlinear Interaction in day vs. Af(B) 0.25 2 0.8805
Nonlinear Interaction in lips vs. Bg(A) 2.41 2 0.2997
day × charlson (Factor+Higher Order Factors) 6.11 4 0.1914
Nonlinear 4.78 3 0.1888
Nonlinear Interaction : f(A,B) vs. AB 4.78 3 0.1888
f(A,B) vs. Af(B) + Bg(A) 0.06 1 0.8138
Nonlinear Interaction in day vs. Af(B) 3.92 2 0.1405
Nonlinear Interaction in charlson vs. Bg(A) 1.31 2 0.5196
day × sofa (Factor+Higher Order Factors) 1.64 4 0.8011
Nonlinear 1.46 3 0.6922
Nonlinear Interaction : f(A,B) vs. AB 1.46 3 0.6922
f(A,B) vs. Af(B) + Bg(A) 0.01 1 0.9297
Nonlinear Interaction in day vs. Af(B) 1.42 2 0.4911
Nonlinear Interaction in sofa vs. Bg(A) 0.04 2 0.9792
day × treatment (Factor+Higher Order Factors) 1.91 2 0.3852
Nonlinear 1.86 1 0.1728
Nonlinear Interaction : f(A,B) vs. AB 1.86 1 0.1728
TOTAL NONLINEAR 420.41 20 <0.0001
TOTAL INTERACTION 87.91 22 <0.0001
TOTAL NONLINEAR + INTERACTION 455.33 27 <0.0001
TOTAL 1637.55 35 <0.0001
days <- c(1, 8, 15, 22)
ggplot(Predict(g, bstatus, day=days))
ggplot(Predict(g, age, day=days))
ggplot(Predict(g, lips, day=days))
ggplot(Predict(g, charlson, day=days))
ggplot(Predict(g, sofa, day=days))
ggplot(Predict(g, treatment, day=days))
ggplot(Predict(g, day, treatment))

Assessment of Interaction Between Baseline Vitamin D and Treatment

We repeat the model above but allowing baseline vitamin D to interact with treatment.

f <- lrm(status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) +
             rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 4) * treatment,
         data=d, x=TRUE, y=TRUE)
g <- robcov(f, d$id)
g

Logistic Regression Model

 lrm(formula = status ~ bstatus + rcs(day, 5) + rcs(age, 3) + 
     rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 
     4) * treatment, data = d, x = TRUE, y = TRUE)
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  180                 3043                10467 
                 Home 
                18353 
 

Frequencies of Missing Values Due to Each Variable

Model Likelihood
Ratio Test
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 32043 LR χ2 11633.44 R2 0.360 C 0.790
Cluster on d$id d.f. 23 g 1.524 Dxy 0.580
Clusters 1261 Pr(>χ2) <0.0001 gr 4.592 γ 0.580
max |∂log L/∂β| 1×10-7 gp 0.120 τa 0.322
Brier 0.069
β S.E. Wald Z Pr(>|Z|)
y≥Vent/ARDS   2.3634  0.5685 4.16 <0.0001
y≥In Hospital/Facility  -1.0222  0.5765 -1.77 0.0762
y≥Home  -3.6015  0.5723 -6.29 <0.0001
bstatus=On Vent   0.4329  0.2622 1.65 0.0987
bstatus=In Hospital   1.7149  0.2571 6.67 <0.0001
day   0.3041  0.0128 23.70 <0.0001
day’  -0.5816  0.0650 -8.94 <0.0001
day’’   1.0016  0.1823 5.49 <0.0001
day’’’  -0.2506  0.2113 -1.19 0.2355
age   0.0029  0.0094 0.31 0.7600
age’  -0.0193  0.0083 -2.33 0.0197
lips  -0.0507  0.0521 -0.97 0.3312
lips’   0.0260  0.0472 0.55 0.5823
charlson  -0.1829  0.1393 -1.31 0.1893
charlson’   0.7225  0.7116 1.02 0.3099
charlson’’  -1.6651  1.6355 -1.02 0.3086
sofa   0.0355  0.1074 0.33 0.7410
sofa’  -0.2841  0.3896 -0.73 0.4659
sofa’’   0.5528  0.9472 0.58 0.5595
vitD   0.0686  0.0505 1.36 0.1742
vitD’  -0.0298  0.2232 -0.13 0.8937
vitD’’  -0.1455  0.5537 -0.26 0.7928
treatment=Vitamin D   0.7066  0.5397 1.31 0.1905
vitD × treatment=Vitamin D  -0.0303  0.0757 -0.40 0.6894
vitD’ × treatment=Vitamin D  -0.2233  0.3288 -0.68 0.4971
vitD’’ × treatment=Vitamin D   0.8314  0.8141 1.02 0.3071
anova(g)
Wald Statistics for status
χ2 d.f. P
bstatus 110.06 2 <0.0001
day 1405.28 4 <0.0001
Nonlinear 388.96 3 <0.0001
age 17.09 2 0.0002
Nonlinear 5.44 1 0.0197
lips 1.84 2 0.3994
Nonlinear 0.30 1 0.5823
charlson 7.13 3 0.0678
Nonlinear 1.04 2 0.5954
sofa 20.01 3 0.0002
Nonlinear 2.05 2 0.3594
vitD (Factor+Higher Order Factors) 19.97 6 0.0028
All Interactions 13.72 3 0.0033
Nonlinear (Factor+Higher Order Factors) 19.95 4 0.0005
treatment (Factor+Higher Order Factors) 17.48 4 0.0016
All Interactions 13.72 3 0.0033
vitD × treatment (Factor+Higher Order Factors) 13.72 3 0.0033
Nonlinear 13.68 2 0.0011
Nonlinear Interaction : f(A,B) vs. AB 13.68 2 0.0011
TOTAL NONLINEAR 407.50 13 <0.0001
TOTAL NONLINEAR + INTERACTION 410.86 14 <0.0001
TOTAL 1634.16 23 <0.0001

There is evidence for interaction. Plot the interaction effect.

ggplot(Predict(g, vitD, treatment))

To get the right confidence bands, form contrasts.

k <- contrast(g, list(treatment='Vitamin D', vitD=0:65),
                 list(treatment='Placebo', vitD=0:65))
k <- as.data.frame(k[c('vitD', 'Contrast', 'Lower', 'Upper')])
ggplot(k, aes(x=vitD, y=Contrast)) + geom_point() + 
  geom_errorbar(aes(ymin=Lower, ymax=Upper), width=0)

There seems to be benefit of treatment in the very low range of baseline vitamin D and for those patients with high baseline vitamin D, but not in the middle.

Bayesian Random Effects Model Without Carrying Death Forward

Repeat the last model but using a Bayesian approach with normally distributed random effects for patient level intercepts. This is a compound symmetry type of correlation structure within patient. Before proceeding witih the full model try to find the smallest model that exhibits convergence problems that will be seen later with the full model.

stanSet()
# This is fine:
bnocsnoc <- blrm(status ~ age, data=d, loo=FALSE, refresh=100,
                 file='bnocsnoc.rds')
stanDx(bnocsnoc)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
                        n_eff  Rhat
y>=Vent/ARDS             4146 1.000
y>=In Hospital/Facility  4713 1.000
y>=Home                  4700 0.999
age                      4419 0.999
stanDxplot(bnocsnoc, which='ALL')

# Problem with this one if rsdmean=0.3 or 0.08, minor prob if 0.005
bnocs <- blrm(status ~ age + cluster(id), psigma=1, rsdsd=0.075,
              data=d, loo=FALSE, refresh=100, file='bnocs.rds')
# Previous: psigma=2 rsdmean=0.005; psigma=1 rsdmean=0.08
stanDx(bnocs)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
                        n_eff  Rhat
y>=Vent/ARDS              421 1.003
y>=In Hospital/Facility   194 1.011
y>=Home                   173 1.011
age                       199 1.009
sigmag                    385 1.008
stanDxplot(bnocs, which='ALL')
# Problematic even if binary for Home, but got slightly better with
# t(4, 0, 1) prior for sigma SD
# adding conc=0.5 rsdmean made things significanty worse
# b <- blrm(status ~ age + cluster(id), data=d, iter=1000, refresh=100)
# stanDx(b); stanDxplot(b, which='ALL')

bnocs

Bayesian Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.455 for Intercepts

 blrm(formula = status ~ age + cluster(id), data = d, psigma = 1, 
     rsdsd = 0.075, refresh = 100, loo = FALSE, file = "bnocs.rds")
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  196                 3343                11128 
                 Home 
                19586 
 
Frequencies of Missing Values Due to Each Variable
      status         age cluster(id) 
          50           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 34253 B 0.006 [0.006, 0.006] g 0.742 [0.574, 0.952] C 0.582 [0.582, 0.582]
Draws 4000 gp 0 [0, 0] Dxy 0.163 [0.163, 0.163]
Chains 4 EV 0 [0, 0]
p 1 v 0.435 [0.252, 0.7]
Cluster on id vp 0 [0, 0]
Clusters 1351
σγ 3.2259 [3.0911, 3.3747]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
y≥Vent/ARDS  11.4263  11.4129  0.3425  10.7745  12.1094  1.0000  1.07
y≥In Hospital/Facility   6.3262   6.3142  0.3260   5.6944   6.9784  1.0000  1.04
y≥Home   1.9749   1.9653  0.3233   1.3710   2.6387  1.0000  1.03
age  -0.0410  -0.0408  0.0057  -0.0533  -0.0313  0.0000  0.95
bnocarry <- blrm(status ~ bstatus + rcs(day, 5) + rcs(age, 3) +
                 rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + treatment +
                         cluster(id), data=d, loo=FALSE, refresh=50, file='bnocarry.rds')
bnocarry

Bayesian Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.455 for Intercepts

 blrm(formula = status ~ bstatus + rcs(day, 5) + rcs(age, 3) + 
     rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + treatment + 
     cluster(id), data = d, refresh = 50, loo = FALSE, file = "bnocarry.rds")
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  189                 3169                10862 
                 Home 
                18988 
 
Frequencies of Missing Values Due to Each Variable
      status     bstatus         day         age        lips    charlson 
          50          28           0           0           0        1045 
        sofa   treatment cluster(id) 
           0           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 33208 B 0.006 [0.006, 0.006] g 4.077 [3.892, 4.272] C 0.773 [0.77, 0.776]
Draws 4000 gp 0 [0, 0.001] Dxy 0.546 [0.54, 0.553]
Chains 4 EV 0.098 [0.015, 0.183]
p 17 v 13.183 [11.97, 14.401]
Cluster on id vp 0 [0, 0]
Clusters 1309
σγ 4.1683 [3.9875, 4.3536]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
y≥Vent/ARDS   8.2537   8.2419  0.9665   6.5110  10.2707  1.0000  1.01
y≥In Hospital/Facility  -0.2323  -0.2381  0.9608  -2.0608   1.6383  0.4040  1.02
y≥Home  -7.7042  -7.7114  0.9637  -9.6311  -5.9360  0.0000  1.00
bstatus=On Vent   0.9488   0.9359  0.5272  -0.0824   1.9521  0.9635  1.00
bstatus=In Hospital   4.7748   4.7693  0.5249   3.7904   5.8431  1.0000  1.07
day   0.6975   0.6970  0.0169   0.6649   0.7316  1.0000  0.98
day’  -1.1754  -1.1746  0.1039  -1.3757  -0.9721  0.0000  1.01
day’’   1.8060   1.8027  0.3214   1.1876   2.4208  1.0000  1.00
day’’’  -0.0027   0.0056  0.4316  -0.8599   0.8159  0.5042  1.00
age  -0.0003   0.0000  0.0186  -0.0371   0.0366  0.4990  1.03
age’  -0.0287  -0.0291  0.0166  -0.0595   0.0053  0.0423  0.98
lips  -0.0262  -0.0269  0.1091  -0.2349   0.1944  0.3985  1.07
lips’  -0.0787  -0.0782  0.1009  -0.2673   0.1185  0.2272  0.97
charlson  -0.4306  -0.4217  0.2920  -1.0024   0.1340  0.0685  0.99
charlson’   1.5518   1.5217  1.4831  -1.5054   4.4033  0.8562  1.06
charlson’’  -3.6542  -3.5987  3.4116  -9.9032   3.5933  0.1395  0.94
sofa   0.0069   0.0120  0.1872  -0.3533   0.3709  0.5212  0.89
sofa’  -0.3965  -0.4190  0.7343  -1.8090   1.0533  0.2828  1.11
sofa’’   0.3921   0.4177  1.8166  -3.0990   4.0762  0.5980  0.90
treatment=Vitamin D   0.4395   0.4486  0.2436  -0.0398   0.9269  0.9575  0.98
a <- anova(bnocarry)
a
Relative Explained Variation for status. Approximate total model Wald χ2 used in denominators of REV:6411.9 [6129.9, 6754.7].
REV Lower Upper d.f.
bstatus 0.029 0.023 0.039 2
day 0.977 0.968 0.983 4
Nonlinear 0.246 0.226 0.265 3
age 0.002 0.000 0.004 2
Nonlinear 0.000 0.000 0.002 1
lips 0.001 0.000 0.003 2
Nonlinear 0.000 0.000 0.001 1
charlson 0.002 0.000 0.005 3
Nonlinear 0.000 0.000 0.001 2
sofa 0.007 0.004 0.011 3
Nonlinear 0.001 0.000 0.003 2
treatment 0.001 0.000 0.002 1
TOTAL NONLINEAR 0.248 0.229 0.269 9
TOTAL 1.000 1.000 1.000 17
stanDx(bnocarry)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
                        n_eff  Rhat
y>=Vent/ARDS              336 1.008
y>=In Hospital/Facility   138 1.026
y>=Home                   101 1.038
bstatus=On Vent           110 1.042
bstatus=In Hospital       159 1.010
day                      1357 1.000
day'                     2714 1.001
day''                    3691 1.001
day'''                   4264 1.000
age                       141 1.032
age'                      163 1.022
lips                      121 1.038
lips'                      52 1.076
charlson                  126 1.027
charlson'                 141 1.015
charlson''                122 1.028
sofa                      143 1.024
sofa'                     140 1.024
sofa''                    159 1.017
treatment=Vitamin D       130 1.039
sigmag                    264 1.005
stanDxplot(bnocarry)
ggplot(Predict(bnocarry))

plot(a)

Compute the posterior probability that treatment raises the chance of a patient having a given outcome or better (this is already in the coefficient table above).

P <- PostF(bnocarry, name='orig')
P(`treatment=Vitamin D` > 0)
[1] 0.9575

For comparison, fit a similar Bayesian model adjusting only for time, and one not adjusting for any covariates.

bnocarry2 <- blrm(status ~ rcs(day, 5) + treatment +
                  cluster(id), data=d, loo=FALSE, refresh=50, file='bnocarry2.rds')
bnocarry2

Bayesian Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.455 for Intercepts

 blrm(formula = status ~ rcs(day, 5) + treatment + cluster(id), 
     data = d, refresh = 50, loo = FALSE, file = "bnocarry2.rds")
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  196                 3343                11128 
                 Home 
                19586 
 
Frequencies of Missing Values Due to Each Variable
      status         day   treatment cluster(id) 
          50           0           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 34253 B 0.006 [0.006, 0.006] g 2.627 [2.547, 2.691] C 0.711 [0.71, 0.711]
Draws 4000 gp 0 [0, 0] Dxy 0.422 [0.42, 0.423]
Chains 4 EV 0 [0, 0]
p 5 v 5.784 [5.503, 6.074]
Cluster on id vp 0 [0, 0]
Clusters 1351
σγ 5.0107 [4.7918, 5.2634]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
y≥Vent/ARDS   9.2229   9.2242  0.2474   8.7477   9.7103  1.0000  0.99
y≥In Hospital/Facility   0.7985   0.7963  0.2049   0.4063   1.1985  1.0000  0.94
y≥Home  -6.5560  -6.5525  0.2136  -6.9970  -6.1586  0.0000  0.93
day   0.6986   0.6988  0.0163   0.6660   0.7290  1.0000  1.02
day’  -1.2033  -1.2027  0.1006  -1.4030  -1.0111  0.0000  0.96
day’’   1.8840   1.8821  0.3113   1.2824   2.4896  1.0000  1.02
day’’’  -0.0628  -0.0655  0.4169  -0.8835   0.7454  0.4395  1.02
treatment=Vitamin D   0.2469   0.2403  0.2630  -0.3019   0.7672  0.8315  1.11
bnocarry3 <- blrm(status ~ treatment + cluster(id),
                  data=d, loo=FALSE, refresh=50, file='bnocarry3.rds')
bnocarry3

Bayesian Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.455 for Intercepts

 blrm(formula = status ~ treatment + cluster(id), data = d, refresh = 50, 
     loo = FALSE, file = "bnocarry3.rds")
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  196                 3343                11128 
                 Home 
                19586 
 
Frequencies of Missing Values Due to Each Variable
      status   treatment cluster(id) 
          50           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 34253 B 0.006 [0.006, 0.006] g 0.088 [0, 0.226] C 0.511 [0.473, 0.527]
Draws 4000 gp 0 [0, 0] Dxy 0.023 [-0.053, 0.053]
Chains 4 EV 0 [0, 0]
p 1 v 0.012 [0, 0.051]
Cluster on id vp 0 [0, 0]
Clusters 1351
σγ 3.2963 [3.1478, 3.4361]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
y≥Vent/ARDS   9.0687   9.0673  0.1694   8.7362   9.3921  1.0000  1.00
y≥In Hospital/Facility   3.9605   3.9591  0.1361   3.7138   4.2406  1.0000  1.06
y≥Home  -0.3899  -0.3909  0.1324  -0.6394  -0.1336  0.0008  0.99
treatment=Vitamin D   0.1152   0.1155  0.1948  -0.2774   0.4813  0.7225  0.94

Adjusting only for time and not other baseline covariates, the between-patient outcome heterogeneity is greater as makes sense, and the treatment effect remains diminished. When not even adjusting for time, the value of \(\sigma_\gamma\) ironically becomes lower than from the model with full covariate adjustment, and the treatment effect becomes very small.

Bayesian PO Random Effects Model With Time by Treatment Interaction

We go back to adjusting for the full set of covariates, and allow time to nonlinearly interact with treatment.

btxtime <- blrm(status ~ bstatus + treatment * rcs(day, 4) + rcs(age, 3) +
                 rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) +
                         cluster(id), data=d, loo=FALSE, refresh=50, file='btxtime.rds')
stanDx(btxtime)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
                            n_eff  Rhat
y>=Vent/ARDS                  309 1.013
y>=In Hospital/Facility       149 1.022
y>=Home                       120 1.023
bstatus=On Vent               142 1.027
bstatus=In Hospital           143 1.020
treatment=Vitamin D           154 1.011
day                          1979 1.005
day'                         3281 1.000
day''                        3943 1.001
age                           131 1.026
age'                          119 1.031
lips                           67 1.064
lips'                         149 1.012
charlson                      144 1.038
charlson'                     155 1.033
charlson''                     79 1.066
sofa                          168 1.017
sofa'                         164 1.011
sofa''                        104 1.053
treatment=Vitamin D * day    3919 1.000
treatment=Vitamin D * day'   3880 1.000
treatment=Vitamin D * day''  4018 1.000
sigmag                        298 1.010
stanDxplot(btxtime)

btxtime

Bayesian Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.455 for Intercepts

 blrm(formula = status ~ bstatus + treatment * rcs(day, 4) + rcs(age, 
     3) + rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + cluster(id), 
     data = d, refresh = 50, loo = FALSE, file = "btxtime.rds")
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  189                 3169                10862 
                 Home 
                18988 
 
Frequencies of Missing Values Due to Each Variable
      status     bstatus   treatment         day         age        lips 
          50          28           0           0           0           0 
    charlson        sofa cluster(id) 
        1045           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 33208 B 0.006 [0.006, 0.006] g 4.048 [3.838, 4.234] C 0.777 [0.773, 0.78]
Draws 4000 gp 0 [0, 0.001] Dxy 0.553 [0.546, 0.559]
Chains 4 EV 0.093 [0.017, 0.182]
p 19 v 12.993 [11.63, 14.157]
Cluster on id vp 0 [0, 0]
Clusters 1309
σγ 4.1813 [4.005, 4.3861]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
y≥Vent/ARDS   8.3971   8.3957  1.0256   6.3030  10.3725  1.0000  1.08
y≥In Hospital/Facility  -0.0842  -0.0892  1.0181   -2.2243   1.8647  0.4602  1.06
y≥Home  -7.5591  -7.5620  1.0209   -9.6862  -5.6048  0.0000  1.06
bstatus=On Vent   0.9108   0.9083  0.5018   -0.0981   1.9025  0.9618  0.99
bstatus=In Hospital   4.6828   4.6871  0.5175   3.6220   5.6637  1.0000  0.97
treatment=Vitamin D   0.3346   0.3394  0.2428   -0.1250   0.8277  0.9130  0.95
day   0.6643   0.6642  0.0173   0.6308   0.6991  1.0000  1.00
day’  -0.8991  -0.8994  0.0566   -1.0131  -0.7927  0.0000  1.02
day’’   1.8535   1.8551  0.1629   1.5388   2.1808  1.0000  0.99
age  -0.0009  -0.0008  0.0205   -0.0401   0.0398  0.4825  1.07
age’  -0.0293  -0.0294  0.0183   -0.0636   0.0076  0.0530  0.97
lips  -0.0291  -0.0297  0.1142   -0.2522   0.2005  0.3952  0.99
lips’  -0.0799  -0.0807  0.1080   -0.2896   0.1390  0.2225  0.99
charlson  -0.4172  -0.4136  0.3004   -1.0080   0.1831  0.0795  0.95
charlson’   1.5760   1.5553  1.5598   -1.3238   5.0165  0.8522  1.04
charlson’’  -3.7606  -3.6762  3.5813  -11.9324   2.6047  0.1363  0.97
sofa   0.0327   0.0251  0.1928   -0.3203   0.4309  0.5572  1.06
sofa’  -0.4858  -0.4620  0.7517   -1.9843   0.8965  0.2780  0.96
sofa’’   0.6131   0.5621  1.8695   -2.8167   4.3160  0.6150  1.05
treatment=Vitamin D × day   0.0194   0.0194  0.0227   -0.0254   0.0643  0.8112  0.98
treatment=Vitamin D × day’  -0.0317  -0.0311  0.0790   -0.1862   0.1207  0.3440  0.99
treatment=Vitamin D × day’’   0.0100   0.0070  0.2293   -0.4326   0.4518  0.5112  1.03

Compute treatment contrasts (log odds ratios) as a function of time.

k <- contrast(btxtime, list(treatment='Vitamin D', day=1:28),
                       list(treatment='Placebo',   day=1:28))
k
         bstatus day age lips charlson sofa  Contrast      S.E.        Lower
 1   In Hospital   1  58    5        3    5 0.3540591 0.2338975 -0.129725199
 2   In Hospital   2  58    5        3    5 0.3734883 0.2269386 -0.077890340
 3   In Hospital   3  58    5        3    5 0.3928668 0.2221128 -0.066133606
 4*  In Hospital   4  58    5        3    5 0.4119408 0.2195306 -0.030094376
 5*  In Hospital   5  58    5        3    5 0.4304057 0.2190076 -0.006784936
 6*  In Hospital   6  58    5        3    5 0.4479572 0.2200671 -0.002051644
 7*  In Hospital   7  58    5        3    5 0.4642907 0.2220373  0.006342900
 8*  In Hospital   8  58    5        3    5 0.4791018 0.2241803  0.021734510
 9*  In Hospital   9  58    5        3    5 0.4920858 0.2258263  0.022204824
10*  In Hospital  10  58    5        3    5 0.5029385 0.2265063  0.069770881
11   In Hospital  11  58    5        3    5 0.5113711 0.2261697  0.081901485
12*  In Hospital  12  58    5        3    5 0.5171591 0.2254095  0.079327087
13*  In Hospital  13  58    5        3    5 0.5200937 0.2249417  0.092832635
14*  In Hospital  14  58    5        3    5 0.5199662 0.2252829  0.092554319
15*  In Hospital  15  58    5        3    5 0.5165680 0.2265961  0.075674492
16*  In Hospital  16  58    5        3    5 0.5096903 0.2286418  0.079243684
17*  In Hospital  17  58    5        3    5 0.4991245 0.2308418  0.057896195
18*  In Hospital  18  58    5        3    5 0.4846618 0.2324388  0.034690217
19*  In Hospital  19  58    5        3    5 0.4662043 0.2328201 -0.022454378
20*  In Hospital  20  58    5        3    5 0.4440974 0.2320473 -0.048543807
21*  In Hospital  21  58    5        3    5 0.4187972 0.2306030 -0.065062469
22*  In Hospital  22  58    5        3    5 0.3907600 0.2291814 -0.083756952
23*  In Hospital  23  58    5        3    5 0.3604420 0.2285721 -0.093189231
24*  In Hospital  24  58    5        3    5 0.3282992 0.2295390 -0.131587359
25*  In Hospital  25  58    5        3    5 0.2947879 0.2326931 -0.151226537
26*  In Hospital  26  58    5        3    5 0.2603643 0.2383827 -0.212094615
27*  In Hospital  27  58    5        3    5 0.2254845 0.2466329 -0.251975767
28*  In Hospital  28  58    5        3    5 0.1905286 0.2571906 -0.295994535
         Upper Pr(Contrast>0)
 1   0.7925092         0.9318
 2   0.8223911         0.9455
 3   0.8172218         0.9552
 4*  0.8372061         0.9630
 5*  0.8594823         0.9698
 6*  0.8690354         0.9750
 7*  0.8790902         0.9782
 8*  0.9041111         0.9815
 9*  0.9084629         0.9842
10*  0.9594916         0.9860
11   0.9675514         0.9880
12*  0.9614458         0.9898
13*  0.9749873         0.9895
14*  0.9733888         0.9888
15*  0.9639879         0.9870
16*  0.9742668         0.9835
17*  0.9649165         0.9802
18*  0.9499455         0.9778
19*  0.8948266         0.9712
20*  0.8629274         0.9658
21*  0.8381119         0.9600
22*  0.8120379         0.9490
23*  0.7995944         0.9382
24*  0.7616194         0.9175
25*  0.7460600         0.8935
26*  0.7064714         0.8582
27*  0.7005702         0.8185
28*  0.6997217         0.7748

Redundant contrasts are denoted by *

Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean 

There is a high probability of a treatment effect from day 1 - day 24.

Bayesian Partial Proportional Odds Random Effects Model

Again analyze the “death not carried forward” dataset. Now we relax the proportional hazards assumption with regard to time and treatment. We use the Peterson and Harrell (1990) partial proportional odds model (PPO) which is akin to a Cox model with time-dependent covariates. In general in a longitudinal study one might expect time to not satisfy the PO assumption because the mix of events often changes over time. The departure from PO for time is assumed to be linear whereas the main effect of time uses a 5-knot restricted cubic spline function.

# Check against vgam when clustering is ignored
# Good agreement
if(FALSE) {
  require(VGAM)
  st <- ordered(d$status)
  f <- vgam(st ~ age + treatment, data=d,
            cumulative(reverse=TRUE, parallel=FALSE))
  print(coef(f))
  b <- blrm(status ~ age + treatment, ~ age + treatment, data=d, refresh=100)
  b
}

# Check variance of random effects against mixor
# Don't bother.  Program would not finish even after 2 hours
# require(mixor)
# d$st <- ordered(d$status)
# f <- mixor(st ~ age + day + treatment, id=id, link='logit', data=d)
#f
# sqrt(diag(vcov(f)))
# 

bncppo <- blrm(status ~ bstatus + rcs(age, 4) + rcs(day, 5) +
                        treatment + cluster(id),
                        ~ day + treatment,  # PPO part of model
                                         priorsdppo=0.5, seed=5,
                     data=d, loo=FALSE, iter=3000, refresh=100,
                                         ppairs='bncppo-pairs.png', file='bncppo.rds')

stanDx(bncppo)
Iterations: 3000 on each of 4 chains, with 6000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
                                            n_eff  Rhat
y>=Vent/ARDS                                  480 1.005
y>=In Hospital/Facility                       213 1.021
y>=Home                                       166 1.028
bstatus=On Vent                               226 1.007
bstatus=In Hospital                           256 1.019
age                                           149 1.041
age'                                          225 1.030
age''                                         230 1.017
day                                          3887 1.000
day'                                         6230 1.000
day''                                        7603 1.000
day'''                                       8151 1.000
treatment=Vitamin D                           166 1.024
day:y>=In Hospital/Facility                   367 1.010
treatment=Vitamin D:y>=In Hospital/Facility  8989 1.000
day:y>=Home                                  7463 1.000
treatment=Vitamin D:y>=Home                  9828 1.000
sigmag                                       7349 1.000
stanDxplot(bncppo)

# Tx effects
# Default priors for ppo: -0.1338 0.4077 0.5113 prob 0.35 0.83 0.91
# sd=0.75                 -0.0565 0.3334 0.4256 prob 0.44 0.84 0.91
# sd=0.5                  -0.0109 0.2714 0.3550 prob 0.48 0.85 0.93
# sd=0.5 t prior          -0.0267 0.2783 0.3599 prob 0.46 0.82 0.90 s=4.61
# sd=0.3                   0.0933 0.1746 0.2459 prob 0.63 0.84 0.90
# Above results for t prior no better convergence than with exp

knitr::include_graphics('bncppo-pairs.png')

bncppo

Bayesian Partial Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.455 for Intercepts

 blrm(formula = status ~ bstatus + rcs(age, 4) + rcs(day, 5) + 
     treatment + cluster(id), ppo = ~day + treatment, data = d, 
     priorsdppo = 0.5, iter = 3000, refresh = 100, loo = FALSE, 
     ppairs = "bncppo-pairs.png", file = "bncppo.rds", seed = 5)
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  196                 3343                11128 
                 Home 
                19586 
 
Frequencies of Missing Values Due to Each Variable
      status     bstatus         age         day   treatment cluster(id) 
          50          28           0           0           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 34253 B 0.006 [0.006, 0.006] g 3.901 [3.723, 4.125] C 0.769 [0.767, 0.772]
Draws 6000 gp 0 [0, 0] Dxy 0.538 [0.533, 0.543]
Chains 4 EV 0.004 [0.001, 0.011]
p 10 v 11.986 [10.925, 13.398]
Cluster on id vp 0 [0, 0]
Clusters 1351
σγ 4.2786 [4.0909, 4.4736]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
y≥Vent/ARDS   6.3470   6.3430  1.1994   3.8172   8.6190  1.0000  0.96
y≥In Hospital/Facility  -2.1308  -2.1279  1.1978   -4.4000   0.4173  0.0370  0.96
y≥Home  -9.4880  -9.4819  1.2005  -11.8953  -7.1136  0.0000  0.95
bstatus=On Vent   1.6367   1.6440  0.4829   0.6848   2.5995  0.9997  0.94
bstatus=In Hospital   6.0539   6.0544  0.4562   5.1634   6.9694  1.0000  0.97
age  -0.0099  -0.0097  0.0307   -0.0734   0.0488  0.3675  1.02
age’  -0.0968  -0.0981  0.0705   -0.2268   0.0548  0.0868  1.00
age’’   0.4633   0.4725  0.3518   -0.2844   1.1253  0.9033  0.99
day   0.7006   0.7006  0.0169   0.6686   0.7342  1.0000  1.03
day’  -1.2157  -1.2152  0.1052   -1.4126  -1.0030  0.0000  1.03
day’’   1.9167   1.9182  0.3240   1.2849   2.5505  1.0000  0.99
day’’’  -0.0939  -0.0946  0.4313   -0.9805   0.7162  0.4085  1.04
treatment=Vitamin D   0.2586   0.2582  0.2553   -0.2296   0.7553  0.8388  1.01
day:y≥In Hospital/Facility  -0.0002  -0.0002  0.0003   -0.0009   0.0004  0.2255  0.98
treatment=Vitamin D:y≥In Hospital/Facility   0.0001   0.0002  0.0053   -0.0103   0.0104  0.5102  1.02
day:y≥Home   0.0004   0.0004  0.0003   -0.0002   0.0011  0.8930  1.01
treatment=Vitamin D:y≥Home   0.0002   0.0002  0.0054   -0.0104   0.0109  0.5125  1.01

Plot posterior density and Compute posterior mean, median, and 0.95 highest posterior density interval for the treatment effect on each category of the response or better.

P <- PostF(bncppo, name='orig')
P(`treatment=Vitamin D` > 0)
[1] 0.8388333
plot(P(`treatment=Vitamin D`))
P(`treatment=Vitamin D` + `treatment=Vitamin D:y>=In Hospital/Facility` > 0)
[1] 0.8385
plot(P(`treatment=Vitamin D` + `treatment=Vitamin D:y>=In Hospital/Facility`))
P(`treatment=Vitamin D` + `treatment=Vitamin D:y>=Home` > 0)
[1] 0.839
plot(P(`treatment=Vitamin D` + `treatment=Vitamin D:y>=Home`))

Bayes Constrained Partial Proportional Odds Model

The unconstrained partial PO model above allows the treatment effect to be different for every level of the outcome. Here we use the constrained partial PO model to allow treatment to have a special effect on death but to have the same effect on all other endpoints in terms of odds computed on cumulative probabilities. This will allow us to compute the posterior probability that the treatment affects mortality differently than it affects the other outcomes. Unlike above, we do not at the point place skepticism on the differential effect for death. No interaction with time is included in the following model.

The constrained partial PO model’s constraint function cppo never evaluates the departure from PO at the first level of the outcome variable. That is because the probability that Y is the lowest outcome is one minus the probability that it is at or above the second level. So to model a special effect of treatment on death we need to reverse the order of levels in Y.

drev <- transform(d,
                  status = factor(status, rev(levels(status))))
bdcppor <- blrm(status ~ bstatus + rcs(age, 4) + rcs(day, 5) +
                        treatment + cluster(id),
                        ~ day + treatment,  # PPO part of model
                     cppo=function(y) y == 'Dead',
                                         seed=17,
                     data=drev, loo=FALSE, iter=2000, refresh=100,
                                         file='bdcppor.rds')
stanDx(bdcppor)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
                           n_eff  Rhat
y>=In Hospital/Facility       97 1.062
y>=Vent/ARDS                 150 1.044
y>=Dead                      282 1.018
bstatus=On Vent               69 1.081
bstatus=In Hospital          103 1.019
age                          127 1.024
age'                         127 1.031
age''                        107 1.038
day                         2140 1.000
day'                        3439 0.999
day''                       4284 1.001
day'''                      4012 1.000
treatment=Vitamin D           90 1.032
day x f(y)                   234 1.009
treatment=Vitamin D x f(y)  2532 1.000
sigmag                      2646 1.000
stanDxplot(bdcppor)

bdcppor

Bayesian Constrained Partial Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.455 for Intercepts

 blrm(formula = status ~ bstatus + rcs(age, 4) + rcs(day, 5) + 
     treatment + cluster(id), ppo = ~day + treatment, cppo = function(y) y == 
     "Dead", data = drev, iter = 2000, refresh = 100, loo = FALSE, 
     file = "bdcppor.rds", seed = 17)
 

Frequencies of Responses

                 Home In Hospital/Facility            Vent/ARDS 
                19586                11128                 3343 
                 Dead 
                  196 
 
Frequencies of Missing Values Due to Each Variable
      status     bstatus         age         day   treatment cluster(id) 
          50          28           0           0           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 34253 B 0.235 [0.226, 0.242] g 4.223 [4.008, 4.456] C 0.773 [0.77, 0.776]
Draws 4000 gp 0.454 [0.446, 0.46] Dxy 0.547 [0.54, 0.551]
Chains 4 EV 0.672 [0.634, 0.696]
p 10 v 14.068 [12.632, 15.621]
Cluster on id vp 0.168 [0.159, 0.174]
Clusters 1351
σγ 4.6924 [4.4862, 4.9008]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
y≥In Hospital/Facility  10.6406  10.6218  1.1681   8.3114  12.9286  1.0000  1.01
y≥Vent/ARDS   2.8439   2.8270  1.1679   0.6342   5.2562  0.9942  0.99
y≥Dead  -8.4408  -8.4403  1.1841  -10.7170  -6.0848  0.0000  0.97
bstatus=On Vent  -1.9087  -1.9044  0.5141   -2.8471  -0.8882  0.0000  1.00
bstatus=In Hospital  -6.6774  -6.6767  0.5129   -7.6070  -5.6711  0.0000  1.03
age   0.0090   0.0091  0.0285   -0.0467   0.0623  0.6205  1.01
age’   0.1081   0.1062  0.0682   -0.0290   0.2389  0.9412  1.01
age’’  -0.5216  -0.5209  0.3709   -1.2376   0.1963  0.0792  1.00
day  -0.7769  -0.7769  0.0180   -0.8105  -0.7395  0.0000  0.95
day’   1.4323   1.4322  0.1080   1.2128   1.6395  1.0000  0.98
day’’  -2.3983  -2.4030  0.3309   -3.0452  -1.7413  0.0000  1.03
day’’’   0.4463   0.4604  0.4394   -0.3979   1.3320  0.8490  0.99
treatment=Vitamin D  -0.3062  -0.3047  0.2607   -0.8068   0.1824  0.1232  1.00
day x f(y)   0.0232   0.0232  0.0008   0.0217   0.0247  1.0000  0.94
treatment=Vitamin D x f(y)   0.0323   0.0324  0.0176   -0.0008   0.0677  0.9702  1.02

The probability that vitamin D affects death differently than how it affects the non-fatal endpoints is 0.97. Let’s estimate the treatment effect separately for nonfatal and fatal outcomes, in that order.

contrast(bdcppor, list(treatment='Vitamin D'),
                  list(treatment='Placebo'),
         ycut='In Hospital/Facility')
                                  bstatus age  day  Contrast      S.E.
1 status=In Hospital/Facility In Hospital  58 14.5 -0.308613 0.2607144
                                   Lower     Upper Pr(Contrast>0)
1 status=In Hospital/Facility -0.8151987 0.1747911         0.1212

Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean 
# Note: ycut='Vent/ARDS' would have given the same result as non-PO did
# not kick in until the last outcome level

contrast(bdcppor, list(treatment='Vitamin D'),
                  list(treatment='Placebo'),
         ycut='Dead')
                  bstatus age  day  Contrast      S.E.     Lower     Upper
1 status=Dead In Hospital  58 14.5 0.1194458 0.3425427 -0.558436 0.7623403
              Pr(Contrast>0)
1 status=Dead         0.6368

Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean 

The probability of benefit of vitamin D on nonfatal outcomes is 1 - 0.12 = 0.88. From the same model the probability of increased mortality is 0.64.

Bayes Binary Longitudinal Model for Death Alone

In what follows we dichotomize the ordinal outcome at death vs. alive and fit a Bayesian longitudinal binary model allowing for a time x treatment interaction, after first fitting a model with no interaction. Because of a lower effective sample size with a binary event, the model is simplified.

d$dead <- 1 * (d$status == 'Dead')
bdeadnoia <- blrm(dead ~ bstatus + rcs(age, 3) + treatment +
                     cluster(id),
                                         seed=13,
                     data=d, loo=FALSE, iter=2000, refresh=100,
                                         file='bdeadnoia.rds')

stanDx(bdeadnoia)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
                    n_eff  Rhat
Intercept             888 1.005
bstatus=On Vent      2296 1.000
bstatus=In Hospital  2030 1.003
age                  2758 1.001
age'                 3543 1.000
treatment=Vitamin D  3467 1.000
sigmag                651 1.007
stanDxplot(bdeadnoia, 'ALL')

bdeadnoia

Bayesian Logistic Model

Dirichlet Priors With Concentration Parameter 0.541 for Intercepts

 blrm(formula = dead ~ bstatus + rcs(age, 3) + treatment + cluster(id), 
     data = d, iter = 2000, refresh = 100, loo = FALSE, file = "bdeadnoia.rds", 
     seed = 13)
 
Frequencies of Missing Values Due to Each Variable
        dead     bstatus         age   treatment cluster(id) 
          50          28           0           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 34253 B 0.006 [0.006, 0.006] g 1.077 [0.821, 1.363] C 0.671 [0.663, 0.676]
0 34057 gp 0.002 [0.001, 0.003] Dxy 0.342 [0.326, 0.351]
1 196 EV 0.003 [0.001, 0.008]
Draws 4000 v 0.948 [0.529, 1.465]
Chains 4 vp 0 [0, 0]
p 5
Cluster on id
Clusters 1351
σγ 2.4389 [1.9401, 2.9362]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
Intercept  -7.9477  -7.9004  1.0097  -10.1054  -6.0666  0.0000  0.87
bstatus=On Vent  -1.1942  -1.1887  0.4094   -1.9915  -0.3906  0.0018  0.96
bstatus=In Hospital  -2.1443  -2.1351  0.3984   -2.9104  -1.3792  0.0000  0.95
age   0.0471   0.0467  0.0192   0.0099   0.0848  0.9945  1.10
age’  -0.0051  -0.0052  0.0175   -0.0378   0.0300  0.3902  1.03
treatment=Vitamin D   0.2355   0.2314  0.2393   -0.2283   0.6927  0.8360  1.03

The posterior probability that vitamin D increases mortality is 0.836 with this binary logistic model.

Now with the time interaction included.

d$dead <- 1 * (d$status == 'Dead')
bdead <- blrm(dead ~ bstatus + rcs(age, 3) + treatment * rcs(day, 3) +
                     cluster(id),
                                         seed=13,
                     data=d, loo=FALSE, iter=2000, refresh=100,
                                         file='bdead.rds')

stanDx(bdead)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
                           n_eff  Rhat
Intercept                     21 1.138
bstatus=On Vent              244 1.026
bstatus=In Hospital           71 1.065
age                          204 1.034
age'                        2239 1.000
treatment=Vitamin D          801 1.005
day                           40 1.067
day'                         326 1.021
treatment=Vitamin D * day   1272 1.002
treatment=Vitamin D * day'  1737 1.001
sigmag                        16 1.155
stanDxplot(bdead, 'ALL')

bdead

Bayesian Logistic Model

Dirichlet Priors With Concentration Parameter 0.541 for Intercepts

 blrm(formula = dead ~ bstatus + rcs(age, 3) + treatment * rcs(day, 
     3) + cluster(id), data = d, iter = 2000, refresh = 100, loo = FALSE, 
     file = "bdead.rds", seed = 13)
 
Frequencies of Missing Values Due to Each Variable
        dead     bstatus         age   treatment         day cluster(id) 
          50          28           0           0           0           0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 34253 B 0.006 [0.006, 0.006] g 0.978 [0.721, 1.135] C 0.708 [0.691, 0.717]
0 34057 gp 0.004 [0.002, 0.006] Dxy 0.415 [0.382, 0.434]
1 196 EV 0.005 [0.002, 0.008]
Draws 4000 v 0.763 [0.419, 1.028]
Chains 4 vp 0 [0, 0]
p 9
Cluster on id
Clusters 1351
σγ 1.0546 [0.0035, 1.9368]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
Intercept  -5.7168  -5.6153  0.9047  -7.5299  -4.0436  0.0000  0.75
bstatus=On Vent  -0.8715  -0.8626  0.2965  -1.4289  -0.3046  0.0003  0.94
bstatus=In Hospital  -1.5011  -1.4684  0.3483  -2.2361  -0.8965  0.0000  0.78
age   0.0353   0.0346  0.0139   0.0099   0.0644  0.9960  1.15
age’  -0.0052  -0.0054  0.0121  -0.0289   0.0184  0.3295  0.97
treatment=Vitamin D  -0.0892  -0.0889  0.3484  -0.7412   0.6013  0.4008  1.04
day  -0.0795  -0.0818  0.0408  -0.1624  -0.0030  0.0345  1.11
day’   0.0206   0.0214  0.0551  -0.0859   0.1308  0.6580  0.99
treatment=Vitamin D × day   0.0272   0.0267  0.0458  -0.0598   0.1167  0.7295  1.02
treatment=Vitamin D × day’  -0.0059  -0.0050  0.0693  -0.1387   0.1305  0.4722  0.97
ggplot(Predict(bdead, day, treatment))

Compute treatment contrasts (log odds ratios) as a function of time.

k <- contrast(bdead, list(treatment='Vitamin D', day=1:28),
                     list(treatment='Placebo',   day=1:28))
k
         bstatus age day     Contrast      S.E.       Lower     Upper
 1   In Hospital  58   1 -0.061995077 0.3118912 -0.63051747 0.5679513
 2   In Hospital  58   2 -0.034768495 0.2781012 -0.55512603 0.5300278
 3*  In Hospital  58   3 -0.007541913 0.2481887 -0.49214380 0.4909035
 4   In Hospital  58   4  0.019673499 0.2237445 -0.39217686 0.4911184
 5*  In Hospital  58   5  0.046821894 0.2066724 -0.33140155 0.4760852
 6*  In Hospital  58   6  0.073836253 0.1982867 -0.29800301 0.4649681
 7*  In Hospital  58   7  0.100649558 0.1985063 -0.28120825 0.4895067
 8*  In Hospital  58   8  0.127194793 0.2056764 -0.27276669 0.5295627
 9*  In Hospital  58   9  0.153404938 0.2172100 -0.27716379 0.5710971
10*  In Hospital  58  10  0.179212976 0.2304379 -0.25741483 0.6293617
11*  In Hospital  58  11  0.204551890 0.2430959 -0.24757546 0.6959855
12*  In Hospital  58  12  0.229354662 0.2534465 -0.24426898 0.7420478
13*  In Hospital  58  13  0.253554273 0.2602500 -0.24087243 0.7671111
14*  In Hospital  58  14  0.277083706 0.2627400 -0.21469462 0.8072200
15*  In Hospital  58  15  0.299897351 0.2608493 -0.18826472 0.8311799
16*  In Hospital  58  16  0.322035234 0.2557943 -0.14843419 0.8565006
17*  In Hospital  58  17  0.343558787 0.2495172 -0.13938872 0.8458842
18*  In Hospital  58  18  0.364529443 0.2444885 -0.08721785 0.8703701
19*  In Hospital  58  19  0.385008635 0.2435442 -0.07064278 0.8740898
20*  In Hospital  58  20  0.405057796 0.2494226 -0.10760083 0.8651527
21*  In Hospital  58  21  0.424738359 0.2640553 -0.08014772 0.9482028
22*  In Hospital  58  22  0.444111757 0.2880380 -0.12805692 0.9852111
23*  In Hospital  58  23  0.463239424 0.3206903 -0.16041240 1.0776938
24*  In Hospital  58  24  0.482182791 0.3605736 -0.16637156 1.2367042
25*  In Hospital  58  25  0.501003292 0.4060202 -0.22888248 1.3578278
26*  In Hospital  58  26  0.519762361 0.4554347 -0.29543608 1.4792000
27*  In Hospital  58  27  0.538511190 0.5074826 -0.38170386 1.5979949
28*  In Hospital  58  28  0.557260020 0.5613802 -0.45143753 1.7464363
     Pr(Contrast>0)
 1           0.4185
 2           0.4440
 3*          0.4815
 4           0.5298
 5*          0.5862
 6*          0.6480
 7*          0.6900
 8*          0.7255
 9*          0.7615
10*          0.7805
11*          0.7972
12*          0.8140
13*          0.8390
14*          0.8578
15*          0.8835
16*          0.9015
17*          0.9200
18*          0.9365
19*          0.9448
20*          0.9500
21*          0.9510
22*          0.9412
23*          0.9275
24*          0.9085
25*          0.8948
26*          0.8718
27*          0.8548
28*          0.8393

Redundant contrasts are denoted by *

Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean 

GEE Model for Death with Vitamin D Interaction

f <- lrm(dead ~ bstatus + rcs(age, 3) + rcs(day, 3) + rcs(vitD, 3) * treatment,
                     data=d, x=TRUE, y=TRUE)
g <- robcov(f, d$id)
anova(g)
Wald Statistics for dead
χ2 d.f. P
bstatus 30.73 2 <0.0001
age 34.31 2 <0.0001
Nonlinear 0.85 1 0.3577
day 49.58 2 <0.0001
Nonlinear 1.02 1 0.3125
vitD (Factor+Higher Order Factors) 19.30 4 0.0007
All Interactions 4.86 2 0.0882
Nonlinear (Factor+Higher Order Factors) 2.17 2 0.3387
treatment (Factor+Higher Order Factors) 5.79 3 0.1221
All Interactions 4.86 2 0.0882
vitD × treatment (Factor+Higher Order Factors) 4.86 2 0.0882
Nonlinear 0.05 1 0.8157
Nonlinear Interaction : f(A,B) vs. AB 0.05 1 0.8157
TOTAL NONLINEAR 4.00 4 0.4054
TOTAL NONLINEAR + INTERACTION 12.28 5 0.0311
TOTAL 132.51 11 <0.0001

There is mild evidence for an interaction. Look at the contrasts.

k <- contrast(g, list(treatment='Vitamin D', vitD=0:65),
                 list(treatment='Placebo', vitD=0:65))
k <- as.data.frame(k[c('vitD', 'Contrast', 'Lower', 'Upper')])
ggplot(k, aes(x=vitD, y=Contrast)) + geom_point() + 
  geom_errorbar(aes(ymin=Lower, ymax=Upper), width=0)

Here higher contrasts (higher relative log odds) is bad. Vitamin D treatment seems to move mortality in the wrong direction for patients with low baseline vitamin D, if anything.

Serial Conditional Markov Ordinal Model

Here we use a first order Markov process where the PO model on a given day is conditional on the ordinal outcome on the previous day as well as on the other variables. First we need to create a 1-day lagged outcome status. Unfortunately, since vent and ARDS were pooled for the follow-up status we need to pool them for the baseline status, resulting in some loss of information. Same for in hospital vs. being at another healthcare facility.

d <- dnc
s <- d$bstatus
levels(s) <- list('Vent/ARDS'='ARDS', 'Vent/ARDS'='On Vent', 'In Hospital/Facility'='In Hospital')
table(d$bstatus, s)
             s
              Vent/ARDS In Hospital/Facility
  ARDS             2130                    0
  On Vent          9134                    0
  In Hospital         0                23011
# Now expand baseline levels to include 1 other levels occurring in follow-up
s <- factor(s, setdiff(levels(d$status), 'Dead'))
table(d$bstatus, s)
             s
              Vent/ARDS In Hospital/Facility  Home
  ARDS             2130                    0     0
  On Vent          9134                    0     0
  In Hospital         0                23011     0
d$bstatus <- s
require(data.table)
setDT(d)
d <- d[order(id, day), ]
d[, pstatus := shift(status), by=id]
d[day == 1, pstatus := bstatus]
d$pstatus <- factor(d$pstatus, setdiff(levels(d$pstatus), 'Dead'))
# View(d[, c('id', 'day', 'bstatus', 'status', 'pstatus')])

Fit a fairly saturated model that allows

  • transitions to depend on a flexible function of time
  • the effect of the previous state to change with time
  • the treatment effect to change with time
  • a weird effect where the effect of treatment may differ according to the previous outcome state
dd <- datadist(d); options(datadist='dd')
f <- lrm(status ~ rcs(day, 4) * (pstatus + treatment + rcs(age, 3) + lips + charlson + sofa) +
         pstatus * treatment,
         data=d, maxit=30)
f

Logistic Regression Model

 lrm(formula = status ~ rcs(day, 4) * (pstatus + treatment + rcs(age, 
     3) + lips + charlson + sofa) + pstatus * treatment, data = d, 
     maxit = 30)
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  189                 3169                10862 
                 Home 
                18988 
 
Frequencies of Missing Values Due to Each Variable
    status       day   pstatus treatment       age      lips  charlson      sofa 
        50         0        49         0         0         0      1045         0 
 
Model Likelihood
Ratio Test
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 33208 LR χ2 49509.96 R2 0.915 C 0.982
max |∂log L/∂β| 1×10-7 d.f. 37 g 5.850 Dxy 0.963
Pr(>χ2) <0.0001 gr 347.347 γ 0.965
gp 0.174 τa 0.537
Brier 0.014
β S.E. Wald Z Pr(>|Z|)
y≥Vent/ARDS   4.9453  0.4358 11.35 <0.0001
y≥In Hospital/Facility  -0.1497  0.4254 -0.35 0.7249
y≥Home  -6.8416  0.4313 -15.86 <0.0001
day  -0.1961  0.0883 -2.22 0.0264
day’   0.0921  0.3535 0.26 0.7945
day’’   0.5582  1.0853 0.51 0.6071
pstatus=In Hospital/Facility   5.4912  0.1630 33.70 <0.0001
pstatus=Home  16.4244  2.2372 7.34 <0.0001
treatment=Vitamin D   0.2098  0.1405 1.49 0.1353
age  -0.0019  0.0093 -0.20 0.8416
age’   0.0013  0.0093 0.14 0.8908
lips  -0.0846  0.0213 -3.98 <0.0001
charlson  -0.0777  0.0294 -2.64 0.0083
sofa  -0.0982  0.0199 -4.95 <0.0001
day × pstatus=In Hospital/Facility   0.1522  0.0286 5.31 <0.0001
day’ × pstatus=In Hospital/Facility  -0.2329  0.1137 -2.05 0.0406
day’’ × pstatus=In Hospital/Facility   0.0035  0.3484 0.01 0.9920
day × pstatus=Home  -0.1601  0.2875 -0.56 0.5777
day’ × pstatus=Home   0.7500  0.6896 1.09 0.2768
day’’ × pstatus=Home  -2.5635  1.7128 -1.50 0.1345
day × treatment=Vitamin D   0.0012  0.0264 0.05 0.9626
day’ × treatment=Vitamin D  -0.0461  0.1040 -0.44 0.6579
day’’ × treatment=Vitamin D   0.1391  0.3179 0.44 0.6616
day × age   0.0009  0.0019 0.47 0.6373
day’ × age  -0.0042  0.0076 -0.55 0.5826
day’’ × age   0.0118  0.0233 0.51 0.6127
day × age’  -0.0021  0.0019 -1.15 0.2505
day’ × age’   0.0100  0.0073 1.37 0.1706
day’’ × age’  -0.0294  0.0220 -1.33 0.1827
day × lips   0.0111  0.0043 2.55 0.0109
day’ × lips  -0.0225  0.0171 -1.32 0.1877
day’’ × lips   0.0424  0.0519 0.82 0.4141
day × charlson   0.0025  0.0058 0.42 0.6722
day’ × charlson  -0.0049  0.0230 -0.22 0.8296
day’’ × charlson   0.0177  0.0703 0.25 0.8015
day × sofa   0.0092  0.0039 2.35 0.0188
day’ × sofa  -0.0164  0.0151 -1.08 0.2786
day’’ × sofa   0.0374  0.0458 0.82 0.4142
pstatus=In Hospital/Facility × treatment=Vitamin D  -0.1186  0.1077 -1.10 0.2707
pstatus=Home × treatment=Vitamin D  -0.4551  0.3974 -1.15 0.2521
anova(f)
Wald Statistics for status
χ2 d.f. P
day (Factor+Higher Order Factors) 275.08 27 <0.0001
All Interactions 158.49 24 <0.0001
Nonlinear (Factor+Higher Order Factors) 208.08 18 <0.0001
pstatus (Factor+Higher Order Factors) 6515.10 10 <0.0001
All Interactions 119.70 8 <0.0001
treatment (Factor+Higher Order Factors) 6.99 6 0.3215
All Interactions 4.69 5 0.4553
age (Factor+Higher Order Factors) 9.89 8 0.2727
All Interactions 3.54 6 0.7381
Nonlinear (Factor+Higher Order Factors) 4.33 4 0.3638
lips (Factor+Higher Order Factors) 20.47 4 0.0004
All Interactions 11.66 3 0.0086
charlson (Factor+Higher Order Factors) 28.81 4 <0.0001
All Interactions 1.43 3 0.6982
sofa (Factor+Higher Order Factors) 41.28 4 <0.0001
All Interactions 16.47 3 0.0009
day × pstatus (Factor+Higher Order Factors) 117.94 6 <0.0001
Nonlinear 115.35 4 <0.0001
Nonlinear Interaction : f(A,B) vs. AB 115.35 4 <0.0001
day × treatment (Factor+Higher Order Factors) 2.03 3 0.5654
Nonlinear 0.20 2 0.9065
Nonlinear Interaction : f(A,B) vs. AB 0.20 2 0.9065
day × age (Factor+Higher Order Factors) 3.54 6 0.7381
Nonlinear 3.53 5 0.6195
Nonlinear Interaction : f(A,B) vs. AB 3.53 5 0.6195
f(A,B) vs. Af(B) + Bg(A) 1.88 2 0.3901
Nonlinear Interaction in day vs. Af(B) 3.28 4 0.5129
Nonlinear Interaction in age vs. Bg(A) 2.10 3 0.5512
day × lips (Factor+Higher Order Factors) 11.66 3 0.0086
Nonlinear 8.19 2 0.0166
Nonlinear Interaction : f(A,B) vs. AB 8.19 2 0.0166
day × charlson (Factor+Higher Order Factors) 1.43 3 0.6982
Nonlinear 0.09 2 0.9559
Nonlinear Interaction : f(A,B) vs. AB 0.09 2 0.9559
day × sofa (Factor+Higher Order Factors) 16.47 3 0.0009
Nonlinear 2.91 2 0.2330
Nonlinear Interaction : f(A,B) vs. AB 2.91 2 0.2330
pstatus × treatment (Factor+Higher Order Factors) 2.14 2 0.3436
TOTAL NONLINEAR 210.40 20 <0.0001
TOTAL INTERACTION 160.60 26 <0.0001
TOTAL NONLINEAR + INTERACTION 237.87 29 <0.0001
TOTAL 6776.51 37 <0.0001
ggplot(Predict(f, day, pstatus))

There was little evidence for a treatment effect at any time, and little evidence that the effect of treatment on state transitions varied with time. There is also little evidence that the treatment effect differed for different previous states.

The graph above shows that being at home is almost an absorbing state, and that the effect of previous state varies with time. Some of that may be due to some odd data: there are 57 patients on ventilator on day 27 but not on day 28.

For now let’s simplify the model by having previous state as the only factor interacting with time, and removing the interactions with treatment. Compare the variances of parameter estimates with cluster sandwich robust estimates.

f <- lrm(status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 3) + lips + charlson + sofa,
         data=d, maxit=30, x=TRUE, y=TRUE)
g <- robcov(f, d$id)
round(diag(vcov(f)) / diag(vcov(g)), 2)
                        y>=Vent/ARDS              y>=In Hospital/Facility 
                                0.86                                 0.92 
                             y>=Home                                  day 
                                0.92                                 1.54 
                                day'                                day'' 
                                1.83                                 1.82 
        pstatus=In Hospital/Facility                         pstatus=Home 
                                0.95                                 0.70 
                 treatment=Vitamin D                                  age 
                                1.07                                 0.95 
                                age'                                 lips 
                                1.08                                 1.11 
                            charlson                                 sofa 
                                1.10                                 1.10 
  day * pstatus=In Hospital/Facility  day' * pstatus=In Hospital/Facility 
                                1.42                                 1.55 
day'' * pstatus=In Hospital/Facility                   day * pstatus=Home 
                                1.58                                 0.70 
                 day' * pstatus=Home                 day'' * pstatus=Home 
                                0.75                                 0.80 

The two variance estimates are similar, giving an indirect indication that the assumption of conditional independence (conditioning on previous state) is reasonable.

Repeat the model truncating the data on day 27 to see how much of the time x previous state interaction resulted from the drop in number of patients on ventilator on day 28.

fm <- lrm(status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 3) + lips + charlson + sofa,
         data=d, maxit=30, subset=day < 28)
fm

Logistic Regression Model

 lrm(formula = status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 
     3) + lips + charlson + sofa, data = d, subset = day < 28, 
     maxit = 30)
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  188                 3169                10643 
                 Home 
                18088 
 
Frequencies of Missing Values Due to Each Variable
    status       day   pstatus treatment       age      lips  charlson      sofa 
        48         0        47         0         0         0      1010         0 
 
Model Likelihood
Ratio Test
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 32088 LR χ2 48589.20 R2 0.918 C 0.982
max |∂log L/∂β| 8×10-6 d.f. 17 g 5.940 Dxy 0.964
Pr(>χ2) <0.0001 gr 380.041 γ 0.966
gp 0.180 τa 0.542
Brier 0.014
β S.E. Wald Z Pr(>|Z|)
y≥Vent/ARDS   4.3659  0.2183 20.00 <0.0001
y≥In Hospital/Facility  -0.7502  0.1975 -3.80 0.0001
y≥Home  -7.6241  0.2151 -35.44 <0.0001
day  -0.0990  0.0238 -4.15 <0.0001
day’   0.1121  0.1026 1.09 0.2744
day’’  -0.1678  0.2674 -0.63 0.5304
pstatus=In Hospital/Facility   5.5857  0.1560 35.80 <0.0001
pstatus=Home  16.6641  2.4397 6.83 <0.0001
treatment=Vitamin D   0.0693  0.0518 1.34 0.1804
age   0.0005  0.0038 0.14 0.8857
age’  -0.0052  0.0037 -1.44 0.1511
lips  -0.0245  0.0086 -2.85 0.0044
charlson  -0.0651  0.0116 -5.61 <0.0001
sofa  -0.0386  0.0077 -5.02 <0.0001
day × pstatus=In Hospital/Facility   0.1924  0.0300 6.41 <0.0001
day’ × pstatus=In Hospital/Facility  -0.6227  0.1272 -4.90 <0.0001
day’’ × pstatus=In Hospital/Facility   1.3921  0.3311 4.21 <0.0001
day × pstatus=Home  -0.1821  0.3272 -0.56 0.5779
day’ × pstatus=Home   0.5840  0.8252 0.71 0.4791
day’’ × pstatus=Home  -1.3104  1.7679 -0.74 0.4586
anova(fm)
Wald Statistics for status
χ2 d.f. P
day (Factor+Higher Order Factors) 183.36 9 <0.0001
All Interactions 47.89 6 <0.0001
Nonlinear (Factor+Higher Order Factors) 60.34 6 <0.0001
pstatus (Factor+Higher Order Factors) 6266.67 8 <0.0001
All Interactions 47.89 6 <0.0001
treatment 1.79 1 0.1804
age 6.05 2 0.0486
Nonlinear 2.06 1 0.1511
lips 8.11 1 0.0044
charlson 31.44 1 <0.0001
sofa 25.21 1 <0.0001
day × pstatus (Factor+Higher Order Factors) 47.89 6 <0.0001
Nonlinear 38.56 4 <0.0001
Nonlinear Interaction : f(A,B) vs. AB 38.56 4 <0.0001
TOTAL NONLINEAR 62.40 7 <0.0001
TOTAL NONLINEAR + INTERACTION 70.00 9 <0.0001
TOTAL 6448.81 17 <0.0001
ggplot(Predict(fm, day, pstatus))

The interaction effect was cut in half but is still there. However, to put its strength into context, the time by previous status interaction explains only 0.008 of the total amount of outcome variation explained by the status + status x time interaction combination.

We see that being at home is less of an absorbing state for later times, and that the influence of previously being in Vent/ARDS on the current state lessens with time.

Computing Unconditional Estimates

One can uncondition the previous model to obtain unconditional treatment effects, although these will be covariate- and time-dependent. We can use simulation to approximate the result to any desired degree of accuracy. The fast exact approach will be presented later, but the simulation method allows one to estimate complex quantities that are functions of individual conditional distributions at a specific follow-up time as well as functions of the multivariate distribution. We specify covariate settings (here we use representative values each covariate, just for convenience, using a method described below) and vary the treatment assignment. Here are the steps involved:

  1. The initial state is always conditioned on, so is not marginalized out. Separate simulations can be done for each initial state. For now we just set the initial state to in hospital, the most prevalent state.
  2. Given this initial state, use the fitted model to get the conditional distribution of Y at (post-randomization) day 1 given the initial state, covariate settings, and treatment.
  3. Simulate outcomes for day 1 for 50,000 patients, for each treatment.
  4. For each of these patients, simulate their outcome for day 2 conditional on their simulated outcome for day 1.
  5. Likewise for days 3, …, 27.
  6. Death is an absorbing state so once death occurs all future status values are also set to death.
  7. For each patient compute the condition of interest from the vector of 27 Y values. Choices include the time until the patient was able to go home, the absolute risk of having an outcome \(Y \geq y\) at day 27, and the odds of the same.
  8. For a measure computed in the last step, for each treatment, compute a summary measure such as a mean or median.
  9. Take a difference or ratio of estimates computed in the previous step to summarize the treatment effect on the parameter of interest.

This approach does not factor in uncertainties in parameter estimates, so provides only a point estimate. A Bayesian approach would easily remedy this by computing the Bayesian posterior predictive distribution, i.e., by simulating each dataset for a different posterior sample of the model parameters. In the frequentist example that follows we ignore such uncertainties and compute the probability of being at home as a function of day and treatment, vitamin D:control odds ratio for being at home on days 1,2, …, 27, and the median time until being discharged home.

Note: The probability of death by a given day (these are cumulative probabilities since we treated death as an absorbing state) is lower for Vitamin D, in disagreement with the raw data. The estimates shown make the proportional odds assumption, which assumes consistency in treatment effects across cutoffs of outcome severity. As shown earlier, there is strong evidence against such consistency.

Start with computation of a set of representative covariate values that will be used when getting estimates related to transformed parameters. This is done by fitting an ordinal model for day 10 status alone, and finding covariate settings that give predicted values that are close to the unadjusted estimates for placebo. Of the qualifying settings, one that is subjectively closer to the center of the data distribution is chosen.

# Go back to the dataset that carried death forward, and analyze day 10
f10 <- lrm(status ~ bstatus + treatment, data=dcf, subset=day==10)
priskmarg <- predict(f10, data.frame(treatment='Placebo', bstatus='In Hospital'),
                 type='fitted')['y>=Home']
priskmarg
  y>=Home 
0.6217469 
# Compare with simple estimate not assuming effect of trt & bstatus are additive in log odds
priskmargEmpirical <-
  with(subset(dcf, day == 10 & treatment == 'Placebo' & bstatus=='In Hospital'),
     mean(status == 'Home', na.rm=TRUE))
# Now add covariates
f10c <- lrm(status ~ bstatus + treatment + rcs(age, 3) + lips + charlson + sofa,
            data=dcf, subset=day==10)
w <- subset(dcf, day==10, select=c(age, lips, charlson, sofa))
w <- cbind(w, treatment='Placebo', bstatus='In Hospital')
prisk <- predict(f10c, w, type='fitted')[, 'y>=Home']
mean(prisk, na.rm=TRUE)
[1] 0.5769679
diff <- abs(prisk - priskmargEmpirical)
w$diff <- diff
i <- ! is.na(prisk) & diff < 0.002
sum(i)
[1] 14
setDT(w)
wi <- w[i, ]
wi <- wi[order(diff), ]
wi
    age lips charlson sofa treatment     bstatus         diff
 1:  55  7.0        3    5   Placebo In Hospital 6.198609e-05
 2:  35  8.5        0    8   Placebo In Hospital 9.360643e-05
 3:  31  4.0        6    4   Placebo In Hospital 1.416378e-04
 4:  62  7.0        4    3   Placebo In Hospital 2.850394e-04
 5:  68  5.5        2    5   Placebo In Hospital 3.229592e-04
 6:  26  5.0        0   10   Placebo In Hospital 4.382999e-04
 7:  66  4.0        5    3   Placebo In Hospital 4.643863e-04
 8:  78  3.0        6    0   Placebo In Hospital 8.246585e-04
 9:  29  8.0        2    6   Placebo In Hospital 8.843973e-04
10:  70  1.0        5    4   Placebo In Hospital 1.029117e-03
11:  70  8.0        4    1   Placebo In Hospital 1.511116e-03
12:  56  9.0        1    6   Placebo In Hospital 1.640537e-03
13:  41  6.0        4    5   Placebo In Hospital 1.847777e-03
14:  82  1.5        5    1   Placebo In Hospital 1.907995e-03
# Best match is close to the center
repcov <- wi[1, 1:4]
# For some reason when evaluating the result of Function(fit) with e.g. age=repcov$age
# the arithmetic was wrong.  Need to remove data.frame/data.table and label attributes
repcov <- lapply(repcov, function(x) {x <- unclass(x); attr(x, 'label') <- NULL; x})

# Compute median baseline covariates, using only the first record per patient
co <- all.vars(~ age + lips + charlson + sofa)
setDT(dcf)
m <- dcf[day == 10, sapply(.SD, function(x) list(median = median(x, na.rm=TRUE))),
       .SDcols=co]
m
   age.median lips.median charlson.median sofa.median
1:         58           5               3           5

Derive an R function that takes the comprehensive fitted model and computes the linear predictor except for the proportional odds model intercepts. Check the function with some examples.

linpred <- Function(fm)
# Get intercepts
alpha <- coef(fm)[1:3]
# Create a function that computes expit(linpred()) at representative covariate values
# for a given intercept
# This is the conditional probability that Y >= y given previous y = pstatus
pcond <- function(y, pstatus, treatment, day) 
  plogis(alpha[paste0('y>=', y)] + linpred(day=day, pstatus=pstatus, treatment=treatment,
                      age=repcov$age, lips=repcov$lips, charlson=repcov$charlson,
                      sofa=repcov$sofa))
# Test:
pp <- pcond('Home', 'In Hospital/Facility', treatment='Vitamin D', day=1:27)
round(pp, 3)
 [1] 0.080 0.088 0.095 0.103 0.110 0.116 0.121 0.122 0.120 0.114 0.106 0.096
[13] 0.086 0.077 0.068 0.061 0.055 0.051 0.048 0.047 0.046 0.047 0.048 0.050
[25] 0.052 0.054 0.057
# Compare with proportions of same transitions ignoring covariates and treatment
prop <- numeric(27)
for(t in 1:27) {
  i <- with(d, day == t & pstatus == 'In Hospital/Facility')
  cat(sum(is.na(i)), sum(i, na.rm=TRUE), ' ')
  prop[t] <- mean(d$status[i] == 'Home', na.rm=TRUE)
}
1 877  1 877  1 860  1 809  1 727  1 634  1 586  2 521  2 469  2 423  2 405  2 387  2 357  2 334  2 311  2 297  2 280  2 275  2 266  2 258  2 251  2 247  2 241  2 232  2 223  2 217  2 211  
prn(list(prop, pp), fi='/tmp/z')
plot(prop, pp, xlab='Sample Proportion Hosp -> Home',
       ylab='Covariate-Adjusted Model Estimate')
prn(0, fi='/tmp/z')
abline(a=0, b=1, col=gray(.85))

Simulate actual transitions for a large number of patients. Chained together this provides unconditional realizations of ordinal states.

# Sample 50,000 patients for each treatment and create outcomes for each day
N <- 50000
initial.state <- 'In Hospital/Facility'
tlev <- unique(d$treatment)
slev <- levels(d$status)
Y <- array(0L, dim=c(N, 27, 2), dimnames=list(NULL, paste('Day', 1 : 27), tlev))
if(alrsimdone) Y <- readRDS('alrsim.rds') else {
set.seed(1)
Sys.time()
for(i in 1 : N) {   # takes about 28m
  if(i %% 100 == 0) cat(i, file='/tmp/z')
  for(trt in tlev) {
    for(t in 1 : 27) {
      ps <- if(t == 1) initial.state else slev[Y[i, t - 1, trt]]
      if(ps == 'Dead') Y[i, t, trt] <- 1   # absorbing state
      else {
        # Get all 3 cumulative probabilities then 4 cell probabilities
        p1 <- pcond('Vent/ARDS',            ps, treatment=trt, day=t)
        p2 <- pcond('In Hospital/Facility', ps, treatment=trt, day=t)
        p3 <- pcond('Home',                 ps, treatment=trt, day=t)
        cellprobs <- c(1. - p1, p1 - p2, p2 - p3, p3)
        Y[i, t, trt] <- sample(1 : 4, 1, prob=cellprobs)
      }
    }
  }
}
Sys.time()
saveRDS(Y, 'alrsim.rds')
}

Plot some summaries of the simulated data.

# Compute proportion at home as a function of time and treatment
x <- apply(Y, 2:3, function(y) mean(y == 4))
phome <- list(Placebo    =list(Day=1:27, Proportion=x[, 'Placebo']),
             'Vitamin D'=list(Day=1:27, Proportion=x[, 'Vitamin D']))
labcurve(phome, pl=TRUE, ylab='Proportion at Home', col=c(1, 4))
apply(Y[, 27, ], 2, function(y) mean(y == 4))
  Placebo Vitamin D 
  0.82284   0.84240 
# Similarly for dead
x <- apply(Y, 2:3, function(y) mean(y == 1))
pdead <- list(Placebo    =list(Day=1:27, Proportion=x[, 'Placebo']),
             'Vitamin D'=list(Day=1:27, Proportion=x[, 'Vitamin D']))
labcurve(pdead, pl=TRUE, ylab='Proportion Dead', col=c(1, 4))
apply(Y[, 27, ], 2, function(y) mean(y == 1))
  Placebo Vitamin D 
  0.03748   0.03170 
# For comparison compute proportion of patients who died
with(subset(d, day==1), tapply(ddeath < 29, treatment, mean))
  Placebo Vitamin D 
0.1304348 0.1591241 
with(subset(dcf, day==27), tapply(status == 'Dead', treatment, mean, na.rm=TRUE))
  Placebo Vitamin D 
0.1308271 0.1576642 
# Compute treatment odds ratios as a function of time
odds <- function(x) { p <- mean(x, na.rm=TRUE); p / (1. - p)}

or2 <- apply(Y[,, 'Vitamin D'] >= 2, 2 , odds) / apply(Y[,, 'Placebo'] >= 2, 2, odds)
or3 <- apply(Y[,, 'Vitamin D'] >= 3, 2 , odds) / apply(Y[,, 'Placebo'] >= 3, 2, odds)
or4 <- apply(Y[,, 'Vitamin D'] >= 4, 2 , odds) / apply(Y[,, 'Placebo'] >= 4, 2, odds)

x <- list('Y>=2 (alive)' = list(1:27, or2), 'Y>=3 (hospital/home)' = list(1:27, or3),
          'Y=4 (home)' = list(1:27, or4))
labcurve(x, pl=TRUE, ylab='Odds Ratio Based on Cumulative Risk', xlab='Day', col=1:3)
abline(h=r <- exp(coef(fm)['treatment=Vitamin D']), col=gray(0.85))
text(27, r-0.02, 'Horizontal line is overall conditional\ntransition OR from unified PO model', adj=1)

# Compute median time to home, by treatment
mth <- function(y) if(all(y < 4)) Inf else min((1 : 27)[y == 4])
meda <- median(apply(Y[,, 'Placebo'],   1, mth))
medb <- median(apply(Y[,, 'Vitamin D'], 1, mth))
cat('Median days to home   Placebo:', meda, '  Vitamin D:', medb, '\n')
Median days to home   Placebo: 8   Vitamin D: 7 
# You can almost get this from the cumulative "at home" curves presented above
# Not exactly equal since those curves don't stop at first at-home point

Compute Unconditional Probability Analytically

Here we compute \(\Pr(Y_s \geq y | X, Y_0)\) analytically for a single value of \((y, Y_0, X, s)\), using this recursive formula where \(k\) is the number of outcome states and \(t\) is time:

\[\Pr(Y_{t} = y | X) = \sum_{j=1}^{k} \Pr(Y_{t} = y | X, Y_{t-1} = j) \Pr(Y_{t-1} = j | X)\]

# Note: This is made obsolete by the new Hmisc function soprobMarkovOrd
# Function to compute all cell probabilities P(Y(t) = y | Y(t-1) = pstatus), y=1, 2, .., k
pconds <- function(pstatus, treatment, day, ...) {
  k  <- length(alpha) + 1
  if(pstatus == 'Dead') return(c(1., rep(0., k - 1)))
  xb <- linpred(day=day, pstatus=pstatus, treatment=treatment, ...)
  cp <- plogis(alpha + xb)
  c(1., cp) - c(cp, 0.)
}

# Compute unconditional probabilities
# Output is a matrix with s rows corresponding to t=1, 2, ..., s
# Each row has columns corresponding to the probabilities of being in each of the states
uprob <- function(y0, s, treatment=treatment, ylevels=1 : k, ...) {
  k <- length(alpha) + 1
  P <- matrix(NA, nrow=s, ncol=k)
  colnames(P) <- as.character(ylevels)
  P[1, ] <- pconds(y0, treatment, day=1, ...)  # never uncondition on initial state
  # Matrix of conditional probabilities of Y conditioning on previous time Y
  # Columns = k conditional probabilities conditional on a single previous state
  # Rows    = all possible previous states
  cp <- matrix(NA, nrow=k, ncol=k, 
               dimnames=list(paste('t-1', ylevels), paste('t', ylevels)))
  for(t in 2 : s) {
    for(j in 1 : k) cp[j, ] <- pconds(ylevels[j], treatment=treatment, day=t, ...)
    # Compute unconditional probabilities of being in all possible states at current time t
    P[t, ] <- t(cp) %*% P[t - 1, ]
  }
  P
}

r <- repcov
U <- Uhome <- list()
for(trt in tlev) {   # separately for Placebo, Vitamin D
  u <- uprob('In Hospital/Facility', 27, treatment=trt, ylevels=slev,
             age=r$age, lips=r$lips, charlson=r$charlson, sofa=r$sofa)
  cat('\n', trt, '\n\n')
  print(round(u, 3))
  U[[trt]]     <- u
  Uhome[[trt]] <- list(day=1 : 27, p=u[, 4])
}

 Placebo 

       Dead Vent/ARDS In Hospital/Facility  Home
 [1,] 0.000     0.012                0.912 0.075
 [2,] 0.000     0.020                0.829 0.150
 [3,] 0.001     0.025                0.750 0.224
 [4,] 0.002     0.027                0.674 0.297
 [5,] 0.003     0.028                0.602 0.366
 [6,] 0.004     0.029                0.535 0.432
 [7,] 0.005     0.028                0.474 0.492
 [8,] 0.007     0.028                0.420 0.546
 [9,] 0.008     0.027                0.373 0.592
[10,] 0.009     0.026                0.334 0.631
[11,] 0.011     0.025                0.301 0.663
[12,] 0.012     0.024                0.274 0.689
[13,] 0.014     0.024                0.252 0.710
[14,] 0.015     0.024                0.234 0.727
[15,] 0.017     0.024                0.219 0.741
[16,] 0.018     0.024                0.206 0.752
[17,] 0.020     0.024                0.194 0.761
[18,] 0.022     0.025                0.184 0.769
[19,] 0.023     0.025                0.175 0.777
[20,] 0.025     0.025                0.166 0.783
[21,] 0.027     0.025                0.158 0.789
[22,] 0.029     0.025                0.151 0.795
[23,] 0.031     0.025                0.144 0.800
[24,] 0.033     0.024                0.138 0.805
[25,] 0.034     0.024                0.131 0.811
[26,] 0.036     0.023                0.125 0.816
[27,] 0.038     0.022                0.120 0.821

 Vitamin D 

       Dead Vent/ARDS In Hospital/Facility  Home
 [1,] 0.000     0.012                0.908 0.080
 [2,] 0.000     0.019                0.821 0.160
 [3,] 0.001     0.023                0.738 0.238
 [4,] 0.002     0.025                0.659 0.314
 [5,] 0.003     0.026                0.585 0.386
 [6,] 0.004     0.026                0.516 0.454
 [7,] 0.005     0.025                0.454 0.516
 [8,] 0.006     0.025                0.399 0.571
 [9,] 0.007     0.024                0.352 0.618
[10,] 0.008     0.023                0.313 0.657
[11,] 0.009     0.022                0.280 0.689
[12,] 0.010     0.021                0.254 0.714
[13,] 0.012     0.021                0.232 0.735
[14,] 0.013     0.021                0.215 0.752
[15,] 0.014     0.021                0.200 0.765
[16,] 0.015     0.021                0.188 0.776
[17,] 0.017     0.021                0.177 0.785
[18,] 0.018     0.021                0.167 0.793
[19,] 0.019     0.022                0.159 0.800
[20,] 0.021     0.022                0.151 0.806
[21,] 0.022     0.022                0.144 0.812
[22,] 0.024     0.022                0.137 0.818
[23,] 0.025     0.021                0.130 0.823
[24,] 0.027     0.021                0.124 0.828
[25,] 0.028     0.020                0.119 0.833
[26,] 0.030     0.019                0.113 0.838
[27,] 0.031     0.018                0.108 0.843

From the two matrices of unconditional probabilities one can see that the median time to home for placebo is 8 days and is 7 days for Vitamin D. Plot the unconditional probabilities of going home at day 1, 2, …, 27. Simulated probabilities from before are virtually identical to these estimates.

labcurve(Uhome, pl=TRUE, xlab='Day', ylab='Proportion at Home', col=c(1,4))
# labcurve(phome, add=TRUE)  # simulated values are fully superimposed on exact values

Compare the 27d mortality estimates to the raw data using the same initial state patients.

dcf[bstatus == 'In Hospital' & ! is.na(bstatus) & day == 27,
    mean(status == 'Dead', na.rm=TRUE), by=treatment]
   treatment         V1
1:   Placebo 0.09133489
2: Vitamin D 0.12472160
# These proportions don't match.  Let's use binary logistic regression to estimate the 
# probability of death at the covariate settings used above
dd <- datadist(dcf); options(datadist='dd')
f <- lrm(status == 'Dead' ~ treatment + rcs(age, 3) + lips +
           charlson + sofa, data=subset(dcf, day == 27 & bstatus=='In Hospital'))
r <- repcov
Predict(f, treatment, age=r$age, lips=r$lips,
        charlson=r$charlson, sofa=r$sofa, fun=plogis)
  treatment age lips charlson sofa       yhat      lower      upper
1   Placebo  55    7        3    5 0.05645739 0.03495262 0.08995972
2 Vitamin D  55    7        3    5 0.07742578 0.05055463 0.11682228

Response variable (y):  

Limits are 0.95 confidence limits

Use the exact calculations to get odds ratios and compare to the simulated odds ratios.

odds <- function(x) x / (1. - x)
or2 <- odds(rowSums(U$'Vitamin D'[, 2:4])) / odds(rowSums(U$Placebo[, 2:4]))
or3 <- odds(rowSums(U$'Vitamin D'[, 3:4])) / odds(rowSums(U$Placebo[, 3:4]))
or4 <- odds(U$'Vitamin D'[, 4]) / odds(U$Placebo[, 4])

x <- list('Y>=2 (alive)' = list(1:27, or2), 'Y>=3 (hospital/home)' = list(1:27, or3),
          'Y=4 (home)' = list(1:27, or4))
labcurve(x, pl=TRUE, ylab='Odds Ratio Based on Cumulative Risk', xlab='Day', col=1:3, ylim=c(1, 1.25))
abline(h=r <- exp(coef(fm)['treatment=Vitamin D']), col=gray(0.85))
text(27, r-0.02, 'Horizontal line is conditional\ntransition OR from unified PO model', adj=1)

The simulated ORs were unstable, especially for being alive, since the number of deaths is small.

Bayesian Markov Model

We parallel the regular proportional odds model with a Bayesian Markov proportional hazards model.

bmarkov <- blrm(status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 3) +
                lips + charlson + sofa,
                data=d, subset=day < 28, refresh=100, file='bmarkov.rds')
htmlVerbatim(stanDx(bmarkov))

Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved

For each parameter, n_eff is a crude measure of effective sample size and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1)

                                      n_eff  Rhat
 y>=Vent/ARDS                          2760 1.001
 y>=In Hospital/Facility               2475 1.000
 y>=Home                               2615 1.001
 day                                   2967 1.000
 day'                                  3604 1.001
 day''                                 3245 1.000
 pstatus=In Hospital/Facility          1864 1.001
 pstatus=Home                          1942 1.001
 treatment=Vitamin D                   7792 1.000
 age                                   7440 0.999
 age'                                  6085 0.999
 lips                                  5704 0.999
 charlson                              7246 0.999
 sofa                                  5856 1.000
 day * pstatus=In Hospital/Facility    1957 1.000
 day' * pstatus=In Hospital/Facility   2434 1.001
 day'' * pstatus=In Hospital/Facility  3064 1.000
 day * pstatus=Home                    1659 1.000
 day' * pstatus=Home                   1712 1.001
 day'' * pstatus=Home                  2285 1.000
 
stanDxplot(bmarkov)
bmarkov

Bayesian Proportional Odds Ordinal Logistic Model

Dirichlet Priors With Concentration Parameter 0.455 for Intercepts

 blrm(formula = status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 
     3) + lips + charlson + sofa, data = d, subset = day < 28, 
     refresh = 100, file = "bmarkov.rds")
 

Frequencies of Responses

                 Dead            Vent/ARDS In Hospital/Facility 
                  188                 3169                10643 
                 Home 
                18088 
 
Frequencies of Missing Values Due to Each Variable
    status       day   pstatus treatment       age      lips  charlson      sofa 
        48         0        47         0         0         0      1010         0 
 
Mixed Calibration/
Discrimination Indexes
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 32088 B 0.006 [0.006, 0.006] g 5.828 [5.651, 6.022] C 0.983 [0.983, 0.983]
Draws 4000 gp 0.01 [0.009, 0.012] Dxy 0.966 [0.965, 0.966]
Chains 4 EV 0.058 [0.05, 0.067]
p 17 v 29.995 [28.242, 31.731]
vp 0 [0, 0]
Mean β Median β S.E. Lower Upper Pr(β>0) Symmetry
y≥Vent/ARDS   4.3537   4.3480  0.2124   3.9280   4.7682  1.0000  1.04
y≥In Hospital/Facility  -0.7574  -0.7596  0.1927  -1.1530  -0.3985  0.0000  1.00
y≥Home  -7.5974  -7.5987  0.2094  -8.0231  -7.1904  0.0000  1.01
day  -0.0984  -0.0982  0.0238  -0.1462  -0.0530  0.0000  0.99
day’   0.1113   0.1089  0.1029  -0.0975   0.3105  0.8672  1.01
day’’  -0.1662  -0.1589  0.2692  -0.7085   0.3535  0.2590  1.02
pstatus=In Hospital/Facility   5.5589   5.5593  0.1573   5.2600   5.8712  1.0000  1.00
pstatus=Home  16.0457  15.8795  1.9105  12.5905  19.8092  1.0000  1.26
treatment=Vitamin D   0.0684   0.0682  0.0519  -0.0337   0.1731  0.9112  1.04
age   0.0005   0.0005  0.0037  -0.0068   0.0078  0.5612  0.98
age’  -0.0052  -0.0052  0.0035  -0.0117   0.0022  0.0690  1.01
lips  -0.0241  -0.0240  0.0085  -0.0421  -0.0087  0.0018  0.97
charlson  -0.0640  -0.0640  0.0112  -0.0866  -0.0430  0.0000  1.01
sofa  -0.0379  -0.0379  0.0077  -0.0526  -0.0218  0.0000  0.99
day × pstatus=In Hospital/Facility   0.1909   0.1905  0.0297   0.1358   0.2514  1.0000  1.08
day’ × pstatus=In Hospital/Facility  -0.6157  -0.6128  0.1263  -0.8837  -0.3799  0.0000  0.98
day’’ × pstatus=In Hospital/Facility   1.3751   1.3711  0.3302   0.7608   2.0606  1.0000  1.00
day × pstatus=Home  -0.1314  -0.1189  0.2581  -0.6331   0.3693  0.3212  0.83
day’ × pstatus=Home   0.4429   0.4277  0.6599  -0.8189   1.7658  0.7465  1.07
day’’ × pstatus=Home  -0.9955  -0.9761  1.4228  -3.8657   1.7362  0.2462  0.95

Computing Environment

 R version 4.0.4 (2021-02-15)
 Platform: x86_64-pc-linux-gnu (64-bit)
 Running under: Pop!_OS 20.10
 
 Matrix products: default
 BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
 
 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base     
 
 other attached packages:
  [1] data.table_1.13.2    rstan_2.21.2         StanHeaders_2.21.0-6
  [4] rmsb_0.0.2           rms_6.1-2            SparseM_1.78        
  [7] Hmisc_4.5-0          ggplot2_3.3.2        Formula_1.2-4       
 [10] survival_3.2-7       lattice_0.20-41     
 
To cite R in publication use:

R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.