Longitudinal Ordinal Analysis for Violet2
Introduction
VIOLET 2 was a randomized clinical trial of seriously ill adults in ICUs to study the effectiveness of vitamin D vs. placebo. It has the advantage of having daily ordinal outcomes assessed for 28 days. The original NEJM paper focused on 1078 patients with confirmed baseline vitamin D deficiency. The analyses presented here use 1352 of the original 1360 randomized patients. Of the 1078 patients, 72 of 539 placebo patients died by day 28, and 92 of 531 vitamin D patients died (proportions of 0.13 and 0.17). Our starting dataset has 1359 patients, with death counts of 89 and 112 (proportions 0.13 and 0.16).
See this and related analyses of the ORCHID study and some additional analysis of VIOLET 2. See this for a study of modeling irregular time spacings with Markov models. Go here for a list of papers about Markov longitudinal modeling of binary and ordinal outcomes.
require(rmsb)
knitrSet(lang='markdown', w=7, h=7, dev='png', fig.path='png/stat-')
<- TRUE
alrsimdone Load(violet2)
<- violet2
d <- c('Home', 'Dead', 'In Post Hospital facility')
omit <- with(d, sum(day == 1 & bstatus %in% omit))
k <- subset(d, bstatus %nin% omit)
d <- d$status
oldstatus levels(d$status) <-
list(Dead='Dead',
'Vent/ARDS' = c('On Vent', 'ARDS'),
'In Hospital/Facility' = c('In Hospital',
'In Post Hospital Facility'),
Home = 'Home')
table(oldstatus, d$status)
##
## oldstatus Dead Vent/ARDS In Hospital/Facility Home
## Dead 3749 0 0 0
## ARDS 0 377 0 0
## On Vent 0 2966 0 0
## In Hospital 0 0 8191 0
## In Post Hospital Facility 0 0 2937 0
## Home 0 0 0 19586
$bstatus <- droplevels(d$bstatus)
dlabel(d$day) <- 'Day'
label(d$bstatus) <- 'Baseline Status'
<- d
dcf saveRDS(dcf, 'dcf.rds')
<- datadist(d); options(datadist='dd')
dd options(prType='html')
Descriptive Statistics
7 patients with day 0 status of dead, at home, or in post hospital facility were dropped.
html(describe(d))
12 Variables 37856 Observations
id: Reference Id
n | missing | distinct |
---|---|---|
37856 | 0 | 1352 |
lowest : | A01-00208 | A01-00210 | A01-00261 | A01-00337 | A01-00354 |
highest: | W05-00581 | W05-00589 | W05-00590 | W05-00591 | W05-00592 |
bstatus: Baseline Status
n | missing | distinct |
---|---|---|
37828 | 28 | 3 |
Value ARDS On Vent In Hospital Frequency 2828 10444 24556 Proportion 0.075 0.276 0.649
day: Day
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
37856 | 0 | 28 | 0.999 | 14.5 | 9.322 | 2.00 | 3.00 | 7.75 | 14.50 | 21.25 | 26.00 | 27.00 |
status
n | missing | distinct |
---|---|---|
37806 | 50 | 4 |
Value Dead Vent/ARDS In Hospital/Facility Frequency 3749 3343 11128 Proportion 0.099 0.088 0.294 Value Home Frequency 19586 Proportion 0.518
age: SCR age enrolled
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
37856 | 0 | 78 | 1 | 56.31 | 18.14 | 27.00 | 33.00 | 46.00 | 58.00 | 67.25 | 76.00 | 82.00 |
treatment
n | missing | distinct |
---|---|---|
37856 | 0 | 2 |
Value Placebo Vitamin D Frequency 18676 19180 Proportion 0.493 0.507
lips: Baseline LIPS Score
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
37856 | 0 | 28 | 0.996 | 5.175 | 3.399 | 0 | 0 | 3 | 5 | 7 | 9 | 10 |
charlson: Baseline Charlson Score
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
36680 | 1176 | 17 | 0.987 | 3.824 | 3.242 | 0 | 0 | 2 | 3 | 5 | 8 | 9 |
Value 0 1 2 3 4 5 6 7 8 9 10 11 Frequency 4816 4312 4648 4704 4760 4312 2800 2072 1708 896 644 280 Proportion 0.131 0.118 0.127 0.128 0.130 0.118 0.076 0.056 0.047 0.024 0.018 0.008 Value 12 13 14 15 18 Frequency 392 168 112 28 28 Proportion 0.011 0.005 0.003 0.001 0.001
sofa: SOFA Total Score
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
37856 | 0 | 19 | 0.993 | 5.382 | 4.028 | 0 | 1 | 3 | 5 | 8 | 10 | 12 |
Value 0 1 2 3 4 5 6 7 8 9 10 11 Frequency 2436 3444 3136 3808 4424 3752 3780 2968 2492 2128 1820 1540 Proportion 0.064 0.091 0.083 0.101 0.117 0.099 0.100 0.078 0.066 0.056 0.048 0.041 Value 12 13 14 15 16 17 18 Frequency 896 476 308 168 140 84 56 Proportion 0.024 0.013 0.008 0.004 0.004 0.002 0.001
vitD: LCMS Day 0
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
36400 | 1456 | 554 | 1 | 13.74 | 8.279 | 3.898 | 5.163 | 8.047 | 12.600 | 17.800 | 23.310 | 27.300 |
ddeath: Day of Death (29 if alive)
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
37856 | 0 | 28 | 0.375 | 26.23 | 4.918 | 5 | 13 | 29 | 29 | 29 | 29 | 29 |
dhome: Day Discharged to Home (NA if never)
n | missing | distinct | Info | Mean | Gmd | .05 | .10 | .25 | .50 | .75 | .90 | .95 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
26796 | 11060 | 28 | 0.994 | 8.199 | 6.851 | 2 | 2 | 4 | 6 | 11 | 18 | 25 |
Summarize deaths occurring after a patient arrives at home (there were no other events).
<- subset(d, ! is.na(dhome) & day >= dhome)
u <- length(unique(u$id))
np <- subset(u, status == 'Dead')
ud <- length(unique(ud$id))
nd cat(np, 'patients discharged to home\n',
'died post discharge\n') nd,
957 patients discharged to home
29 died post discharge
Event Chart for Random Sample of Patients
A status timeline is shown for each of 65 randomly chosen patients. The patients are sorted so that the worst outcomes at the last follow-up day are at the top. One day is subtract from the time variable to make follow-up start at day 0.
<- sample(unique(d$id), 65, FALSE)
ssamp <- subset(d, id %in% ssamp)
dr <- subset(dr, day <= ddeath)
dr $id <- as.integer(as.factor(dr$id))
dr$status <- factor(dr$status, levels=rev(levels(dr$status)))
dr$day <- dr$day - 1
drmultEventChart(status ~ day + id, data=dr,
absorb='Dead', sortbylast = TRUE) +
theme_classic() +
theme(legend.position='bottom')
Summary of Successive State Transitions
propsTrans(status ~ day + id, data=d, maxsize=6) +
theme(legend.position='bottom', axis.text.x=element_text(angle=90, hjust=1))
# For a nice event chart for this type of data see https://livefreeordichotomize.com/2020/05/21/survival-model-detective-1
Category Proportions Over Time and PO Assumption
The following plot shows how the mix of outcomes changes over time stratified also by treatment.
propsPO(status ~ treatment + day, data=subset(d, day < 15), nrow=1) +
theme(legend.position='bottom', axis.text.x=element_text(angle=90, hjust=1))
propsPO(status ~ treatment + day, data=subset(d, day > 14), nrow=1) +
theme(legend.position='bottom', axis.text.x=element_text(angle=90, hjust=1))
Ignoring treatment, compare the observed patterns with what would be expected had the proportional odds assumption been satisfied, fit a PO model containing only day
as a predictor, and allow the day
effect to be a flexible spline function. Then compute odds ratios against day 1 and apply those odds ratios to the observed proportions from day 1. This is for the original dataset with death carried forward.
<- lrm(status ~ rcs(day, 5), data=d)
f <- Function(f) # derive predicted log odds as a function of day
g # Make another function that gets the OR vs. day=1
<- function(day) exp(g(day) - g(1))
dor propsPO(status ~ day, odds.ratio=dor, data=d) +
theme(legend.position='bottom')
Carrying death forward results in a devastating violation of the PO assumption with respect to time. For example, the incidence of death keeps increasing while the incidence of vent/ARDS is decreasing. Now we repeat the process but terminating follow-up at death.
<- subset(d, day <= ddeath)
u <- lrm(status ~ rcs(day, 5), data=u)
f <- Function(f) # derive predicted log odds as a function of day
g <- function(day) exp(g(day) - g(1))
dor propsPO(status ~ day, odds.ratio=dor, data=u) +
theme(legend.position='bottom')
The non-PO with respect to time is less problematic when follow-up stops at death but continues for nonfatal events. Note that in the limiting case where death is the only event and the time intervals are small, such truncation results in a binary logistic odds ratio for treatment being almost exactly the hazard ratio in a Cox model analysis when interval risks are small. [However the absolute risk estimates from such a model as a function of time are hard to interpret.].
Note: The lower panel of the above graph is dependent on the choice of reference day. The proportions were estimated from day 1 alone. Instead let’s estimate them from day 4 and repeat.
<- function(day) exp(g(day) - g(4))
dor propsPO(status ~ day, odds.ratio=dor, ref=4, data=u) +
theme(legend.position='bottom')
Now proportional odds holds with respect to time.
Now do a similar analysis for the second most important predictor: baseline status, still truncating at death. ARDS is taken as the reference group for computing proportions in the follow-up status categories for the purpose of applying the PO assumption.
<- lrm(status ~ bstatus, data=u)
f <- Function(f) # derive predicted log odds as a function of bstatus
g <- function(bstatus) exp(g(bstatus) - g('ARDS'))
dor propsPO(status ~ bstatus, odds.ratio=dor, ref='ARDS', data=u)
Proportional odds is reasonably well satisfied with regard to baseline status except that for patients in hospital at baseline, without ARDS or vent, there is an overprediction of the proportion in the vent/ARDS category. Because these summaries are confounded with time, let’s repeat for day 3 only.
<- subset(u, day == 3)
u3 <- lrm(status ~ bstatus, data=u3)
f <- Function(f) # derive predicted log odds as a function of bstatus
g <- function(bstatus) exp(g(bstatus) - g('ARDS'))
dor propsPO(status ~ bstatus, odds.ratio=dor, ref='ARDS', data=u3)
See here for assessment of PO for other variables.
GEE-Type Proportional Odds Modeling
This analysis uses a working independence proportional odds model, then the variance-covariance matrix is corrected for within-patient correlation using the robust cluster sandwhich covariance estimator. Covariates are allowed to have nonlinear effects through the use of restricted cubic spline functions with 4 knots. A dot charts shows the relative importance of the predictors. Note that the lrm
function in the R rms
package frames the model in terms of \(P(Y \geq y | X)\) whereas most software uses \(P(Y \leq y | X)\).
There are several patients excluded in the analysis below due to missing baseline vitamin D levels.
<- lrm(status ~ bstatus + rcs(day, 5) + rcs(age, 5) + rcs(lips, 5) +
f rcs(charlson, 4) + rcs(sofa, 5) + rcs(vitD, 5) + treatment,
data=d, tol=1e-13, x=TRUE, y=TRUE)
<- robcov(f, d$id)
g plot(anova(g), trans=sqrt)
We reallocate degrees of freedom for the terms in the model so that the predictors with the most potential from the dot chart are given more parameters. This is because complex relationships that are week will not suffer from underfitting. The likelihood ratio and score \(\chi^2\) statistics in the output below should be ignored because they are not adjusted for intra-cluster correlation.
<- lrm(status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) +
f rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 3) + treatment,
data=d, x=TRUE, y=TRUE)
<- robcov(f, d$id)
g g
Logistic Regression Model
lrm(formula = status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 3) + treatment, data = d, x = TRUE, y = TRUE)
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 3423 3043 10467 Home 18353
Frequencies of Missing Values Due to Each Variable
Model Likelihood Ratio Test |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 35286 | LR χ2 9851.12 | R2 0.271 | C 0.749 |
Cluster on d$id
|
d.f. 19 | g 1.208 | Dxy 0.498 |
Clusters 1261 | Pr(>χ2) <0.0001 | gr 3.348 | γ 0.498 |
max |∂log L/∂β| 1×10-10 | gp 0.158 | τa 0.311 | |
Brier 0.122 |
β | S.E. | Wald Z | Pr(>|Z|) | |
---|---|---|---|---|
y≥Vent/ARDS | 0.5629 | 0.4329 | 1.30 | 0.1935 |
y≥In Hospital/Facility | -0.3008 | 0.4376 | -0.69 | 0.4918 |
y≥Home | -2.1195 | 0.4404 | -4.81 | <0.0001 |
bstatus=On Vent | 0.4439 | 0.2116 | 2.10 | 0.0359 |
bstatus=In Hospital | 1.4338 | 0.2179 | 6.58 | <0.0001 |
day | 0.1833 | 0.0100 | 18.39 | <0.0001 |
day’ | -0.3621 | 0.0446 | -8.12 | <0.0001 |
day’’ | 0.5972 | 0.1172 | 5.10 | <0.0001 |
day’’’ | -0.1138 | 0.1323 | -0.86 | 0.3897 |
age | 0.0013 | 0.0081 | 0.16 | 0.8703 |
age’ | -0.0153 | 0.0075 | -2.04 | 0.0414 |
lips | -0.0397 | 0.0456 | -0.87 | 0.3837 |
lips’ | -0.0151 | 0.0469 | -0.32 | 0.7483 |
charlson | -0.1797 | 0.1106 | -1.63 | 0.1041 |
charlson’ | 0.3888 | 0.5018 | 0.77 | 0.4384 |
charlson’’ | -0.7933 | 0.9951 | -0.80 | 0.4253 |
sofa | 0.0212 | 0.0756 | 0.28 | 0.7791 |
sofa’ | -0.1943 | 0.2564 | -0.76 | 0.4485 |
sofa’’ | 0.4016 | 1.0363 | 0.39 | 0.6984 |
vitD | 0.0595 | 0.0175 | 3.41 | 0.0007 |
vitD’ | -0.0584 | 0.0201 | -2.91 | 0.0037 |
treatment=Vitamin D | 0.1169 | 0.0963 | 1.21 | 0.2249 |
anova(g)
Wald Statistics for status
|
|||
χ2 | d.f. | P | |
---|---|---|---|
bstatus | 79.00 | 2 | <0.0001 |
day | 482.46 | 4 | <0.0001 |
Nonlinear | 282.09 | 3 | <0.0001 |
age | 12.77 | 2 | 0.0017 |
Nonlinear | 4.16 | 1 | 0.0414 |
lips | 9.91 | 2 | 0.0071 |
Nonlinear | 0.10 | 1 | 0.7483 |
charlson | 26.63 | 3 | <0.0001 |
Nonlinear | 0.66 | 2 | 0.7175 |
sofa | 27.70 | 3 | <0.0001 |
Nonlinear | 4.75 | 2 | 0.0932 |
vitD | 12.27 | 2 | 0.0022 |
Nonlinear | 8.44 | 1 | 0.0037 |
treatment | 1.47 | 1 | 0.2249 |
TOTAL NONLINEAR | 293.51 | 10 | <0.0001 |
TOTAL | 786.13 | 19 | <0.0001 |
Partial effect plots below show the relationship between each predictor and log odds, holding other predictors to median or mode.
ggplot(Predict(g))
A nomogram representation of the model is below. The nomogram can be enhanced to show the estimated probability or outcome better than any given level.
<- nomogram(g, vnames='names')
n plot(n)
Fit With Only Linear Effects and Only for Day 14
<- subset(d, day == 14)
d14 <- lrm(status ~ bstatus + age + lips +
f + sofa + treatment, data=d14, x=TRUE, y=TRUE)
charlson f
Logistic Regression Model
lrm(formula = status ~ bstatus + age + lips + charlson + sofa + treatment, data = d14, x = TRUE, y = TRUE)
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 142 77 303 Home 786Frequencies of Missing Values Due to Each Variable
status bstatus age lips charlson sofa treatment 2 1 0 0 42 0 0
Model Likelihood Ratio Test |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 1308 | LR χ2 252.94 | R2 0.200 | C 0.712 |
max |∂log L/∂β| 9×10-7 | d.f. 7 | g 1.050 | Dxy 0.424 |
Pr(>χ2) <0.0001 | gr 2.858 | γ 0.424 | |
gp 0.139 | τa 0.242 | ||
Brier 0.116 |
β | S.E. | Wald Z | Pr(>|Z|) | |
---|---|---|---|---|
y≥Vent/ARDS | 3.1451 | 0.3644 | 8.63 | <0.0001 |
y≥In Hospital/Facility | 2.5659 | 0.3605 | 7.12 | <0.0001 |
y≥Home | 1.1570 | 0.3551 | 3.26 | 0.0011 |
bstatus=On Vent | 0.4889 | 0.2205 | 2.22 | 0.0266 |
bstatus=In Hospital | 1.3809 | 0.2197 | 6.28 | <0.0001 |
age | -0.0115 | 0.0045 | -2.57 | 0.0103 |
lips | -0.0487 | 0.0196 | -2.49 | 0.0129 |
charlson | -0.1248 | 0.0241 | -5.19 | <0.0001 |
sofa | -0.0947 | 0.0180 | -5.28 | <0.0001 |
treatment=Vitamin D | 0.2483 | 0.1156 | 2.15 | 0.0317 |
anova(f)
Wald Statistics for status
|
|||
χ2 | d.f. | P | |
---|---|---|---|
bstatus | 65.58 | 2 | <0.0001 |
age | 6.59 | 1 | 0.0103 |
lips | 6.19 | 1 | 0.0129 |
charlson | 26.89 | 1 | <0.0001 |
sofa | 27.85 | 1 | <0.0001 |
treatment | 4.61 | 1 | 0.0317 |
TOTAL | 225.54 | 7 | <0.0001 |
Evaluation of Baseline Measurement \(\times\) Time Interactions
In the following, allow time (simplified to 3 knots in the spline) to interact with each of the baseline variables, using fewer knots than before because of the much larger number of parameters being fitted.
<- lrm(status ~ rcs(day, 3) * (bstatus + rcs(age, 3) + rcs(lips, 3) +
f rcs(charlson, 3) + rcs(sofa, 3) + treatment),
data=d, tol=1e-13, x=TRUE, y=TRUE)
<- robcov(f, d$id)
g anova(g)
Wald Statistics for status
|
|||
χ2 | d.f. | P | |
---|---|---|---|
day (Factor+Higher Order Factors) | 560.38 | 24 | <0.0001 |
All Interactions | 124.64 | 22 | <0.0001 |
Nonlinear (Factor+Higher Order Factors) | 295.79 | 12 | <0.0001 |
bstatus (Factor+Higher Order Factors) | 238.83 | 6 | <0.0001 |
All Interactions | 4.41 | 4 | 0.3529 |
age (Factor+Higher Order Factors) | 11.07 | 6 | 0.0861 |
All Interactions | 8.80 | 4 | 0.0663 |
Nonlinear (Factor+Higher Order Factors) | 4.21 | 3 | 0.2400 |
lips (Factor+Higher Order Factors) | 22.96 | 6 | 0.0008 |
All Interactions | 2.65 | 4 | 0.6174 |
Nonlinear (Factor+Higher Order Factors) | 1.40 | 3 | 0.7059 |
charlson (Factor+Higher Order Factors) | 35.46 | 6 | <0.0001 |
All Interactions | 25.45 | 4 | <0.0001 |
Nonlinear (Factor+Higher Order Factors) | 4.35 | 3 | 0.2265 |
sofa (Factor+Higher Order Factors) | 53.37 | 6 | <0.0001 |
All Interactions | 16.99 | 4 | 0.0019 |
Nonlinear (Factor+Higher Order Factors) | 7.09 | 3 | 0.0692 |
treatment (Factor+Higher Order Factors) | 5.64 | 3 | 0.1303 |
All Interactions | 4.60 | 2 | 0.1001 |
day × bstatus (Factor+Higher Order Factors) | 4.41 | 4 | 0.3529 |
Nonlinear | 1.71 | 2 | 0.4261 |
Nonlinear Interaction : f(A,B) vs. AB | 1.71 | 2 | 0.4261 |
day × age (Factor+Higher Order Factors) | 8.80 | 4 | 0.0663 |
Nonlinear | 7.32 | 3 | 0.0624 |
Nonlinear Interaction : f(A,B) vs. AB | 7.32 | 3 | 0.0624 |
f(A,B) vs. Af(B) + Bg(A) | 3.49 | 1 | 0.0618 |
Nonlinear Interaction in day vs. Af(B) | 6.75 | 2 | 0.0343 |
Nonlinear Interaction in age vs. Bg(A) | 4.12 | 2 | 0.1277 |
day × lips (Factor+Higher Order Factors) | 2.65 | 4 | 0.6174 |
Nonlinear | 1.72 | 3 | 0.6329 |
Nonlinear Interaction : f(A,B) vs. AB | 1.72 | 3 | 0.6329 |
f(A,B) vs. Af(B) + Bg(A) | 0.86 | 1 | 0.3528 |
Nonlinear Interaction in day vs. Af(B) | 1.27 | 2 | 0.5309 |
Nonlinear Interaction in lips vs. Bg(A) | 1.30 | 2 | 0.5230 |
day × charlson (Factor+Higher Order Factors) | 25.45 | 4 | <0.0001 |
Nonlinear | 5.33 | 3 | 0.1495 |
Nonlinear Interaction : f(A,B) vs. AB | 5.33 | 3 | 0.1495 |
f(A,B) vs. Af(B) + Bg(A) | 0.49 | 1 | 0.4833 |
Nonlinear Interaction in day vs. Af(B) | 5.18 | 2 | 0.0752 |
Nonlinear Interaction in charlson vs. Bg(A) | 0.70 | 2 | 0.7042 |
day × sofa (Factor+Higher Order Factors) | 16.99 | 4 | 0.0019 |
Nonlinear | 14.80 | 3 | 0.0020 |
Nonlinear Interaction : f(A,B) vs. AB | 14.80 | 3 | 0.0020 |
f(A,B) vs. Af(B) + Bg(A) | 1.38 | 1 | 0.2400 |
Nonlinear Interaction in day vs. Af(B) | 12.02 | 2 | 0.0025 |
Nonlinear Interaction in sofa vs. Bg(A) | 6.19 | 2 | 0.0453 |
day × treatment (Factor+Higher Order Factors) | 4.60 | 2 | 0.1001 |
Nonlinear | 4.01 | 1 | 0.0451 |
Nonlinear Interaction : f(A,B) vs. AB | 4.01 | 1 | 0.0451 |
TOTAL NONLINEAR | 305.41 | 20 | <0.0001 |
TOTAL INTERACTION | 124.64 | 22 | <0.0001 |
TOTAL NONLINEAR + INTERACTION | 366.25 | 27 | <0.0001 |
TOTAL | 927.04 | 35 | <0.0001 |
From the ANOVA above you can see strong evidence for some form of interactions. Interactions between time and baseline SOFA score and time and the Charlson index stand out. The interactions indicate that the effect of these two baseline variables dissipates at later times. Let’s estimate the baseline covariate effects for days 1, 8, 15, 22 from the interaction model.
<- c(1, 8, 15, 22)
days ggplot(Predict(g, bstatus, day=days))
ggplot(Predict(g, age, day=days))
ggplot(Predict(g, lips, day=days))
ggplot(Predict(g, charlson, day=days))
ggplot(Predict(g, sofa, day=days))
ggplot(Predict(g, treatment, day=days))
ggplot(Predict(g, day, treatment))
Checking Proportional Odds Assumption
Before getting into a more detailed analysis concerning the proportional odds assumption involving multiple predictors, let’s take a different look at time alone (the approach described earlier is more intuitive). For each day, compute all cumulative proportions, take their logit, and see how the spacing between successive logit cumulative proportions varies across days. Plotting symbols are defined in the code. This may be easier to see in a table of simple cell frequencies that is also shown.
$yn <- as.integer(d$status) # numeric Y for easy dichotomization
dtable(d$status, d$yn)
1 2 3 4
Dead 3749 0 0 0
Vent/ARDS 0 3343 0 0
In Hospital/Facility 0 0 11128 0
Home 0 0 0 19586
<- function(y) {
ql <- min(0.9999, max(0.0001, mean(y, na.rm=TRUE)))
p qlogis(p)
}<- function(y) c(
sf 'y>=2'=ql(y >= 2), # box (alive)
'y>=3'=ql(y >= 3), # circle (not dead or Vent/ARDS)
'y>=4'=ql(y == 4)) # triangle (discharged)
<- summary(yn ~ day, fun=sf, continuous=99, data=d)
s plot(s, which=1:3, pch=0:4, xlab='logit Cumulative Prob', main='')
# g <- function(y)
# apply(y, 2, function(u) qlogis(mean(u)))
# s <- summary(cumcategory(yn) ~ day, continuous=99, data=d)
# plot(s, which=2:3, pch=1:5, xlab='logit Cumulative Prob.', main='')
with(d, print(table(day, status)))
status
day Dead Vent/ARDS In Hospital/Facility Home
1 16 421 877 37
2 31 319 860 141
3 49 258 809 235
4 63 226 727 335
5 76 202 634 439
6 84 169 586 512
7 96 153 521 580
8 107 130 469 644
9 113 123 423 691
10 121 115 405 709
11 125 104 387 734
12 135 93 357 765
13 140 87 334 789
14 147 82 311 810
15 151 79 297 823
16 157 78 280 835
17 162 72 275 841
18 166 69 266 849
19 167 67 258 858
20 170 66 251 863
21 170 66 247 867
22 173 63 241 873
23 178 62 232 878
24 182 62 223 883
25 186 61 217 886
26 193 59 211 887
27 195 57 204 894
28 196 0 226 928
In the following we ignore intra-patient correlation and use an all-linear (except for time) PO model for purposes of assessing the PO assumption on all covariates including time. The assessment is done by comparing \(\beta\)s with those from the PO model overall fit and those from all dichotomizations of Y. The analysis is done twice. First all patient records are considered. Then days after the day of death are removed. We pay more attention to the more important variables: day, baseline status, SOFA, and Charlson (in descending order of outcome variation explained). The day
effect is extremely nonlinear in the PO model, and day
is modeled as a quadratic effect here so it has two \(\beta\)s.
for(inclall in c(TRUE, FALSE)) {
cat(if(inclall) '\nAll records\n\n' else '\nRecords until death\n\n')
<- if(inclall) d else subset(d, day <= ddeath)
u <- lrm(status ~ bstatus + pol(day, 2) + age + lips + charlson + sofa +
f data=u)
treatment, <- matrix(NA, nrow=4, ncol=9)
k <- coef(f)[-(1:3)]
poc colnames(k) <- names(poc)
rownames(k) <- c('PO', paste0('>=', 2:4))
1, ] <- poc
k[for(cut in c(2 : 4)) {
<- lrm(yn >= cut ~ bstatus + pol(day, 2) + age + lips + charlson +
g + treatment, data=u)
sofa <- coef(g)[-1]
k[cut, ]
}print(round(k, 3))
}
All records
bstatus=On Vent bstatus=In Hospital day day^2 age lips charlson
PO 0.464 1.465 0.171 -0.004 -0.009 -0.051 -0.113
>=2 0.582 1.053 -0.183 0.004 -0.004 -0.106 -0.207
>=3 0.410 2.334 0.096 -0.003 -0.001 -0.047 -0.160
>=4 0.364 1.170 0.322 -0.008 -0.014 -0.042 -0.105
sofa treatment=Vitamin D
PO -0.089 0.154
>=2 -0.123 0.022
>=3 -0.076 0.046
>=4 -0.098 0.205
Records until death
bstatus=On Vent bstatus=In Hospital day day^2 age lips charlson
PO 0.326 1.600 0.284 -0.006 -0.015 -0.021 -0.045
>=2 0.490 0.772 0.080 -0.001 -0.006 -0.098 -0.177
>=3 0.271 3.868 0.237 -0.005 -0.003 -0.003 -0.044
>=4 0.193 1.001 0.345 -0.008 -0.018 -0.021 -0.049
sofa treatment=Vitamin D
PO -0.075 0.217
>=2 -0.100 -0.035
>=3 -0.034 0.102
>=4 -0.085 0.255
Constancy of the \(\beta\)s vertically in the above tables, and hence equality with the PO \(\beta\)s in the first row, means the PO assumption is satisfied. Not that PO is not strictly satisfied for any dataset we encounter, and even when PO is not satisfied, the PO model can still work better than alternative approaches.
When all records are analyzed (even those with duplicated death status post death), the coefficients are slightly more constant, except for the coefficient of day
. When death-carried-forward is not used, the coefficients are slightly less constant except for the effect of day
. To help explore that, here is an analysis of when deaths are occuring (patients not dying are removed):
<- subset(d, day == 1 & ddeath < 29)
w hist(w$ddeath, breaks=0:28, main='', xlab='Day of Death')
GEE-Like Analysis Repeated Without Carrying Death Forward
The last analysis with cluster correlation correction is repeated after deletion of patient records beyond the day of death. In other words, the death status is not repeated up to day 28 but the patient data are truncated at death.
<- dnc <- subset(d, day <= ddeath)
d <- lrm(status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) +
f rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 3) + treatment,
data=d, x=TRUE, y=TRUE)
<- robcov(f, d$id)
g g
Logistic Regression Model
lrm(formula = status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 3) + treatment, data = d, x = TRUE, y = TRUE)
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 180 3043 10467 Home 18353
Frequencies of Missing Values Due to Each Variable
Model Likelihood Ratio Test |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 32043 | LR χ2 11418.02 | R2 0.354 | C 0.787 |
Cluster on d$id
|
d.f. 19 | g 1.506 | Dxy 0.574 |
Clusters 1261 | Pr(>χ2) <0.0001 | gr 4.506 | γ 0.575 |
max |∂log L/∂β| 1×10-7 | gp 0.119 | τa 0.319 | |
Brier 0.069 |
β | S.E. | Wald Z | Pr(>|Z|) | |
---|---|---|---|---|
y≥Vent/ARDS | 2.7293 | 0.4924 | 5.54 | <0.0001 |
y≥In Hospital/Facility | -0.6462 | 0.5038 | -1.28 | 0.1996 |
y≥Home | -3.2094 | 0.4970 | -6.46 | <0.0001 |
bstatus=On Vent | 0.3845 | 0.2626 | 1.46 | 0.1430 |
bstatus=In Hospital | 1.6675 | 0.2571 | 6.49 | <0.0001 |
day | 0.3020 | 0.0127 | 23.72 | <0.0001 |
day’ | -0.5754 | 0.0647 | -8.89 | <0.0001 |
day’’ | 0.9879 | 0.1814 | 5.45 | <0.0001 |
day’’’ | -0.2407 | 0.2103 | -1.14 | 0.2523 |
age | 0.0018 | 0.0094 | 0.19 | 0.8477 |
age’ | -0.0187 | 0.0083 | -2.26 | 0.0240 |
lips | -0.0468 | 0.0519 | -0.90 | 0.3676 |
lips’ | 0.0213 | 0.0469 | 0.45 | 0.6492 |
charlson | -0.1732 | 0.1384 | -1.25 | 0.2106 |
charlson’ | 0.6995 | 0.7050 | 0.99 | 0.3211 |
charlson’’ | -1.6179 | 1.6182 | -1.00 | 0.3174 |
sofa | 0.0392 | 0.1066 | 0.37 | 0.7129 |
sofa’ | -0.3083 | 0.3890 | -0.79 | 0.4280 |
sofa’’ | 0.6236 | 0.9470 | 0.66 | 0.5102 |
vitD | 0.0413 | 0.0193 | 2.14 | 0.0322 |
vitD’ | -0.0494 | 0.0229 | -2.16 | 0.0309 |
treatment=Vitamin D | 0.1920 | 0.1063 | 1.81 | 0.0708 |
anova(g)
Wald Statistics for status
|
|||
χ2 | d.f. | P | |
---|---|---|---|
bstatus | 108.29 | 2 | <0.0001 |
day | 1401.68 | 4 | <0.0001 |
Nonlinear | 391.54 | 3 | <0.0001 |
age | 17.32 | 2 | 0.0002 |
Nonlinear | 5.09 | 1 | 0.0240 |
lips | 1.86 | 2 | 0.3949 |
Nonlinear | 0.21 | 1 | 0.6492 |
charlson | 6.47 | 3 | 0.0908 |
Nonlinear | 1.00 | 2 | 0.6063 |
sofa | 19.39 | 3 | 0.0002 |
Nonlinear | 1.91 | 2 | 0.3856 |
vitD | 4.79 | 2 | 0.0913 |
Nonlinear | 4.66 | 1 | 0.0309 |
treatment | 3.26 | 1 | 0.0708 |
TOTAL NONLINEAR | 403.82 | 10 | <0.0001 |
TOTAL | 1621.75 | 19 | <0.0001 |
ggplot(Predict(g))
Let’s also look at the proportional odds assumption using these truncated records (death not carried forward).
<- summary(yn ~ day, fun=sf, continuous=99, data=d)
s plot(s, which=1:3, pch=0:4, xlab='logit Cumulative Prob', main='')
When the absorbing state is not carried forward, the problem with non-proportional odds almost vanishes.
Finally, look at interactions with time in this model.
<- lrm(status ~ rcs(day, 3) * (bstatus + rcs(age, 3) + rcs(lips, 3) +
f rcs(charlson, 3) + rcs(sofa, 3) + treatment),
data=d, tol=1e-13, x=TRUE, y=TRUE)
<- robcov(f, d$id)
g anova(g)
Wald Statistics for status
|
|||
χ2 | d.f. | P | |
---|---|---|---|
day (Factor+Higher Order Factors) | 1195.35 | 24 | <0.0001 |
All Interactions | 87.91 | 22 | <0.0001 |
Nonlinear (Factor+Higher Order Factors) | 400.49 | 12 | <0.0001 |
bstatus (Factor+Higher Order Factors) | 365.19 | 6 | <0.0001 |
All Interactions | 19.95 | 4 | 0.0005 |
age (Factor+Higher Order Factors) | 23.69 | 6 | 0.0006 |
All Interactions | 20.58 | 4 | 0.0004 |
Nonlinear (Factor+Higher Order Factors) | 5.92 | 3 | 0.1154 |
lips (Factor+Higher Order Factors) | 26.36 | 6 | 0.0002 |
All Interactions | 10.98 | 4 | 0.0268 |
Nonlinear (Factor+Higher Order Factors) | 2.62 | 3 | 0.4542 |
charlson (Factor+Higher Order Factors) | 15.46 | 6 | 0.0170 |
All Interactions | 6.11 | 4 | 0.1914 |
Nonlinear (Factor+Higher Order Factors) | 2.17 | 3 | 0.5381 |
sofa (Factor+Higher Order Factors) | 46.36 | 6 | <0.0001 |
All Interactions | 1.64 | 4 | 0.8011 |
Nonlinear (Factor+Higher Order Factors) | 4.72 | 3 | 0.1931 |
treatment (Factor+Higher Order Factors) | 4.66 | 3 | 0.1984 |
All Interactions | 1.91 | 2 | 0.3852 |
day × bstatus (Factor+Higher Order Factors) | 19.95 | 4 | 0.0005 |
Nonlinear | 12.96 | 2 | 0.0015 |
Nonlinear Interaction : f(A,B) vs. AB | 12.96 | 2 | 0.0015 |
day × age (Factor+Higher Order Factors) | 20.58 | 4 | 0.0004 |
Nonlinear | 8.33 | 3 | 0.0397 |
Nonlinear Interaction : f(A,B) vs. AB | 8.33 | 3 | 0.0397 |
f(A,B) vs. Af(B) + Bg(A) | 4.22 | 1 | 0.0399 |
Nonlinear Interaction in day vs. Af(B) | 6.98 | 2 | 0.0305 |
Nonlinear Interaction in age vs. Bg(A) | 5.87 | 2 | 0.0532 |
day × lips (Factor+Higher Order Factors) | 10.98 | 4 | 0.0268 |
Nonlinear | 2.81 | 3 | 0.4222 |
Nonlinear Interaction : f(A,B) vs. AB | 2.81 | 3 | 0.4222 |
f(A,B) vs. Af(B) + Bg(A) | 0.01 | 1 | 0.9409 |
Nonlinear Interaction in day vs. Af(B) | 0.25 | 2 | 0.8805 |
Nonlinear Interaction in lips vs. Bg(A) | 2.41 | 2 | 0.2997 |
day × charlson (Factor+Higher Order Factors) | 6.11 | 4 | 0.1914 |
Nonlinear | 4.78 | 3 | 0.1888 |
Nonlinear Interaction : f(A,B) vs. AB | 4.78 | 3 | 0.1888 |
f(A,B) vs. Af(B) + Bg(A) | 0.06 | 1 | 0.8138 |
Nonlinear Interaction in day vs. Af(B) | 3.92 | 2 | 0.1405 |
Nonlinear Interaction in charlson vs. Bg(A) | 1.31 | 2 | 0.5196 |
day × sofa (Factor+Higher Order Factors) | 1.64 | 4 | 0.8011 |
Nonlinear | 1.46 | 3 | 0.6922 |
Nonlinear Interaction : f(A,B) vs. AB | 1.46 | 3 | 0.6922 |
f(A,B) vs. Af(B) + Bg(A) | 0.01 | 1 | 0.9297 |
Nonlinear Interaction in day vs. Af(B) | 1.42 | 2 | 0.4911 |
Nonlinear Interaction in sofa vs. Bg(A) | 0.04 | 2 | 0.9792 |
day × treatment (Factor+Higher Order Factors) | 1.91 | 2 | 0.3852 |
Nonlinear | 1.86 | 1 | 0.1728 |
Nonlinear Interaction : f(A,B) vs. AB | 1.86 | 1 | 0.1728 |
TOTAL NONLINEAR | 420.41 | 20 | <0.0001 |
TOTAL INTERACTION | 87.91 | 22 | <0.0001 |
TOTAL NONLINEAR + INTERACTION | 455.33 | 27 | <0.0001 |
TOTAL | 1637.55 | 35 | <0.0001 |
<- c(1, 8, 15, 22)
days ggplot(Predict(g, bstatus, day=days))
ggplot(Predict(g, age, day=days))
ggplot(Predict(g, lips, day=days))
ggplot(Predict(g, charlson, day=days))
ggplot(Predict(g, sofa, day=days))
ggplot(Predict(g, treatment, day=days))
ggplot(Predict(g, day, treatment))
Assessment of Interaction Between Baseline Vitamin D and Treatment
We repeat the model above but allowing baseline vitamin D to interact with treatment.
<- lrm(status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) +
f rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 4) * treatment,
data=d, x=TRUE, y=TRUE)
<- robcov(f, d$id)
g g
Logistic Regression Model
lrm(formula = status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + rcs(vitD, 4) * treatment, data = d, x = TRUE, y = TRUE)
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 180 3043 10467 Home 18353
Frequencies of Missing Values Due to Each Variable
Model Likelihood Ratio Test |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 32043 | LR χ2 11633.44 | R2 0.360 | C 0.790 |
Cluster on d$id
|
d.f. 23 | g 1.524 | Dxy 0.580 |
Clusters 1261 | Pr(>χ2) <0.0001 | gr 4.592 | γ 0.580 |
max |∂log L/∂β| 1×10-7 | gp 0.120 | τa 0.322 | |
Brier 0.069 |
β | S.E. | Wald Z | Pr(>|Z|) | |
---|---|---|---|---|
y≥Vent/ARDS | 2.3634 | 0.5685 | 4.16 | <0.0001 |
y≥In Hospital/Facility | -1.0222 | 0.5765 | -1.77 | 0.0762 |
y≥Home | -3.6015 | 0.5723 | -6.29 | <0.0001 |
bstatus=On Vent | 0.4329 | 0.2622 | 1.65 | 0.0987 |
bstatus=In Hospital | 1.7149 | 0.2571 | 6.67 | <0.0001 |
day | 0.3041 | 0.0128 | 23.70 | <0.0001 |
day’ | -0.5816 | 0.0650 | -8.94 | <0.0001 |
day’’ | 1.0016 | 0.1823 | 5.49 | <0.0001 |
day’’’ | -0.2506 | 0.2113 | -1.19 | 0.2355 |
age | 0.0029 | 0.0094 | 0.31 | 0.7600 |
age’ | -0.0193 | 0.0083 | -2.33 | 0.0197 |
lips | -0.0507 | 0.0521 | -0.97 | 0.3312 |
lips’ | 0.0260 | 0.0472 | 0.55 | 0.5823 |
charlson | -0.1829 | 0.1393 | -1.31 | 0.1893 |
charlson’ | 0.7225 | 0.7116 | 1.02 | 0.3099 |
charlson’’ | -1.6651 | 1.6355 | -1.02 | 0.3086 |
sofa | 0.0355 | 0.1074 | 0.33 | 0.7410 |
sofa’ | -0.2841 | 0.3896 | -0.73 | 0.4659 |
sofa’’ | 0.5528 | 0.9472 | 0.58 | 0.5595 |
vitD | 0.0686 | 0.0505 | 1.36 | 0.1742 |
vitD’ | -0.0298 | 0.2232 | -0.13 | 0.8937 |
vitD’’ | -0.1455 | 0.5537 | -0.26 | 0.7928 |
treatment=Vitamin D | 0.7066 | 0.5397 | 1.31 | 0.1905 |
vitD × treatment=Vitamin D | -0.0303 | 0.0757 | -0.40 | 0.6894 |
vitD’ × treatment=Vitamin D | -0.2233 | 0.3288 | -0.68 | 0.4971 |
vitD’’ × treatment=Vitamin D | 0.8314 | 0.8141 | 1.02 | 0.3071 |
anova(g)
Wald Statistics for status
|
|||
χ2 | d.f. | P | |
---|---|---|---|
bstatus | 110.06 | 2 | <0.0001 |
day | 1405.28 | 4 | <0.0001 |
Nonlinear | 388.96 | 3 | <0.0001 |
age | 17.09 | 2 | 0.0002 |
Nonlinear | 5.44 | 1 | 0.0197 |
lips | 1.84 | 2 | 0.3994 |
Nonlinear | 0.30 | 1 | 0.5823 |
charlson | 7.13 | 3 | 0.0678 |
Nonlinear | 1.04 | 2 | 0.5954 |
sofa | 20.01 | 3 | 0.0002 |
Nonlinear | 2.05 | 2 | 0.3594 |
vitD (Factor+Higher Order Factors) | 19.97 | 6 | 0.0028 |
All Interactions | 13.72 | 3 | 0.0033 |
Nonlinear (Factor+Higher Order Factors) | 19.95 | 4 | 0.0005 |
treatment (Factor+Higher Order Factors) | 17.48 | 4 | 0.0016 |
All Interactions | 13.72 | 3 | 0.0033 |
vitD × treatment (Factor+Higher Order Factors) | 13.72 | 3 | 0.0033 |
Nonlinear | 13.68 | 2 | 0.0011 |
Nonlinear Interaction : f(A,B) vs. AB | 13.68 | 2 | 0.0011 |
TOTAL NONLINEAR | 407.50 | 13 | <0.0001 |
TOTAL NONLINEAR + INTERACTION | 410.86 | 14 | <0.0001 |
TOTAL | 1634.16 | 23 | <0.0001 |
There is evidence for interaction. Plot the interaction effect.
ggplot(Predict(g, vitD, treatment))
To get the right confidence bands, form contrasts.
<- contrast(g, list(treatment='Vitamin D', vitD=0:65),
k list(treatment='Placebo', vitD=0:65))
<- as.data.frame(k[c('vitD', 'Contrast', 'Lower', 'Upper')])
k ggplot(k, aes(x=vitD, y=Contrast)) + geom_point() +
geom_errorbar(aes(ymin=Lower, ymax=Upper), width=0)
There seems to be benefit of treatment in the very low range of baseline vitamin D and for those patients with high baseline vitamin D, but not in the middle.
Bayesian Random Effects Model Without Carrying Death Forward
Repeat the last model but using a Bayesian approach with normally distributed random effects for patient level intercepts. This is a compound symmetry type of correlation structure within patient. Before proceeding witih the full model try to find the smallest model that exhibits convergence problems that will be seen later with the full model.
stanSet()
# This is fine:
<- blrm(status ~ age, data=d, loo=FALSE, refresh=100,
bnocsnoc file='bnocsnoc.rds')
stanDx(bnocsnoc)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
n_eff Rhat
y>=Vent/ARDS 4146 1.000
y>=In Hospital/Facility 4713 1.000
y>=Home 4700 0.999
age 4419 0.999
stanDxplot(bnocsnoc, which='ALL')
# Problem with this one if rsdmean=0.3 or 0.08, minor prob if 0.005
<- blrm(status ~ age + cluster(id), psigma=1, rsdsd=0.075,
bnocs data=d, loo=FALSE, refresh=100, file='bnocs.rds')
# Previous: psigma=2 rsdmean=0.005; psigma=1 rsdmean=0.08
stanDx(bnocs)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
n_eff Rhat
y>=Vent/ARDS 421 1.003
y>=In Hospital/Facility 194 1.011
y>=Home 173 1.011
age 199 1.009
sigmag 385 1.008
stanDxplot(bnocs, which='ALL')
# Problematic even if binary for Home, but got slightly better with
# t(4, 0, 1) prior for sigma SD
# adding conc=0.5 rsdmean made things significanty worse
# b <- blrm(status ~ age + cluster(id), data=d, iter=1000, refresh=100)
# stanDx(b); stanDxplot(b, which='ALL')
bnocs
Bayesian Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.455 for Intercepts
blrm(formula = status ~ age + cluster(id), data = d, psigma = 1, rsdsd = 0.075, refresh = 100, loo = FALSE, file = "bnocs.rds")
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 196 3343 11128 Home 19586Frequencies of Missing Values Due to Each Variable
status age cluster(id) 50 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 34253 | B 0.006 [0.006, 0.006] | g 0.742 [0.574, 0.952] | C 0.582 [0.582, 0.582] |
Draws 4000 | gp 0 [0, 0] | Dxy 0.163 [0.163, 0.163] | |
Chains 4 | EV 0 [0, 0] | ||
p 1 | v 0.435 [0.252, 0.7] | ||
Cluster on id
|
vp 0 [0, 0] | ||
Clusters 1351 | |||
σγ 3.2259 [3.0911, 3.3747] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
y≥Vent/ARDS | 11.4263 | 11.4129 | 0.3425 | 10.7745 | 12.1094 | 1.0000 | 1.07 |
y≥In Hospital/Facility | 6.3262 | 6.3142 | 0.3260 | 5.6944 | 6.9784 | 1.0000 | 1.04 |
y≥Home | 1.9749 | 1.9653 | 0.3233 | 1.3710 | 2.6387 | 1.0000 | 1.03 |
age | -0.0410 | -0.0408 | 0.0057 | -0.0533 | -0.0313 | 0.0000 | 0.95 |
<- blrm(status ~ bstatus + rcs(day, 5) + rcs(age, 3) +
bnocarry rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + treatment +
cluster(id), data=d, loo=FALSE, refresh=50, file='bnocarry.rds')
bnocarry
Bayesian Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.455 for Intercepts
blrm(formula = status ~ bstatus + rcs(day, 5) + rcs(age, 3) + rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + treatment + cluster(id), data = d, refresh = 50, loo = FALSE, file = "bnocarry.rds")
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 189 3169 10862 Home 18988Frequencies of Missing Values Due to Each Variable
status bstatus day age lips charlson 50 28 0 0 0 1045 sofa treatment cluster(id) 0 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 33208 | B 0.006 [0.006, 0.006] | g 4.077 [3.892, 4.272] | C 0.773 [0.77, 0.776] |
Draws 4000 | gp 0 [0, 0.001] | Dxy 0.546 [0.54, 0.553] | |
Chains 4 | EV 0.098 [0.015, 0.183] | ||
p 17 | v 13.183 [11.97, 14.401] | ||
Cluster on id
|
vp 0 [0, 0] | ||
Clusters 1309 | |||
σγ 4.1683 [3.9875, 4.3536] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
y≥Vent/ARDS | 8.2537 | 8.2419 | 0.9665 | 6.5110 | 10.2707 | 1.0000 | 1.01 |
y≥In Hospital/Facility | -0.2323 | -0.2381 | 0.9608 | -2.0608 | 1.6383 | 0.4040 | 1.02 |
y≥Home | -7.7042 | -7.7114 | 0.9637 | -9.6311 | -5.9360 | 0.0000 | 1.00 |
bstatus=On Vent | 0.9488 | 0.9359 | 0.5272 | -0.0824 | 1.9521 | 0.9635 | 1.00 |
bstatus=In Hospital | 4.7748 | 4.7693 | 0.5249 | 3.7904 | 5.8431 | 1.0000 | 1.07 |
day | 0.6975 | 0.6970 | 0.0169 | 0.6649 | 0.7316 | 1.0000 | 0.98 |
day’ | -1.1754 | -1.1746 | 0.1039 | -1.3757 | -0.9721 | 0.0000 | 1.01 |
day’’ | 1.8060 | 1.8027 | 0.3214 | 1.1876 | 2.4208 | 1.0000 | 1.00 |
day’’’ | -0.0027 | 0.0056 | 0.4316 | -0.8599 | 0.8159 | 0.5042 | 1.00 |
age | -0.0003 | 0.0000 | 0.0186 | -0.0371 | 0.0366 | 0.4990 | 1.03 |
age’ | -0.0287 | -0.0291 | 0.0166 | -0.0595 | 0.0053 | 0.0423 | 0.98 |
lips | -0.0262 | -0.0269 | 0.1091 | -0.2349 | 0.1944 | 0.3985 | 1.07 |
lips’ | -0.0787 | -0.0782 | 0.1009 | -0.2673 | 0.1185 | 0.2272 | 0.97 |
charlson | -0.4306 | -0.4217 | 0.2920 | -1.0024 | 0.1340 | 0.0685 | 0.99 |
charlson’ | 1.5518 | 1.5217 | 1.4831 | -1.5054 | 4.4033 | 0.8562 | 1.06 |
charlson’’ | -3.6542 | -3.5987 | 3.4116 | -9.9032 | 3.5933 | 0.1395 | 0.94 |
sofa | 0.0069 | 0.0120 | 0.1872 | -0.3533 | 0.3709 | 0.5212 | 0.89 |
sofa’ | -0.3965 | -0.4190 | 0.7343 | -1.8090 | 1.0533 | 0.2828 | 1.11 |
sofa’’ | 0.3921 | 0.4177 | 1.8166 | -3.0990 | 4.0762 | 0.5980 | 0.90 |
treatment=Vitamin D | 0.4395 | 0.4486 | 0.2436 | -0.0398 | 0.9269 | 0.9575 | 0.98 |
<- anova(bnocarry)
a a
Relative Explained Variation for status . Approximate total model Wald χ2 used in denominators of REV:6411.9 [6129.9, 6754.7].
|
||||
REV | Lower | Upper | d.f. | |
---|---|---|---|---|
bstatus | 0.029 | 0.023 | 0.039 | 2 |
day | 0.977 | 0.968 | 0.983 | 4 |
Nonlinear | 0.246 | 0.226 | 0.265 | 3 |
age | 0.002 | 0.000 | 0.004 | 2 |
Nonlinear | 0.000 | 0.000 | 0.002 | 1 |
lips | 0.001 | 0.000 | 0.003 | 2 |
Nonlinear | 0.000 | 0.000 | 0.001 | 1 |
charlson | 0.002 | 0.000 | 0.005 | 3 |
Nonlinear | 0.000 | 0.000 | 0.001 | 2 |
sofa | 0.007 | 0.004 | 0.011 | 3 |
Nonlinear | 0.001 | 0.000 | 0.003 | 2 |
treatment | 0.001 | 0.000 | 0.002 | 1 |
TOTAL NONLINEAR | 0.248 | 0.229 | 0.269 | 9 |
TOTAL | 1.000 | 1.000 | 1.000 | 17 |
stanDx(bnocarry)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
n_eff Rhat
y>=Vent/ARDS 336 1.008
y>=In Hospital/Facility 138 1.026
y>=Home 101 1.038
bstatus=On Vent 110 1.042
bstatus=In Hospital 159 1.010
day 1357 1.000
day' 2714 1.001
day'' 3691 1.001
day''' 4264 1.000
age 141 1.032
age' 163 1.022
lips 121 1.038
lips' 52 1.076
charlson 126 1.027
charlson' 141 1.015
charlson'' 122 1.028
sofa 143 1.024
sofa' 140 1.024
sofa'' 159 1.017
treatment=Vitamin D 130 1.039
sigmag 264 1.005
stanDxplot(bnocarry)
ggplot(Predict(bnocarry))
plot(a)
Compute the posterior probability that treatment raises the chance of a patient having a given outcome or better (this is already in the coefficient table above).
<- PostF(bnocarry, name='orig')
P P(`treatment=Vitamin D` > 0)
[1] 0.9575
For comparison, fit a similar Bayesian model adjusting only for time, and one not adjusting for any covariates.
<- blrm(status ~ rcs(day, 5) + treatment +
bnocarry2 cluster(id), data=d, loo=FALSE, refresh=50, file='bnocarry2.rds')
bnocarry2
Bayesian Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.455 for Intercepts
blrm(formula = status ~ rcs(day, 5) + treatment + cluster(id), data = d, refresh = 50, loo = FALSE, file = "bnocarry2.rds")
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 196 3343 11128 Home 19586Frequencies of Missing Values Due to Each Variable
status day treatment cluster(id) 50 0 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 34253 | B 0.006 [0.006, 0.006] | g 2.627 [2.547, 2.691] | C 0.711 [0.71, 0.711] |
Draws 4000 | gp 0 [0, 0] | Dxy 0.422 [0.42, 0.423] | |
Chains 4 | EV 0 [0, 0] | ||
p 5 | v 5.784 [5.503, 6.074] | ||
Cluster on id
|
vp 0 [0, 0] | ||
Clusters 1351 | |||
σγ 5.0107 [4.7918, 5.2634] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
y≥Vent/ARDS | 9.2229 | 9.2242 | 0.2474 | 8.7477 | 9.7103 | 1.0000 | 0.99 |
y≥In Hospital/Facility | 0.7985 | 0.7963 | 0.2049 | 0.4063 | 1.1985 | 1.0000 | 0.94 |
y≥Home | -6.5560 | -6.5525 | 0.2136 | -6.9970 | -6.1586 | 0.0000 | 0.93 |
day | 0.6986 | 0.6988 | 0.0163 | 0.6660 | 0.7290 | 1.0000 | 1.02 |
day’ | -1.2033 | -1.2027 | 0.1006 | -1.4030 | -1.0111 | 0.0000 | 0.96 |
day’’ | 1.8840 | 1.8821 | 0.3113 | 1.2824 | 2.4896 | 1.0000 | 1.02 |
day’’’ | -0.0628 | -0.0655 | 0.4169 | -0.8835 | 0.7454 | 0.4395 | 1.02 |
treatment=Vitamin D | 0.2469 | 0.2403 | 0.2630 | -0.3019 | 0.7672 | 0.8315 | 1.11 |
<- blrm(status ~ treatment + cluster(id),
bnocarry3 data=d, loo=FALSE, refresh=50, file='bnocarry3.rds')
bnocarry3
Bayesian Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.455 for Intercepts
blrm(formula = status ~ treatment + cluster(id), data = d, refresh = 50, loo = FALSE, file = "bnocarry3.rds")
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 196 3343 11128 Home 19586Frequencies of Missing Values Due to Each Variable
status treatment cluster(id) 50 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 34253 | B 0.006 [0.006, 0.006] | g 0.088 [0, 0.226] | C 0.511 [0.473, 0.527] |
Draws 4000 | gp 0 [0, 0] | Dxy 0.023 [-0.053, 0.053] | |
Chains 4 | EV 0 [0, 0] | ||
p 1 | v 0.012 [0, 0.051] | ||
Cluster on id
|
vp 0 [0, 0] | ||
Clusters 1351 | |||
σγ 3.2963 [3.1478, 3.4361] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
y≥Vent/ARDS | 9.0687 | 9.0673 | 0.1694 | 8.7362 | 9.3921 | 1.0000 | 1.00 |
y≥In Hospital/Facility | 3.9605 | 3.9591 | 0.1361 | 3.7138 | 4.2406 | 1.0000 | 1.06 |
y≥Home | -0.3899 | -0.3909 | 0.1324 | -0.6394 | -0.1336 | 0.0008 | 0.99 |
treatment=Vitamin D | 0.1152 | 0.1155 | 0.1948 | -0.2774 | 0.4813 | 0.7225 | 0.94 |
Adjusting only for time and not other baseline covariates, the between-patient outcome heterogeneity is greater as makes sense, and the treatment effect remains diminished. When not even adjusting for time, the value of \(\sigma_\gamma\) ironically becomes lower than from the model with full covariate adjustment, and the treatment effect becomes very small.
Bayesian PO Random Effects Model With Time by Treatment Interaction
We go back to adjusting for the full set of covariates, and allow time to nonlinearly interact with treatment.
<- blrm(status ~ bstatus + treatment * rcs(day, 4) + rcs(age, 3) +
btxtime rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) +
cluster(id), data=d, loo=FALSE, refresh=50, file='btxtime.rds')
stanDx(btxtime)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
n_eff Rhat
y>=Vent/ARDS 309 1.013
y>=In Hospital/Facility 149 1.022
y>=Home 120 1.023
bstatus=On Vent 142 1.027
bstatus=In Hospital 143 1.020
treatment=Vitamin D 154 1.011
day 1979 1.005
day' 3281 1.000
day'' 3943 1.001
age 131 1.026
age' 119 1.031
lips 67 1.064
lips' 149 1.012
charlson 144 1.038
charlson' 155 1.033
charlson'' 79 1.066
sofa 168 1.017
sofa' 164 1.011
sofa'' 104 1.053
treatment=Vitamin D * day 3919 1.000
treatment=Vitamin D * day' 3880 1.000
treatment=Vitamin D * day'' 4018 1.000
sigmag 298 1.010
stanDxplot(btxtime)
btxtime
Bayesian Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.455 for Intercepts
blrm(formula = status ~ bstatus + treatment * rcs(day, 4) + rcs(age, 3) + rcs(lips, 3) + rcs(charlson, 4) + rcs(sofa, 4) + cluster(id), data = d, refresh = 50, loo = FALSE, file = "btxtime.rds")
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 189 3169 10862 Home 18988Frequencies of Missing Values Due to Each Variable
status bstatus treatment day age lips 50 28 0 0 0 0 charlson sofa cluster(id) 1045 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 33208 | B 0.006 [0.006, 0.006] | g 4.048 [3.838, 4.234] | C 0.777 [0.773, 0.78] |
Draws 4000 | gp 0 [0, 0.001] | Dxy 0.553 [0.546, 0.559] | |
Chains 4 | EV 0.093 [0.017, 0.182] | ||
p 19 | v 12.993 [11.63, 14.157] | ||
Cluster on id
|
vp 0 [0, 0] | ||
Clusters 1309 | |||
σγ 4.1813 [4.005, 4.3861] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
y≥Vent/ARDS | 8.3971 | 8.3957 | 1.0256 | 6.3030 | 10.3725 | 1.0000 | 1.08 |
y≥In Hospital/Facility | -0.0842 | -0.0892 | 1.0181 | -2.2243 | 1.8647 | 0.4602 | 1.06 |
y≥Home | -7.5591 | -7.5620 | 1.0209 | -9.6862 | -5.6048 | 0.0000 | 1.06 |
bstatus=On Vent | 0.9108 | 0.9083 | 0.5018 | -0.0981 | 1.9025 | 0.9618 | 0.99 |
bstatus=In Hospital | 4.6828 | 4.6871 | 0.5175 | 3.6220 | 5.6637 | 1.0000 | 0.97 |
treatment=Vitamin D | 0.3346 | 0.3394 | 0.2428 | -0.1250 | 0.8277 | 0.9130 | 0.95 |
day | 0.6643 | 0.6642 | 0.0173 | 0.6308 | 0.6991 | 1.0000 | 1.00 |
day’ | -0.8991 | -0.8994 | 0.0566 | -1.0131 | -0.7927 | 0.0000 | 1.02 |
day’’ | 1.8535 | 1.8551 | 0.1629 | 1.5388 | 2.1808 | 1.0000 | 0.99 |
age | -0.0009 | -0.0008 | 0.0205 | -0.0401 | 0.0398 | 0.4825 | 1.07 |
age’ | -0.0293 | -0.0294 | 0.0183 | -0.0636 | 0.0076 | 0.0530 | 0.97 |
lips | -0.0291 | -0.0297 | 0.1142 | -0.2522 | 0.2005 | 0.3952 | 0.99 |
lips’ | -0.0799 | -0.0807 | 0.1080 | -0.2896 | 0.1390 | 0.2225 | 0.99 |
charlson | -0.4172 | -0.4136 | 0.3004 | -1.0080 | 0.1831 | 0.0795 | 0.95 |
charlson’ | 1.5760 | 1.5553 | 1.5598 | -1.3238 | 5.0165 | 0.8522 | 1.04 |
charlson’’ | -3.7606 | -3.6762 | 3.5813 | -11.9324 | 2.6047 | 0.1363 | 0.97 |
sofa | 0.0327 | 0.0251 | 0.1928 | -0.3203 | 0.4309 | 0.5572 | 1.06 |
sofa’ | -0.4858 | -0.4620 | 0.7517 | -1.9843 | 0.8965 | 0.2780 | 0.96 |
sofa’’ | 0.6131 | 0.5621 | 1.8695 | -2.8167 | 4.3160 | 0.6150 | 1.05 |
treatment=Vitamin D × day | 0.0194 | 0.0194 | 0.0227 | -0.0254 | 0.0643 | 0.8112 | 0.98 |
treatment=Vitamin D × day’ | -0.0317 | -0.0311 | 0.0790 | -0.1862 | 0.1207 | 0.3440 | 0.99 |
treatment=Vitamin D × day’’ | 0.0100 | 0.0070 | 0.2293 | -0.4326 | 0.4518 | 0.5112 | 1.03 |
Compute treatment contrasts (log odds ratios) as a function of time.
<- contrast(btxtime, list(treatment='Vitamin D', day=1:28),
k list(treatment='Placebo', day=1:28))
k
bstatus day age lips charlson sofa Contrast S.E. Lower
1 In Hospital 1 58 5 3 5 0.3540591 0.2338975 -0.129725199
2 In Hospital 2 58 5 3 5 0.3734883 0.2269386 -0.077890340
3 In Hospital 3 58 5 3 5 0.3928668 0.2221128 -0.066133606
4* In Hospital 4 58 5 3 5 0.4119408 0.2195306 -0.030094376
5* In Hospital 5 58 5 3 5 0.4304057 0.2190076 -0.006784936
6* In Hospital 6 58 5 3 5 0.4479572 0.2200671 -0.002051644
7* In Hospital 7 58 5 3 5 0.4642907 0.2220373 0.006342900
8* In Hospital 8 58 5 3 5 0.4791018 0.2241803 0.021734510
9* In Hospital 9 58 5 3 5 0.4920858 0.2258263 0.022204824
10* In Hospital 10 58 5 3 5 0.5029385 0.2265063 0.069770881
11 In Hospital 11 58 5 3 5 0.5113711 0.2261697 0.081901485
12* In Hospital 12 58 5 3 5 0.5171591 0.2254095 0.079327087
13* In Hospital 13 58 5 3 5 0.5200937 0.2249417 0.092832635
14* In Hospital 14 58 5 3 5 0.5199662 0.2252829 0.092554319
15* In Hospital 15 58 5 3 5 0.5165680 0.2265961 0.075674492
16* In Hospital 16 58 5 3 5 0.5096903 0.2286418 0.079243684
17* In Hospital 17 58 5 3 5 0.4991245 0.2308418 0.057896195
18* In Hospital 18 58 5 3 5 0.4846618 0.2324388 0.034690217
19* In Hospital 19 58 5 3 5 0.4662043 0.2328201 -0.022454378
20* In Hospital 20 58 5 3 5 0.4440974 0.2320473 -0.048543807
21* In Hospital 21 58 5 3 5 0.4187972 0.2306030 -0.065062469
22* In Hospital 22 58 5 3 5 0.3907600 0.2291814 -0.083756952
23* In Hospital 23 58 5 3 5 0.3604420 0.2285721 -0.093189231
24* In Hospital 24 58 5 3 5 0.3282992 0.2295390 -0.131587359
25* In Hospital 25 58 5 3 5 0.2947879 0.2326931 -0.151226537
26* In Hospital 26 58 5 3 5 0.2603643 0.2383827 -0.212094615
27* In Hospital 27 58 5 3 5 0.2254845 0.2466329 -0.251975767
28* In Hospital 28 58 5 3 5 0.1905286 0.2571906 -0.295994535
Upper Pr(Contrast>0)
1 0.7925092 0.9318
2 0.8223911 0.9455
3 0.8172218 0.9552
4* 0.8372061 0.9630
5* 0.8594823 0.9698
6* 0.8690354 0.9750
7* 0.8790902 0.9782
8* 0.9041111 0.9815
9* 0.9084629 0.9842
10* 0.9594916 0.9860
11 0.9675514 0.9880
12* 0.9614458 0.9898
13* 0.9749873 0.9895
14* 0.9733888 0.9888
15* 0.9639879 0.9870
16* 0.9742668 0.9835
17* 0.9649165 0.9802
18* 0.9499455 0.9778
19* 0.8948266 0.9712
20* 0.8629274 0.9658
21* 0.8381119 0.9600
22* 0.8120379 0.9490
23* 0.7995944 0.9382
24* 0.7616194 0.9175
25* 0.7460600 0.8935
26* 0.7064714 0.8582
27* 0.7005702 0.8185
28* 0.6997217 0.7748
Redundant contrasts are denoted by *
Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean
There is a high probability of a treatment effect from day 1 - day 24.
Bayesian Partial Proportional Odds Random Effects Model
Again analyze the “death not carried forward” dataset. Now we relax the proportional hazards assumption with regard to time and treatment. We use the Peterson and Harrell (1990) partial proportional odds model (PPO) which is akin to a Cox model with time-dependent covariates. In general in a longitudinal study one might expect time to not satisfy the PO assumption because the mix of events often changes over time. The departure from PO for time is assumed to be linear whereas the main effect of time uses a 5-knot restricted cubic spline function.
# Check against vgam when clustering is ignored
# Good agreement
if(FALSE) {
require(VGAM)
<- ordered(d$status)
st <- vgam(st ~ age + treatment, data=d,
f cumulative(reverse=TRUE, parallel=FALSE))
print(coef(f))
<- blrm(status ~ age + treatment, ~ age + treatment, data=d, refresh=100)
b
b
}
# Check variance of random effects against mixor
# Don't bother. Program would not finish even after 2 hours
# require(mixor)
# d$st <- ordered(d$status)
# f <- mixor(st ~ age + day + treatment, id=id, link='logit', data=d)
#f
# sqrt(diag(vcov(f)))
#
<- blrm(status ~ bstatus + rcs(age, 4) + rcs(day, 5) +
bncppo + cluster(id),
treatment ~ day + treatment, # PPO part of model
priorsdppo=0.5, seed=5,
data=d, loo=FALSE, iter=3000, refresh=100,
ppairs='bncppo-pairs.png', file='bncppo.rds')
stanDx(bncppo)
Iterations: 3000 on each of 4 chains, with 6000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
n_eff Rhat
y>=Vent/ARDS 480 1.005
y>=In Hospital/Facility 213 1.021
y>=Home 166 1.028
bstatus=On Vent 226 1.007
bstatus=In Hospital 256 1.019
age 149 1.041
age' 225 1.030
age'' 230 1.017
day 3887 1.000
day' 6230 1.000
day'' 7603 1.000
day''' 8151 1.000
treatment=Vitamin D 166 1.024
day:y>=In Hospital/Facility 367 1.010
treatment=Vitamin D:y>=In Hospital/Facility 8989 1.000
day:y>=Home 7463 1.000
treatment=Vitamin D:y>=Home 9828 1.000
sigmag 7349 1.000
stanDxplot(bncppo)
# Tx effects
# Default priors for ppo: -0.1338 0.4077 0.5113 prob 0.35 0.83 0.91
# sd=0.75 -0.0565 0.3334 0.4256 prob 0.44 0.84 0.91
# sd=0.5 -0.0109 0.2714 0.3550 prob 0.48 0.85 0.93
# sd=0.5 t prior -0.0267 0.2783 0.3599 prob 0.46 0.82 0.90 s=4.61
# sd=0.3 0.0933 0.1746 0.2459 prob 0.63 0.84 0.90
# Above results for t prior no better convergence than with exp
::include_graphics('bncppo-pairs.png') knitr
bncppo
Bayesian Partial Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.455 for Intercepts
blrm(formula = status ~ bstatus + rcs(age, 4) + rcs(day, 5) + treatment + cluster(id), ppo = ~day + treatment, data = d, priorsdppo = 0.5, iter = 3000, refresh = 100, loo = FALSE, ppairs = "bncppo-pairs.png", file = "bncppo.rds", seed = 5)
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 196 3343 11128 Home 19586Frequencies of Missing Values Due to Each Variable
status bstatus age day treatment cluster(id) 50 28 0 0 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 34253 | B 0.006 [0.006, 0.006] | g 3.901 [3.723, 4.125] | C 0.769 [0.767, 0.772] |
Draws 6000 | gp 0 [0, 0] | Dxy 0.538 [0.533, 0.543] | |
Chains 4 | EV 0.004 [0.001, 0.011] | ||
p 10 | v 11.986 [10.925, 13.398] | ||
Cluster on id
|
vp 0 [0, 0] | ||
Clusters 1351 | |||
σγ 4.2786 [4.0909, 4.4736] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
y≥Vent/ARDS | 6.3470 | 6.3430 | 1.1994 | 3.8172 | 8.6190 | 1.0000 | 0.96 |
y≥In Hospital/Facility | -2.1308 | -2.1279 | 1.1978 | -4.4000 | 0.4173 | 0.0370 | 0.96 |
y≥Home | -9.4880 | -9.4819 | 1.2005 | -11.8953 | -7.1136 | 0.0000 | 0.95 |
bstatus=On Vent | 1.6367 | 1.6440 | 0.4829 | 0.6848 | 2.5995 | 0.9997 | 0.94 |
bstatus=In Hospital | 6.0539 | 6.0544 | 0.4562 | 5.1634 | 6.9694 | 1.0000 | 0.97 |
age | -0.0099 | -0.0097 | 0.0307 | -0.0734 | 0.0488 | 0.3675 | 1.02 |
age’ | -0.0968 | -0.0981 | 0.0705 | -0.2268 | 0.0548 | 0.0868 | 1.00 |
age’’ | 0.4633 | 0.4725 | 0.3518 | -0.2844 | 1.1253 | 0.9033 | 0.99 |
day | 0.7006 | 0.7006 | 0.0169 | 0.6686 | 0.7342 | 1.0000 | 1.03 |
day’ | -1.2157 | -1.2152 | 0.1052 | -1.4126 | -1.0030 | 0.0000 | 1.03 |
day’’ | 1.9167 | 1.9182 | 0.3240 | 1.2849 | 2.5505 | 1.0000 | 0.99 |
day’’’ | -0.0939 | -0.0946 | 0.4313 | -0.9805 | 0.7162 | 0.4085 | 1.04 |
treatment=Vitamin D | 0.2586 | 0.2582 | 0.2553 | -0.2296 | 0.7553 | 0.8388 | 1.01 |
day:y≥In Hospital/Facility | -0.0002 | -0.0002 | 0.0003 | -0.0009 | 0.0004 | 0.2255 | 0.98 |
treatment=Vitamin D:y≥In Hospital/Facility | 0.0001 | 0.0002 | 0.0053 | -0.0103 | 0.0104 | 0.5102 | 1.02 |
day:y≥Home | 0.0004 | 0.0004 | 0.0003 | -0.0002 | 0.0011 | 0.8930 | 1.01 |
treatment=Vitamin D:y≥Home | 0.0002 | 0.0002 | 0.0054 | -0.0104 | 0.0109 | 0.5125 | 1.01 |
Plot posterior density and Compute posterior mean, median, and 0.95 highest posterior density interval for the treatment effect on each category of the response or better.
<- PostF(bncppo, name='orig')
P P(`treatment=Vitamin D` > 0)
[1] 0.8388333
plot(P(`treatment=Vitamin D`))
P(`treatment=Vitamin D` + `treatment=Vitamin D:y>=In Hospital/Facility` > 0)
[1] 0.8385
plot(P(`treatment=Vitamin D` + `treatment=Vitamin D:y>=In Hospital/Facility`))
P(`treatment=Vitamin D` + `treatment=Vitamin D:y>=Home` > 0)
[1] 0.839
plot(P(`treatment=Vitamin D` + `treatment=Vitamin D:y>=Home`))
Bayes Constrained Partial Proportional Odds Model
The unconstrained partial PO model above allows the treatment effect to be different for every level of the outcome. Here we use the constrained partial PO model to allow treatment to have a special effect on death but to have the same effect on all other endpoints in terms of odds computed on cumulative probabilities. This will allow us to compute the posterior probability that the treatment affects mortality differently than it affects the other outcomes. Unlike above, we do not at the point place skepticism on the differential effect for death. No interaction with time is included in the following model.
The constrained partial PO model’s constraint function cppo
never evaluates the departure from PO at the first level of the outcome variable. That is because the probability that Y is the lowest outcome is one minus the probability that it is at or above the second level. So to model a special effect of treatment on death we need to reverse the order of levels in Y.
<- transform(d,
drev status = factor(status, rev(levels(status))))
<- blrm(status ~ bstatus + rcs(age, 4) + rcs(day, 5) +
bdcppor + cluster(id),
treatment ~ day + treatment, # PPO part of model
cppo=function(y) y == 'Dead',
seed=17,
data=drev, loo=FALSE, iter=2000, refresh=100,
file='bdcppor.rds')
stanDx(bdcppor)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
n_eff Rhat
y>=In Hospital/Facility 97 1.062
y>=Vent/ARDS 150 1.044
y>=Dead 282 1.018
bstatus=On Vent 69 1.081
bstatus=In Hospital 103 1.019
age 127 1.024
age' 127 1.031
age'' 107 1.038
day 2140 1.000
day' 3439 0.999
day'' 4284 1.001
day''' 4012 1.000
treatment=Vitamin D 90 1.032
day x f(y) 234 1.009
treatment=Vitamin D x f(y) 2532 1.000
sigmag 2646 1.000
stanDxplot(bdcppor)
bdcppor
Bayesian Constrained Partial Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.455 for Intercepts
blrm(formula = status ~ bstatus + rcs(age, 4) + rcs(day, 5) + treatment + cluster(id), ppo = ~day + treatment, cppo = function(y) y == "Dead", data = drev, iter = 2000, refresh = 100, loo = FALSE, file = "bdcppor.rds", seed = 17)
Frequencies of Responses
Home In Hospital/Facility Vent/ARDS 19586 11128 3343 Dead 196Frequencies of Missing Values Due to Each Variable
status bstatus age day treatment cluster(id) 50 28 0 0 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 34253 | B 0.235 [0.226, 0.242] | g 4.223 [4.008, 4.456] | C 0.773 [0.77, 0.776] |
Draws 4000 | gp 0.454 [0.446, 0.46] | Dxy 0.547 [0.54, 0.551] | |
Chains 4 | EV 0.672 [0.634, 0.696] | ||
p 10 | v 14.068 [12.632, 15.621] | ||
Cluster on id
|
vp 0.168 [0.159, 0.174] | ||
Clusters 1351 | |||
σγ 4.6924 [4.4862, 4.9008] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
y≥In Hospital/Facility | 10.6406 | 10.6218 | 1.1681 | 8.3114 | 12.9286 | 1.0000 | 1.01 |
y≥Vent/ARDS | 2.8439 | 2.8270 | 1.1679 | 0.6342 | 5.2562 | 0.9942 | 0.99 |
y≥Dead | -8.4408 | -8.4403 | 1.1841 | -10.7170 | -6.0848 | 0.0000 | 0.97 |
bstatus=On Vent | -1.9087 | -1.9044 | 0.5141 | -2.8471 | -0.8882 | 0.0000 | 1.00 |
bstatus=In Hospital | -6.6774 | -6.6767 | 0.5129 | -7.6070 | -5.6711 | 0.0000 | 1.03 |
age | 0.0090 | 0.0091 | 0.0285 | -0.0467 | 0.0623 | 0.6205 | 1.01 |
age’ | 0.1081 | 0.1062 | 0.0682 | -0.0290 | 0.2389 | 0.9412 | 1.01 |
age’’ | -0.5216 | -0.5209 | 0.3709 | -1.2376 | 0.1963 | 0.0792 | 1.00 |
day | -0.7769 | -0.7769 | 0.0180 | -0.8105 | -0.7395 | 0.0000 | 0.95 |
day’ | 1.4323 | 1.4322 | 0.1080 | 1.2128 | 1.6395 | 1.0000 | 0.98 |
day’’ | -2.3983 | -2.4030 | 0.3309 | -3.0452 | -1.7413 | 0.0000 | 1.03 |
day’’’ | 0.4463 | 0.4604 | 0.4394 | -0.3979 | 1.3320 | 0.8490 | 0.99 |
treatment=Vitamin D | -0.3062 | -0.3047 | 0.2607 | -0.8068 | 0.1824 | 0.1232 | 1.00 |
day x f(y) | 0.0232 | 0.0232 | 0.0008 | 0.0217 | 0.0247 | 1.0000 | 0.94 |
treatment=Vitamin D x f(y) | 0.0323 | 0.0324 | 0.0176 | -0.0008 | 0.0677 | 0.9702 | 1.02 |
The probability that vitamin D affects death differently than how it affects the non-fatal endpoints is 0.97. Let’s estimate the treatment effect separately for nonfatal and fatal outcomes, in that order.
contrast(bdcppor, list(treatment='Vitamin D'),
list(treatment='Placebo'),
ycut='In Hospital/Facility')
bstatus age day Contrast S.E.
1 status=In Hospital/Facility In Hospital 58 14.5 -0.308613 0.2607144
Lower Upper Pr(Contrast>0)
1 status=In Hospital/Facility -0.8151987 0.1747911 0.1212
Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean
# Note: ycut='Vent/ARDS' would have given the same result as non-PO did
# not kick in until the last outcome level
contrast(bdcppor, list(treatment='Vitamin D'),
list(treatment='Placebo'),
ycut='Dead')
bstatus age day Contrast S.E. Lower Upper
1 status=Dead In Hospital 58 14.5 0.1194458 0.3425427 -0.558436 0.7623403
Pr(Contrast>0)
1 status=Dead 0.6368
Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean
The probability of benefit of vitamin D on nonfatal outcomes is 1 - 0.12 = 0.88. From the same model the probability of increased mortality is 0.64.
Bayes Binary Longitudinal Model for Death Alone
In what follows we dichotomize the ordinal outcome at death vs. alive and fit a Bayesian longitudinal binary model allowing for a time x treatment interaction, after first fitting a model with no interaction. Because of a lower effective sample size with a binary event, the model is simplified.
$dead <- 1 * (d$status == 'Dead')
d<- blrm(dead ~ bstatus + rcs(age, 3) + treatment +
bdeadnoia cluster(id),
seed=13,
data=d, loo=FALSE, iter=2000, refresh=100,
file='bdeadnoia.rds')
stanDx(bdeadnoia)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
n_eff Rhat
Intercept 888 1.005
bstatus=On Vent 2296 1.000
bstatus=In Hospital 2030 1.003
age 2758 1.001
age' 3543 1.000
treatment=Vitamin D 3467 1.000
sigmag 651 1.007
stanDxplot(bdeadnoia, 'ALL')
bdeadnoia
Bayesian Logistic Model
Dirichlet Priors With Concentration Parameter 0.541 for Intercepts
blrm(formula = dead ~ bstatus + rcs(age, 3) + treatment + cluster(id), data = d, iter = 2000, refresh = 100, loo = FALSE, file = "bdeadnoia.rds", seed = 13)Frequencies of Missing Values Due to Each Variable
dead bstatus age treatment cluster(id) 50 28 0 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 34253 | B 0.006 [0.006, 0.006] | g 1.077 [0.821, 1.363] | C 0.671 [0.663, 0.676] |
0 34057 | gp 0.002 [0.001, 0.003] | Dxy 0.342 [0.326, 0.351] | |
1 196 | EV 0.003 [0.001, 0.008] | ||
Draws 4000 | v 0.948 [0.529, 1.465] | ||
Chains 4 | vp 0 [0, 0] | ||
p 5 | |||
Cluster on id
|
|||
Clusters 1351 | |||
σγ 2.4389 [1.9401, 2.9362] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
Intercept | -7.9477 | -7.9004 | 1.0097 | -10.1054 | -6.0666 | 0.0000 | 0.87 |
bstatus=On Vent | -1.1942 | -1.1887 | 0.4094 | -1.9915 | -0.3906 | 0.0018 | 0.96 |
bstatus=In Hospital | -2.1443 | -2.1351 | 0.3984 | -2.9104 | -1.3792 | 0.0000 | 0.95 |
age | 0.0471 | 0.0467 | 0.0192 | 0.0099 | 0.0848 | 0.9945 | 1.10 |
age’ | -0.0051 | -0.0052 | 0.0175 | -0.0378 | 0.0300 | 0.3902 | 1.03 |
treatment=Vitamin D | 0.2355 | 0.2314 | 0.2393 | -0.2283 | 0.6927 | 0.8360 | 1.03 |
The posterior probability that vitamin D increases mortality is 0.836 with this binary logistic model.
Now with the time interaction included.
$dead <- 1 * (d$status == 'Dead')
d<- blrm(dead ~ bstatus + rcs(age, 3) + treatment * rcs(day, 3) +
bdead cluster(id),
seed=13,
data=d, loo=FALSE, iter=2000, refresh=100,
file='bdead.rds')
stanDx(bdead)
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size
and Rhat is the potential scale reduction factor on split chains
(at convergence, Rhat=1)
n_eff Rhat
Intercept 21 1.138
bstatus=On Vent 244 1.026
bstatus=In Hospital 71 1.065
age 204 1.034
age' 2239 1.000
treatment=Vitamin D 801 1.005
day 40 1.067
day' 326 1.021
treatment=Vitamin D * day 1272 1.002
treatment=Vitamin D * day' 1737 1.001
sigmag 16 1.155
stanDxplot(bdead, 'ALL')
bdead
Bayesian Logistic Model
Dirichlet Priors With Concentration Parameter 0.541 for Intercepts
blrm(formula = dead ~ bstatus + rcs(age, 3) + treatment * rcs(day, 3) + cluster(id), data = d, iter = 2000, refresh = 100, loo = FALSE, file = "bdead.rds", seed = 13)Frequencies of Missing Values Due to Each Variable
dead bstatus age treatment day cluster(id) 50 28 0 0 0 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 34253 | B 0.006 [0.006, 0.006] | g 0.978 [0.721, 1.135] | C 0.708 [0.691, 0.717] |
0 34057 | gp 0.004 [0.002, 0.006] | Dxy 0.415 [0.382, 0.434] | |
1 196 | EV 0.005 [0.002, 0.008] | ||
Draws 4000 | v 0.763 [0.419, 1.028] | ||
Chains 4 | vp 0 [0, 0] | ||
p 9 | |||
Cluster on id
|
|||
Clusters 1351 | |||
σγ 1.0546 [0.0035, 1.9368] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
Intercept | -5.7168 | -5.6153 | 0.9047 | -7.5299 | -4.0436 | 0.0000 | 0.75 |
bstatus=On Vent | -0.8715 | -0.8626 | 0.2965 | -1.4289 | -0.3046 | 0.0003 | 0.94 |
bstatus=In Hospital | -1.5011 | -1.4684 | 0.3483 | -2.2361 | -0.8965 | 0.0000 | 0.78 |
age | 0.0353 | 0.0346 | 0.0139 | 0.0099 | 0.0644 | 0.9960 | 1.15 |
age’ | -0.0052 | -0.0054 | 0.0121 | -0.0289 | 0.0184 | 0.3295 | 0.97 |
treatment=Vitamin D | -0.0892 | -0.0889 | 0.3484 | -0.7412 | 0.6013 | 0.4008 | 1.04 |
day | -0.0795 | -0.0818 | 0.0408 | -0.1624 | -0.0030 | 0.0345 | 1.11 |
day’ | 0.0206 | 0.0214 | 0.0551 | -0.0859 | 0.1308 | 0.6580 | 0.99 |
treatment=Vitamin D × day | 0.0272 | 0.0267 | 0.0458 | -0.0598 | 0.1167 | 0.7295 | 1.02 |
treatment=Vitamin D × day’ | -0.0059 | -0.0050 | 0.0693 | -0.1387 | 0.1305 | 0.4722 | 0.97 |
ggplot(Predict(bdead, day, treatment))
Compute treatment contrasts (log odds ratios) as a function of time.
<- contrast(bdead, list(treatment='Vitamin D', day=1:28),
k list(treatment='Placebo', day=1:28))
k
bstatus age day Contrast S.E. Lower Upper
1 In Hospital 58 1 -0.061995077 0.3118912 -0.63051747 0.5679513
2 In Hospital 58 2 -0.034768495 0.2781012 -0.55512603 0.5300278
3* In Hospital 58 3 -0.007541913 0.2481887 -0.49214380 0.4909035
4 In Hospital 58 4 0.019673499 0.2237445 -0.39217686 0.4911184
5* In Hospital 58 5 0.046821894 0.2066724 -0.33140155 0.4760852
6* In Hospital 58 6 0.073836253 0.1982867 -0.29800301 0.4649681
7* In Hospital 58 7 0.100649558 0.1985063 -0.28120825 0.4895067
8* In Hospital 58 8 0.127194793 0.2056764 -0.27276669 0.5295627
9* In Hospital 58 9 0.153404938 0.2172100 -0.27716379 0.5710971
10* In Hospital 58 10 0.179212976 0.2304379 -0.25741483 0.6293617
11* In Hospital 58 11 0.204551890 0.2430959 -0.24757546 0.6959855
12* In Hospital 58 12 0.229354662 0.2534465 -0.24426898 0.7420478
13* In Hospital 58 13 0.253554273 0.2602500 -0.24087243 0.7671111
14* In Hospital 58 14 0.277083706 0.2627400 -0.21469462 0.8072200
15* In Hospital 58 15 0.299897351 0.2608493 -0.18826472 0.8311799
16* In Hospital 58 16 0.322035234 0.2557943 -0.14843419 0.8565006
17* In Hospital 58 17 0.343558787 0.2495172 -0.13938872 0.8458842
18* In Hospital 58 18 0.364529443 0.2444885 -0.08721785 0.8703701
19* In Hospital 58 19 0.385008635 0.2435442 -0.07064278 0.8740898
20* In Hospital 58 20 0.405057796 0.2494226 -0.10760083 0.8651527
21* In Hospital 58 21 0.424738359 0.2640553 -0.08014772 0.9482028
22* In Hospital 58 22 0.444111757 0.2880380 -0.12805692 0.9852111
23* In Hospital 58 23 0.463239424 0.3206903 -0.16041240 1.0776938
24* In Hospital 58 24 0.482182791 0.3605736 -0.16637156 1.2367042
25* In Hospital 58 25 0.501003292 0.4060202 -0.22888248 1.3578278
26* In Hospital 58 26 0.519762361 0.4554347 -0.29543608 1.4792000
27* In Hospital 58 27 0.538511190 0.5074826 -0.38170386 1.5979949
28* In Hospital 58 28 0.557260020 0.5613802 -0.45143753 1.7464363
Pr(Contrast>0)
1 0.4185
2 0.4440
3* 0.4815
4 0.5298
5* 0.5862
6* 0.6480
7* 0.6900
8* 0.7255
9* 0.7615
10* 0.7805
11* 0.7972
12* 0.8140
13* 0.8390
14* 0.8578
15* 0.8835
16* 0.9015
17* 0.9200
18* 0.9365
19* 0.9448
20* 0.9500
21* 0.9510
22* 0.9412
23* 0.9275
24* 0.9085
25* 0.8948
26* 0.8718
27* 0.8548
28* 0.8393
Redundant contrasts are denoted by *
Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean
GEE Model for Death with Vitamin D Interaction
<- lrm(dead ~ bstatus + rcs(age, 3) + rcs(day, 3) + rcs(vitD, 3) * treatment,
f data=d, x=TRUE, y=TRUE)
<- robcov(f, d$id)
g anova(g)
Wald Statistics for dead
|
|||
χ2 | d.f. | P | |
---|---|---|---|
bstatus | 30.73 | 2 | <0.0001 |
age | 34.31 | 2 | <0.0001 |
Nonlinear | 0.85 | 1 | 0.3577 |
day | 49.58 | 2 | <0.0001 |
Nonlinear | 1.02 | 1 | 0.3125 |
vitD (Factor+Higher Order Factors) | 19.30 | 4 | 0.0007 |
All Interactions | 4.86 | 2 | 0.0882 |
Nonlinear (Factor+Higher Order Factors) | 2.17 | 2 | 0.3387 |
treatment (Factor+Higher Order Factors) | 5.79 | 3 | 0.1221 |
All Interactions | 4.86 | 2 | 0.0882 |
vitD × treatment (Factor+Higher Order Factors) | 4.86 | 2 | 0.0882 |
Nonlinear | 0.05 | 1 | 0.8157 |
Nonlinear Interaction : f(A,B) vs. AB | 0.05 | 1 | 0.8157 |
TOTAL NONLINEAR | 4.00 | 4 | 0.4054 |
TOTAL NONLINEAR + INTERACTION | 12.28 | 5 | 0.0311 |
TOTAL | 132.51 | 11 | <0.0001 |
There is mild evidence for an interaction. Look at the contrasts.
<- contrast(g, list(treatment='Vitamin D', vitD=0:65),
k list(treatment='Placebo', vitD=0:65))
<- as.data.frame(k[c('vitD', 'Contrast', 'Lower', 'Upper')])
k ggplot(k, aes(x=vitD, y=Contrast)) + geom_point() +
geom_errorbar(aes(ymin=Lower, ymax=Upper), width=0)
Here higher contrasts (higher relative log odds) is bad. Vitamin D treatment seems to move mortality in the wrong direction for patients with low baseline vitamin D, if anything.
Serial Conditional Markov Ordinal Model
Here we use a first order Markov process where the PO model on a given day is conditional on the ordinal outcome on the previous day as well as on the other variables. First we need to create a 1-day lagged outcome status. Unfortunately, since vent and ARDS were pooled for the follow-up status we need to pool them for the baseline status, resulting in some loss of information. Same for in hospital vs. being at another healthcare facility.
<- dnc
d <- d$bstatus
s levels(s) <- list('Vent/ARDS'='ARDS', 'Vent/ARDS'='On Vent', 'In Hospital/Facility'='In Hospital')
table(d$bstatus, s)
s
Vent/ARDS In Hospital/Facility
ARDS 2130 0
On Vent 9134 0
In Hospital 0 23011
# Now expand baseline levels to include 1 other levels occurring in follow-up
<- factor(s, setdiff(levels(d$status), 'Dead'))
s table(d$bstatus, s)
s
Vent/ARDS In Hospital/Facility Home
ARDS 2130 0 0
On Vent 9134 0 0
In Hospital 0 23011 0
$bstatus <- s
drequire(data.table)
setDT(d)
<- d[order(id, day), ]
d := shift(status), by=id]
d[, pstatus == 1, pstatus := bstatus]
d[day $pstatus <- factor(d$pstatus, setdiff(levels(d$pstatus), 'Dead'))
d# View(d[, c('id', 'day', 'bstatus', 'status', 'pstatus')])
Fit a fairly saturated model that allows
- transitions to depend on a flexible function of time
- the effect of the previous state to change with time
- the treatment effect to change with time
- a weird effect where the effect of treatment may differ according to the previous outcome state
<- datadist(d); options(datadist='dd')
dd <- lrm(status ~ rcs(day, 4) * (pstatus + treatment + rcs(age, 3) + lips + charlson + sofa) +
f * treatment,
pstatus data=d, maxit=30)
f
Logistic Regression Model
lrm(formula = status ~ rcs(day, 4) * (pstatus + treatment + rcs(age, 3) + lips + charlson + sofa) + pstatus * treatment, data = d, maxit = 30)
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 189 3169 10862 Home 18988Frequencies of Missing Values Due to Each Variable
status day pstatus treatment age lips charlson sofa 50 0 49 0 0 0 1045 0
Model Likelihood Ratio Test |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 33208 | LR χ2 49509.96 | R2 0.915 | C 0.982 |
max |∂log L/∂β| 1×10-7 | d.f. 37 | g 5.850 | Dxy 0.963 |
Pr(>χ2) <0.0001 | gr 347.347 | γ 0.965 | |
gp 0.174 | τa 0.537 | ||
Brier 0.014 |
β | S.E. | Wald Z | Pr(>|Z|) | |
---|---|---|---|---|
y≥Vent/ARDS | 4.9453 | 0.4358 | 11.35 | <0.0001 |
y≥In Hospital/Facility | -0.1497 | 0.4254 | -0.35 | 0.7249 |
y≥Home | -6.8416 | 0.4313 | -15.86 | <0.0001 |
day | -0.1961 | 0.0883 | -2.22 | 0.0264 |
day’ | 0.0921 | 0.3535 | 0.26 | 0.7945 |
day’’ | 0.5582 | 1.0853 | 0.51 | 0.6071 |
pstatus=In Hospital/Facility | 5.4912 | 0.1630 | 33.70 | <0.0001 |
pstatus=Home | 16.4244 | 2.2372 | 7.34 | <0.0001 |
treatment=Vitamin D | 0.2098 | 0.1405 | 1.49 | 0.1353 |
age | -0.0019 | 0.0093 | -0.20 | 0.8416 |
age’ | 0.0013 | 0.0093 | 0.14 | 0.8908 |
lips | -0.0846 | 0.0213 | -3.98 | <0.0001 |
charlson | -0.0777 | 0.0294 | -2.64 | 0.0083 |
sofa | -0.0982 | 0.0199 | -4.95 | <0.0001 |
day × pstatus=In Hospital/Facility | 0.1522 | 0.0286 | 5.31 | <0.0001 |
day’ × pstatus=In Hospital/Facility | -0.2329 | 0.1137 | -2.05 | 0.0406 |
day’’ × pstatus=In Hospital/Facility | 0.0035 | 0.3484 | 0.01 | 0.9920 |
day × pstatus=Home | -0.1601 | 0.2875 | -0.56 | 0.5777 |
day’ × pstatus=Home | 0.7500 | 0.6896 | 1.09 | 0.2768 |
day’’ × pstatus=Home | -2.5635 | 1.7128 | -1.50 | 0.1345 |
day × treatment=Vitamin D | 0.0012 | 0.0264 | 0.05 | 0.9626 |
day’ × treatment=Vitamin D | -0.0461 | 0.1040 | -0.44 | 0.6579 |
day’’ × treatment=Vitamin D | 0.1391 | 0.3179 | 0.44 | 0.6616 |
day × age | 0.0009 | 0.0019 | 0.47 | 0.6373 |
day’ × age | -0.0042 | 0.0076 | -0.55 | 0.5826 |
day’’ × age | 0.0118 | 0.0233 | 0.51 | 0.6127 |
day × age’ | -0.0021 | 0.0019 | -1.15 | 0.2505 |
day’ × age’ | 0.0100 | 0.0073 | 1.37 | 0.1706 |
day’’ × age’ | -0.0294 | 0.0220 | -1.33 | 0.1827 |
day × lips | 0.0111 | 0.0043 | 2.55 | 0.0109 |
day’ × lips | -0.0225 | 0.0171 | -1.32 | 0.1877 |
day’’ × lips | 0.0424 | 0.0519 | 0.82 | 0.4141 |
day × charlson | 0.0025 | 0.0058 | 0.42 | 0.6722 |
day’ × charlson | -0.0049 | 0.0230 | -0.22 | 0.8296 |
day’’ × charlson | 0.0177 | 0.0703 | 0.25 | 0.8015 |
day × sofa | 0.0092 | 0.0039 | 2.35 | 0.0188 |
day’ × sofa | -0.0164 | 0.0151 | -1.08 | 0.2786 |
day’’ × sofa | 0.0374 | 0.0458 | 0.82 | 0.4142 |
pstatus=In Hospital/Facility × treatment=Vitamin D | -0.1186 | 0.1077 | -1.10 | 0.2707 |
pstatus=Home × treatment=Vitamin D | -0.4551 | 0.3974 | -1.15 | 0.2521 |
anova(f)
Wald Statistics for status
|
|||
χ2 | d.f. | P | |
---|---|---|---|
day (Factor+Higher Order Factors) | 275.08 | 27 | <0.0001 |
All Interactions | 158.49 | 24 | <0.0001 |
Nonlinear (Factor+Higher Order Factors) | 208.08 | 18 | <0.0001 |
pstatus (Factor+Higher Order Factors) | 6515.10 | 10 | <0.0001 |
All Interactions | 119.70 | 8 | <0.0001 |
treatment (Factor+Higher Order Factors) | 6.99 | 6 | 0.3215 |
All Interactions | 4.69 | 5 | 0.4553 |
age (Factor+Higher Order Factors) | 9.89 | 8 | 0.2727 |
All Interactions | 3.54 | 6 | 0.7381 |
Nonlinear (Factor+Higher Order Factors) | 4.33 | 4 | 0.3638 |
lips (Factor+Higher Order Factors) | 20.47 | 4 | 0.0004 |
All Interactions | 11.66 | 3 | 0.0086 |
charlson (Factor+Higher Order Factors) | 28.81 | 4 | <0.0001 |
All Interactions | 1.43 | 3 | 0.6982 |
sofa (Factor+Higher Order Factors) | 41.28 | 4 | <0.0001 |
All Interactions | 16.47 | 3 | 0.0009 |
day × pstatus (Factor+Higher Order Factors) | 117.94 | 6 | <0.0001 |
Nonlinear | 115.35 | 4 | <0.0001 |
Nonlinear Interaction : f(A,B) vs. AB | 115.35 | 4 | <0.0001 |
day × treatment (Factor+Higher Order Factors) | 2.03 | 3 | 0.5654 |
Nonlinear | 0.20 | 2 | 0.9065 |
Nonlinear Interaction : f(A,B) vs. AB | 0.20 | 2 | 0.9065 |
day × age (Factor+Higher Order Factors) | 3.54 | 6 | 0.7381 |
Nonlinear | 3.53 | 5 | 0.6195 |
Nonlinear Interaction : f(A,B) vs. AB | 3.53 | 5 | 0.6195 |
f(A,B) vs. Af(B) + Bg(A) | 1.88 | 2 | 0.3901 |
Nonlinear Interaction in day vs. Af(B) | 3.28 | 4 | 0.5129 |
Nonlinear Interaction in age vs. Bg(A) | 2.10 | 3 | 0.5512 |
day × lips (Factor+Higher Order Factors) | 11.66 | 3 | 0.0086 |
Nonlinear | 8.19 | 2 | 0.0166 |
Nonlinear Interaction : f(A,B) vs. AB | 8.19 | 2 | 0.0166 |
day × charlson (Factor+Higher Order Factors) | 1.43 | 3 | 0.6982 |
Nonlinear | 0.09 | 2 | 0.9559 |
Nonlinear Interaction : f(A,B) vs. AB | 0.09 | 2 | 0.9559 |
day × sofa (Factor+Higher Order Factors) | 16.47 | 3 | 0.0009 |
Nonlinear | 2.91 | 2 | 0.2330 |
Nonlinear Interaction : f(A,B) vs. AB | 2.91 | 2 | 0.2330 |
pstatus × treatment (Factor+Higher Order Factors) | 2.14 | 2 | 0.3436 |
TOTAL NONLINEAR | 210.40 | 20 | <0.0001 |
TOTAL INTERACTION | 160.60 | 26 | <0.0001 |
TOTAL NONLINEAR + INTERACTION | 237.87 | 29 | <0.0001 |
TOTAL | 6776.51 | 37 | <0.0001 |
ggplot(Predict(f, day, pstatus))
There was little evidence for a treatment effect at any time, and little evidence that the effect of treatment on state transitions varied with time. There is also little evidence that the treatment effect differed for different previous states.
The graph above shows that being at home is almost an absorbing state, and that the effect of previous state varies with time. Some of that may be due to some odd data: there are 57 patients on ventilator on day 27 but not on day 28.
For now let’s simplify the model by having previous state as the only factor interacting with time, and removing the interactions with treatment. Compare the variances of parameter estimates with cluster sandwich robust estimates.
<- lrm(status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 3) + lips + charlson + sofa,
f data=d, maxit=30, x=TRUE, y=TRUE)
<- robcov(f, d$id)
g round(diag(vcov(f)) / diag(vcov(g)), 2)
y>=Vent/ARDS y>=In Hospital/Facility
0.86 0.92
y>=Home day
0.92 1.54
day' day''
1.83 1.82
pstatus=In Hospital/Facility pstatus=Home
0.95 0.70
treatment=Vitamin D age
1.07 0.95
age' lips
1.08 1.11
charlson sofa
1.10 1.10
day * pstatus=In Hospital/Facility day' * pstatus=In Hospital/Facility
1.42 1.55
day'' * pstatus=In Hospital/Facility day * pstatus=Home
1.58 0.70
day' * pstatus=Home day'' * pstatus=Home
0.75 0.80
The two variance estimates are similar, giving an indirect indication that the assumption of conditional independence (conditioning on previous state) is reasonable.
Repeat the model truncating the data on day 27 to see how much of the time x previous state interaction resulted from the drop in number of patients on ventilator on day 28.
<- lrm(status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 3) + lips + charlson + sofa,
fm data=d, maxit=30, subset=day < 28)
fm
Logistic Regression Model
lrm(formula = status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 3) + lips + charlson + sofa, data = d, subset = day < 28, maxit = 30)
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 188 3169 10643 Home 18088Frequencies of Missing Values Due to Each Variable
status day pstatus treatment age lips charlson sofa 48 0 47 0 0 0 1010 0
Model Likelihood Ratio Test |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 32088 | LR χ2 48589.20 | R2 0.918 | C 0.982 |
max |∂log L/∂β| 8×10-6 | d.f. 17 | g 5.940 | Dxy 0.964 |
Pr(>χ2) <0.0001 | gr 380.041 | γ 0.966 | |
gp 0.180 | τa 0.542 | ||
Brier 0.014 |
β | S.E. | Wald Z | Pr(>|Z|) | |
---|---|---|---|---|
y≥Vent/ARDS | 4.3659 | 0.2183 | 20.00 | <0.0001 |
y≥In Hospital/Facility | -0.7502 | 0.1975 | -3.80 | 0.0001 |
y≥Home | -7.6241 | 0.2151 | -35.44 | <0.0001 |
day | -0.0990 | 0.0238 | -4.15 | <0.0001 |
day’ | 0.1121 | 0.1026 | 1.09 | 0.2744 |
day’’ | -0.1678 | 0.2674 | -0.63 | 0.5304 |
pstatus=In Hospital/Facility | 5.5857 | 0.1560 | 35.80 | <0.0001 |
pstatus=Home | 16.6641 | 2.4397 | 6.83 | <0.0001 |
treatment=Vitamin D | 0.0693 | 0.0518 | 1.34 | 0.1804 |
age | 0.0005 | 0.0038 | 0.14 | 0.8857 |
age’ | -0.0052 | 0.0037 | -1.44 | 0.1511 |
lips | -0.0245 | 0.0086 | -2.85 | 0.0044 |
charlson | -0.0651 | 0.0116 | -5.61 | <0.0001 |
sofa | -0.0386 | 0.0077 | -5.02 | <0.0001 |
day × pstatus=In Hospital/Facility | 0.1924 | 0.0300 | 6.41 | <0.0001 |
day’ × pstatus=In Hospital/Facility | -0.6227 | 0.1272 | -4.90 | <0.0001 |
day’’ × pstatus=In Hospital/Facility | 1.3921 | 0.3311 | 4.21 | <0.0001 |
day × pstatus=Home | -0.1821 | 0.3272 | -0.56 | 0.5779 |
day’ × pstatus=Home | 0.5840 | 0.8252 | 0.71 | 0.4791 |
day’’ × pstatus=Home | -1.3104 | 1.7679 | -0.74 | 0.4586 |
anova(fm)
Wald Statistics for status
|
|||
χ2 | d.f. | P | |
---|---|---|---|
day (Factor+Higher Order Factors) | 183.36 | 9 | <0.0001 |
All Interactions | 47.89 | 6 | <0.0001 |
Nonlinear (Factor+Higher Order Factors) | 60.34 | 6 | <0.0001 |
pstatus (Factor+Higher Order Factors) | 6266.67 | 8 | <0.0001 |
All Interactions | 47.89 | 6 | <0.0001 |
treatment | 1.79 | 1 | 0.1804 |
age | 6.05 | 2 | 0.0486 |
Nonlinear | 2.06 | 1 | 0.1511 |
lips | 8.11 | 1 | 0.0044 |
charlson | 31.44 | 1 | <0.0001 |
sofa | 25.21 | 1 | <0.0001 |
day × pstatus (Factor+Higher Order Factors) | 47.89 | 6 | <0.0001 |
Nonlinear | 38.56 | 4 | <0.0001 |
Nonlinear Interaction : f(A,B) vs. AB | 38.56 | 4 | <0.0001 |
TOTAL NONLINEAR | 62.40 | 7 | <0.0001 |
TOTAL NONLINEAR + INTERACTION | 70.00 | 9 | <0.0001 |
TOTAL | 6448.81 | 17 | <0.0001 |
ggplot(Predict(fm, day, pstatus))
The interaction effect was cut in half but is still there. However, to put its strength into context, the time by previous status interaction explains only 0.008 of the total amount of outcome variation explained by the status + status x time interaction combination.
We see that being at home is less of an absorbing state for later times, and that the influence of previously being in Vent/ARDS on the current state lessens with time.
Computing Unconditional Estimates
One can uncondition the previous model to obtain unconditional treatment effects, although these will be covariate- and time-dependent. We can use simulation to approximate the result to any desired degree of accuracy. The fast exact approach will be presented later, but the simulation method allows one to estimate complex quantities that are functions of individual conditional distributions at a specific follow-up time as well as functions of the multivariate distribution. We specify covariate settings (here we use representative values each covariate, just for convenience, using a method described below) and vary the treatment assignment. Here are the steps involved:
- The initial state is always conditioned on, so is not marginalized out. Separate simulations can be done for each initial state. For now we just set the initial state to in hospital, the most prevalent state.
- Given this initial state, use the fitted model to get the conditional distribution of Y at (post-randomization) day 1 given the initial state, covariate settings, and treatment.
- Simulate outcomes for day 1 for 50,000 patients, for each treatment.
- For each of these patients, simulate their outcome for day 2 conditional on their simulated outcome for day 1.
- Likewise for days 3, …, 27.
- Death is an absorbing state so once death occurs all future status values are also set to death.
- For each patient compute the condition of interest from the vector of 27 Y values. Choices include the time until the patient was able to go home, the absolute risk of having an outcome \(Y \geq y\) at day 27, and the odds of the same.
- For a measure computed in the last step, for each treatment, compute a summary measure such as a mean or median.
- Take a difference or ratio of estimates computed in the previous step to summarize the treatment effect on the parameter of interest.
This approach does not factor in uncertainties in parameter estimates, so provides only a point estimate. A Bayesian approach would easily remedy this by computing the Bayesian posterior predictive distribution, i.e., by simulating each dataset for a different posterior sample of the model parameters. In the frequentist example that follows we ignore such uncertainties and compute the probability of being at home as a function of day and treatment, vitamin D:control odds ratio for being at home on days 1,2, …, 27, and the median time until being discharged home.
Note: The probability of death by a given day (these are cumulative probabilities since we treated death as an absorbing state) is lower for Vitamin D, in disagreement with the raw data. The estimates shown make the proportional odds assumption, which assumes consistency in treatment effects across cutoffs of outcome severity. As shown earlier, there is strong evidence against such consistency.
Start with computation of a set of representative covariate values that will be used when getting estimates related to transformed parameters. This is done by fitting an ordinal model for day 10 status alone, and finding covariate settings that give predicted values that are close to the unadjusted estimates for placebo. Of the qualifying settings, one that is subjectively closer to the center of the data distribution is chosen.
# Go back to the dataset that carried death forward, and analyze day 10
<- lrm(status ~ bstatus + treatment, data=dcf, subset=day==10) f10
<- predict(f10, data.frame(treatment='Placebo', bstatus='In Hospital'),
priskmarg type='fitted')['y>=Home']
priskmarg
y>=Home
0.6217469
# Compare with simple estimate not assuming effect of trt & bstatus are additive in log odds
<-
priskmargEmpirical with(subset(dcf, day == 10 & treatment == 'Placebo' & bstatus=='In Hospital'),
mean(status == 'Home', na.rm=TRUE))
# Now add covariates
<- lrm(status ~ bstatus + treatment + rcs(age, 3) + lips + charlson + sofa,
f10c data=dcf, subset=day==10)
<- subset(dcf, day==10, select=c(age, lips, charlson, sofa))
w <- cbind(w, treatment='Placebo', bstatus='In Hospital')
w <- predict(f10c, w, type='fitted')[, 'y>=Home']
prisk mean(prisk, na.rm=TRUE)
[1] 0.5769679
<- abs(prisk - priskmargEmpirical)
diff $diff <- diff
w<- ! is.na(prisk) & diff < 0.002
i sum(i)
[1] 14
setDT(w)
<- w[i, ]
wi <- wi[order(diff), ]
wi wi
age lips charlson sofa treatment bstatus diff
1: 55 7.0 3 5 Placebo In Hospital 6.198609e-05
2: 35 8.5 0 8 Placebo In Hospital 9.360643e-05
3: 31 4.0 6 4 Placebo In Hospital 1.416378e-04
4: 62 7.0 4 3 Placebo In Hospital 2.850394e-04
5: 68 5.5 2 5 Placebo In Hospital 3.229592e-04
6: 26 5.0 0 10 Placebo In Hospital 4.382999e-04
7: 66 4.0 5 3 Placebo In Hospital 4.643863e-04
8: 78 3.0 6 0 Placebo In Hospital 8.246585e-04
9: 29 8.0 2 6 Placebo In Hospital 8.843973e-04
10: 70 1.0 5 4 Placebo In Hospital 1.029117e-03
11: 70 8.0 4 1 Placebo In Hospital 1.511116e-03
12: 56 9.0 1 6 Placebo In Hospital 1.640537e-03
13: 41 6.0 4 5 Placebo In Hospital 1.847777e-03
14: 82 1.5 5 1 Placebo In Hospital 1.907995e-03
# Best match is close to the center
<- wi[1, 1:4]
repcov # For some reason when evaluating the result of Function(fit) with e.g. age=repcov$age
# the arithmetic was wrong. Need to remove data.frame/data.table and label attributes
<- lapply(repcov, function(x) {x <- unclass(x); attr(x, 'label') <- NULL; x})
repcov
# Compute median baseline covariates, using only the first record per patient
<- all.vars(~ age + lips + charlson + sofa)
co setDT(dcf)
<- dcf[day == 10, sapply(.SD, function(x) list(median = median(x, na.rm=TRUE))),
m =co]
.SDcols m
age.median lips.median charlson.median sofa.median
1: 58 5 3 5
Derive an R function that takes the comprehensive fitted model and computes the linear predictor except for the proportional odds model intercepts. Check the function with some examples.
<- Function(fm)
linpred # Get intercepts
<- coef(fm)[1:3]
alpha # Create a function that computes expit(linpred()) at representative covariate values
# for a given intercept
# This is the conditional probability that Y >= y given previous y = pstatus
<- function(y, pstatus, treatment, day)
pcond plogis(alpha[paste0('y>=', y)] + linpred(day=day, pstatus=pstatus, treatment=treatment,
age=repcov$age, lips=repcov$lips, charlson=repcov$charlson,
sofa=repcov$sofa))
# Test:
<- pcond('Home', 'In Hospital/Facility', treatment='Vitamin D', day=1:27)
pp round(pp, 3)
[1] 0.080 0.088 0.095 0.103 0.110 0.116 0.121 0.122 0.120 0.114 0.106 0.096
[13] 0.086 0.077 0.068 0.061 0.055 0.051 0.048 0.047 0.046 0.047 0.048 0.050
[25] 0.052 0.054 0.057
# Compare with proportions of same transitions ignoring covariates and treatment
<- numeric(27)
prop for(t in 1:27) {
<- with(d, day == t & pstatus == 'In Hospital/Facility')
i cat(sum(is.na(i)), sum(i, na.rm=TRUE), ' ')
<- mean(d$status[i] == 'Home', na.rm=TRUE)
prop[t] }
1 877 1 877 1 860 1 809 1 727 1 634 1 586 2 521 2 469 2 423 2 405 2 387 2 357 2 334 2 311 2 297 2 280 2 275 2 266 2 258 2 251 2 247 2 241 2 232 2 223 2 217 2 211
prn(list(prop, pp), fi='/tmp/z')
plot(prop, pp, xlab='Sample Proportion Hosp -> Home',
ylab='Covariate-Adjusted Model Estimate')
prn(0, fi='/tmp/z')
abline(a=0, b=1, col=gray(.85))
Simulate actual transitions for a large number of patients. Chained together this provides unconditional realizations of ordinal states.
# Sample 50,000 patients for each treatment and create outcomes for each day
<- 50000
N <- 'In Hospital/Facility'
initial.state <- unique(d$treatment)
tlev <- levels(d$status)
slev <- array(0L, dim=c(N, 27, 2), dimnames=list(NULL, paste('Day', 1 : 27), tlev))
Y if(alrsimdone) Y <- readRDS('alrsim.rds') else {
set.seed(1)
Sys.time()
for(i in 1 : N) { # takes about 28m
if(i %% 100 == 0) cat(i, file='/tmp/z')
for(trt in tlev) {
for(t in 1 : 27) {
<- if(t == 1) initial.state else slev[Y[i, t - 1, trt]]
ps if(ps == 'Dead') Y[i, t, trt] <- 1 # absorbing state
else {
# Get all 3 cumulative probabilities then 4 cell probabilities
<- pcond('Vent/ARDS', ps, treatment=trt, day=t)
p1 <- pcond('In Hospital/Facility', ps, treatment=trt, day=t)
p2 <- pcond('Home', ps, treatment=trt, day=t)
p3 <- c(1. - p1, p1 - p2, p2 - p3, p3)
cellprobs <- sample(1 : 4, 1, prob=cellprobs)
Y[i, t, trt]
}
}
}
}Sys.time()
saveRDS(Y, 'alrsim.rds')
}
Plot some summaries of the simulated data.
# Compute proportion at home as a function of time and treatment
<- apply(Y, 2:3, function(y) mean(y == 4))
x <- list(Placebo =list(Day=1:27, Proportion=x[, 'Placebo']),
phome 'Vitamin D'=list(Day=1:27, Proportion=x[, 'Vitamin D']))
labcurve(phome, pl=TRUE, ylab='Proportion at Home', col=c(1, 4))
apply(Y[, 27, ], 2, function(y) mean(y == 4))
Placebo Vitamin D
0.82284 0.84240
# Similarly for dead
<- apply(Y, 2:3, function(y) mean(y == 1))
x <- list(Placebo =list(Day=1:27, Proportion=x[, 'Placebo']),
pdead 'Vitamin D'=list(Day=1:27, Proportion=x[, 'Vitamin D']))
labcurve(pdead, pl=TRUE, ylab='Proportion Dead', col=c(1, 4))
apply(Y[, 27, ], 2, function(y) mean(y == 1))
Placebo Vitamin D
0.03748 0.03170
# For comparison compute proportion of patients who died
with(subset(d, day==1), tapply(ddeath < 29, treatment, mean))
Placebo Vitamin D
0.1304348 0.1591241
with(subset(dcf, day==27), tapply(status == 'Dead', treatment, mean, na.rm=TRUE))
Placebo Vitamin D
0.1308271 0.1576642
# Compute treatment odds ratios as a function of time
<- function(x) { p <- mean(x, na.rm=TRUE); p / (1. - p)}
odds
<- apply(Y[,, 'Vitamin D'] >= 2, 2 , odds) / apply(Y[,, 'Placebo'] >= 2, 2, odds)
or2 <- apply(Y[,, 'Vitamin D'] >= 3, 2 , odds) / apply(Y[,, 'Placebo'] >= 3, 2, odds)
or3 <- apply(Y[,, 'Vitamin D'] >= 4, 2 , odds) / apply(Y[,, 'Placebo'] >= 4, 2, odds)
or4
<- list('Y>=2 (alive)' = list(1:27, or2), 'Y>=3 (hospital/home)' = list(1:27, or3),
x 'Y=4 (home)' = list(1:27, or4))
labcurve(x, pl=TRUE, ylab='Odds Ratio Based on Cumulative Risk', xlab='Day', col=1:3)
abline(h=r <- exp(coef(fm)['treatment=Vitamin D']), col=gray(0.85))
text(27, r-0.02, 'Horizontal line is overall conditional\ntransition OR from unified PO model', adj=1)
# Compute median time to home, by treatment
<- function(y) if(all(y < 4)) Inf else min((1 : 27)[y == 4])
mth <- median(apply(Y[,, 'Placebo'], 1, mth))
meda <- median(apply(Y[,, 'Vitamin D'], 1, mth))
medb cat('Median days to home Placebo:', meda, ' Vitamin D:', medb, '\n')
Median days to home Placebo: 8 Vitamin D: 7
# You can almost get this from the cumulative "at home" curves presented above
# Not exactly equal since those curves don't stop at first at-home point
Compute Unconditional Probability Analytically
Here we compute \(\Pr(Y_s \geq y | X, Y_0)\) analytically for a single value of \((y, Y_0, X, s)\), using this recursive formula where \(k\) is the number of outcome states and \(t\) is time:
\[\Pr(Y_{t} = y | X) = \sum_{j=1}^{k} \Pr(Y_{t} = y | X, Y_{t-1} = j) \Pr(Y_{t-1} = j | X)\]
# Note: This is made obsolete by the new Hmisc function soprobMarkovOrd
# Function to compute all cell probabilities P(Y(t) = y | Y(t-1) = pstatus), y=1, 2, .., k
<- function(pstatus, treatment, day, ...) {
pconds <- length(alpha) + 1
k if(pstatus == 'Dead') return(c(1., rep(0., k - 1)))
<- linpred(day=day, pstatus=pstatus, treatment=treatment, ...)
xb <- plogis(alpha + xb)
cp c(1., cp) - c(cp, 0.)
}
# Compute unconditional probabilities
# Output is a matrix with s rows corresponding to t=1, 2, ..., s
# Each row has columns corresponding to the probabilities of being in each of the states
<- function(y0, s, treatment=treatment, ylevels=1 : k, ...) {
uprob <- length(alpha) + 1
k <- matrix(NA, nrow=s, ncol=k)
P colnames(P) <- as.character(ylevels)
1, ] <- pconds(y0, treatment, day=1, ...) # never uncondition on initial state
P[# Matrix of conditional probabilities of Y conditioning on previous time Y
# Columns = k conditional probabilities conditional on a single previous state
# Rows = all possible previous states
<- matrix(NA, nrow=k, ncol=k,
cp dimnames=list(paste('t-1', ylevels), paste('t', ylevels)))
for(t in 2 : s) {
for(j in 1 : k) cp[j, ] <- pconds(ylevels[j], treatment=treatment, day=t, ...)
# Compute unconditional probabilities of being in all possible states at current time t
<- t(cp) %*% P[t - 1, ]
P[t, ]
}
P
}
<- repcov
r <- Uhome <- list()
U for(trt in tlev) { # separately for Placebo, Vitamin D
<- uprob('In Hospital/Facility', 27, treatment=trt, ylevels=slev,
u age=r$age, lips=r$lips, charlson=r$charlson, sofa=r$sofa)
cat('\n', trt, '\n\n')
print(round(u, 3))
<- u
U[[trt]] <- list(day=1 : 27, p=u[, 4])
Uhome[[trt]] }
Placebo
Dead Vent/ARDS In Hospital/Facility Home
[1,] 0.000 0.012 0.912 0.075
[2,] 0.000 0.020 0.829 0.150
[3,] 0.001 0.025 0.750 0.224
[4,] 0.002 0.027 0.674 0.297
[5,] 0.003 0.028 0.602 0.366
[6,] 0.004 0.029 0.535 0.432
[7,] 0.005 0.028 0.474 0.492
[8,] 0.007 0.028 0.420 0.546
[9,] 0.008 0.027 0.373 0.592
[10,] 0.009 0.026 0.334 0.631
[11,] 0.011 0.025 0.301 0.663
[12,] 0.012 0.024 0.274 0.689
[13,] 0.014 0.024 0.252 0.710
[14,] 0.015 0.024 0.234 0.727
[15,] 0.017 0.024 0.219 0.741
[16,] 0.018 0.024 0.206 0.752
[17,] 0.020 0.024 0.194 0.761
[18,] 0.022 0.025 0.184 0.769
[19,] 0.023 0.025 0.175 0.777
[20,] 0.025 0.025 0.166 0.783
[21,] 0.027 0.025 0.158 0.789
[22,] 0.029 0.025 0.151 0.795
[23,] 0.031 0.025 0.144 0.800
[24,] 0.033 0.024 0.138 0.805
[25,] 0.034 0.024 0.131 0.811
[26,] 0.036 0.023 0.125 0.816
[27,] 0.038 0.022 0.120 0.821
Vitamin D
Dead Vent/ARDS In Hospital/Facility Home
[1,] 0.000 0.012 0.908 0.080
[2,] 0.000 0.019 0.821 0.160
[3,] 0.001 0.023 0.738 0.238
[4,] 0.002 0.025 0.659 0.314
[5,] 0.003 0.026 0.585 0.386
[6,] 0.004 0.026 0.516 0.454
[7,] 0.005 0.025 0.454 0.516
[8,] 0.006 0.025 0.399 0.571
[9,] 0.007 0.024 0.352 0.618
[10,] 0.008 0.023 0.313 0.657
[11,] 0.009 0.022 0.280 0.689
[12,] 0.010 0.021 0.254 0.714
[13,] 0.012 0.021 0.232 0.735
[14,] 0.013 0.021 0.215 0.752
[15,] 0.014 0.021 0.200 0.765
[16,] 0.015 0.021 0.188 0.776
[17,] 0.017 0.021 0.177 0.785
[18,] 0.018 0.021 0.167 0.793
[19,] 0.019 0.022 0.159 0.800
[20,] 0.021 0.022 0.151 0.806
[21,] 0.022 0.022 0.144 0.812
[22,] 0.024 0.022 0.137 0.818
[23,] 0.025 0.021 0.130 0.823
[24,] 0.027 0.021 0.124 0.828
[25,] 0.028 0.020 0.119 0.833
[26,] 0.030 0.019 0.113 0.838
[27,] 0.031 0.018 0.108 0.843
From the two matrices of unconditional probabilities one can see that the median time to home for placebo is 8 days and is 7 days for Vitamin D. Plot the unconditional probabilities of going home at day 1, 2, …, 27. Simulated probabilities from before are virtually identical to these estimates.
labcurve(Uhome, pl=TRUE, xlab='Day', ylab='Proportion at Home', col=c(1,4))
# labcurve(phome, add=TRUE) # simulated values are fully superimposed on exact values
Compare the 27d mortality estimates to the raw data using the same initial state patients.
== 'In Hospital' & ! is.na(bstatus) & day == 27,
dcf[bstatus mean(status == 'Dead', na.rm=TRUE), by=treatment]
treatment V1
1: Placebo 0.09133489
2: Vitamin D 0.12472160
# These proportions don't match. Let's use binary logistic regression to estimate the
# probability of death at the covariate settings used above
<- datadist(dcf); options(datadist='dd')
dd <- lrm(status == 'Dead' ~ treatment + rcs(age, 3) + lips +
f + sofa, data=subset(dcf, day == 27 & bstatus=='In Hospital'))
charlson <- repcov
r Predict(f, treatment, age=r$age, lips=r$lips,
charlson=r$charlson, sofa=r$sofa, fun=plogis)
treatment age lips charlson sofa yhat lower upper
1 Placebo 55 7 3 5 0.05645739 0.03495262 0.08995972
2 Vitamin D 55 7 3 5 0.07742578 0.05055463 0.11682228
Response variable (y):
Limits are 0.95 confidence limits
Use the exact calculations to get odds ratios and compare to the simulated odds ratios.
<- function(x) x / (1. - x)
odds <- odds(rowSums(U$'Vitamin D'[, 2:4])) / odds(rowSums(U$Placebo[, 2:4]))
or2 <- odds(rowSums(U$'Vitamin D'[, 3:4])) / odds(rowSums(U$Placebo[, 3:4]))
or3 <- odds(U$'Vitamin D'[, 4]) / odds(U$Placebo[, 4])
or4
<- list('Y>=2 (alive)' = list(1:27, or2), 'Y>=3 (hospital/home)' = list(1:27, or3),
x 'Y=4 (home)' = list(1:27, or4))
labcurve(x, pl=TRUE, ylab='Odds Ratio Based on Cumulative Risk', xlab='Day', col=1:3, ylim=c(1, 1.25))
abline(h=r <- exp(coef(fm)['treatment=Vitamin D']), col=gray(0.85))
text(27, r-0.02, 'Horizontal line is conditional\ntransition OR from unified PO model', adj=1)
The simulated ORs were unstable, especially for being alive, since the number of deaths is small.
Bayesian Markov Model
We parallel the regular proportional odds model with a Bayesian Markov proportional hazards model.
<- blrm(status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 3) +
bmarkov + charlson + sofa,
lips data=d, subset=day < 28, refresh=100, file='bmarkov.rds')
htmlVerbatim(stanDx(bmarkov))
Iterations: 2000 on each of 4 chains, with 4000 posterior distribution samples saved
For each parameter, n_eff is a crude measure of effective sample size and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1)
n_eff Rhat y>=Vent/ARDS 2760 1.001 y>=In Hospital/Facility 2475 1.000 y>=Home 2615 1.001 day 2967 1.000 day' 3604 1.001 day'' 3245 1.000 pstatus=In Hospital/Facility 1864 1.001 pstatus=Home 1942 1.001 treatment=Vitamin D 7792 1.000 age 7440 0.999 age' 6085 0.999 lips 5704 0.999 charlson 7246 0.999 sofa 5856 1.000 day * pstatus=In Hospital/Facility 1957 1.000 day' * pstatus=In Hospital/Facility 2434 1.001 day'' * pstatus=In Hospital/Facility 3064 1.000 day * pstatus=Home 1659 1.000 day' * pstatus=Home 1712 1.001 day'' * pstatus=Home 2285 1.000
stanDxplot(bmarkov)
bmarkov
Bayesian Proportional Odds Ordinal Logistic Model
Dirichlet Priors With Concentration Parameter 0.455 for Intercepts
blrm(formula = status ~ rcs(day, 4) * pstatus + treatment + rcs(age, 3) + lips + charlson + sofa, data = d, subset = day < 28, refresh = 100, file = "bmarkov.rds")
Frequencies of Responses
Dead Vent/ARDS In Hospital/Facility 188 3169 10643 Home 18088Frequencies of Missing Values Due to Each Variable
status day pstatus treatment age lips charlson sofa 48 0 47 0 0 0 1010 0
Mixed Calibration/ Discrimination Indexes |
Discrimination Indexes |
Rank Discrim. Indexes |
|
---|---|---|---|
Obs 32088 | B 0.006 [0.006, 0.006] | g 5.828 [5.651, 6.022] | C 0.983 [0.983, 0.983] |
Draws 4000 | gp 0.01 [0.009, 0.012] | Dxy 0.966 [0.965, 0.966] | |
Chains 4 | EV 0.058 [0.05, 0.067] | ||
p 17 | v 29.995 [28.242, 31.731] | ||
vp 0 [0, 0] |
Mean β | Median β | S.E. | Lower | Upper | Pr(β>0) | Symmetry | |
---|---|---|---|---|---|---|---|
y≥Vent/ARDS | 4.3537 | 4.3480 | 0.2124 | 3.9280 | 4.7682 | 1.0000 | 1.04 |
y≥In Hospital/Facility | -0.7574 | -0.7596 | 0.1927 | -1.1530 | -0.3985 | 0.0000 | 1.00 |
y≥Home | -7.5974 | -7.5987 | 0.2094 | -8.0231 | -7.1904 | 0.0000 | 1.01 |
day | -0.0984 | -0.0982 | 0.0238 | -0.1462 | -0.0530 | 0.0000 | 0.99 |
day’ | 0.1113 | 0.1089 | 0.1029 | -0.0975 | 0.3105 | 0.8672 | 1.01 |
day’’ | -0.1662 | -0.1589 | 0.2692 | -0.7085 | 0.3535 | 0.2590 | 1.02 |
pstatus=In Hospital/Facility | 5.5589 | 5.5593 | 0.1573 | 5.2600 | 5.8712 | 1.0000 | 1.00 |
pstatus=Home | 16.0457 | 15.8795 | 1.9105 | 12.5905 | 19.8092 | 1.0000 | 1.26 |
treatment=Vitamin D | 0.0684 | 0.0682 | 0.0519 | -0.0337 | 0.1731 | 0.9112 | 1.04 |
age | 0.0005 | 0.0005 | 0.0037 | -0.0068 | 0.0078 | 0.5612 | 0.98 |
age’ | -0.0052 | -0.0052 | 0.0035 | -0.0117 | 0.0022 | 0.0690 | 1.01 |
lips | -0.0241 | -0.0240 | 0.0085 | -0.0421 | -0.0087 | 0.0018 | 0.97 |
charlson | -0.0640 | -0.0640 | 0.0112 | -0.0866 | -0.0430 | 0.0000 | 1.01 |
sofa | -0.0379 | -0.0379 | 0.0077 | -0.0526 | -0.0218 | 0.0000 | 0.99 |
day × pstatus=In Hospital/Facility | 0.1909 | 0.1905 | 0.0297 | 0.1358 | 0.2514 | 1.0000 | 1.08 |
day’ × pstatus=In Hospital/Facility | -0.6157 | -0.6128 | 0.1263 | -0.8837 | -0.3799 | 0.0000 | 0.98 |
day’’ × pstatus=In Hospital/Facility | 1.3751 | 1.3711 | 0.3302 | 0.7608 | 2.0606 | 1.0000 | 1.00 |
day × pstatus=Home | -0.1314 | -0.1189 | 0.2581 | -0.6331 | 0.3693 | 0.3212 | 0.83 |
day’ × pstatus=Home | 0.4429 | 0.4277 | 0.6599 | -0.8189 | 1.7658 | 0.7465 | 1.07 |
day’’ × pstatus=Home | -0.9955 | -0.9761 | 1.4228 | -3.8657 | 1.7362 | 0.2462 | 0.95 |
Computing Environment
R version 4.0.4 (2021-02-15) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Pop!_OS 20.10 Matrix products: default BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0 LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] data.table_1.13.2 rstan_2.21.2 StanHeaders_2.21.0-6 [4] rmsb_0.0.2 rms_6.1-2 SparseM_1.78 [7] Hmisc_4.5-0 ggplot2_3.3.2 Formula_1.2-4 [10] survival_3.2-7 lattice_0.20-41To cite R in publication use:
R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.