--- title: Examples of Markov Ordinal Modeling on Two Datasets author: | | Frank Harrell | Department of Biostatistics | Vanderbilt University School of Medicine date: "`r Sys.Date()`" output: rmdformats::readthedown: thumbnails: no lightbox: no gallery: no highlight: tango use_bookdown: yes toc_depth: 3 fig_caption: yes code_folding: show embed_fonts: no keep_md: false pdf_document: latex_engine: lualatex toc: true toc_depth: 3 fig_caption: yes keep_tex: true extra_dependencies: ["float"] urlcolor: blue linkcolor: blue header-includes: \usepackage{longtable} link-citations: yes description: Examples of Markov Ordinal Modeling Using Two Datasets --- # Introduction This report provides analyses of two simulated datasets (available [here](../../Hmisc/markov/d1.csv) and [here](../../Hmisc/markov/d2.csv)) provided by Erica Brittain PhD, Mathematical Statistician and Deputy Branch Chief of the Biostatistics Research Branch of NIAID/NIH. In both datasets, the primary outcome of interest is recovery, death may occur before the patient has a chance to recover, and all patients who failed to recovered died, so there is no censoring in the usual sense. A first-order Markov partial proportional odds (PO) model is used to model time and treatment effects on a 3-level ordinal outcome assessed daily: Y=0 (recovered), 1 (not recovered but alive), 2 (dead). Y=0 and 2 are absorbing states. Of interest is a comparison of the treatment odds ratio from this discrete time state transition model with the hazard ratio from a time-to-recovery outcome using a Fine and Grey competing risk model. In such a model the cumulative incidence function is estimated in a special way but the Cox proportional hazards model hazard ratio is estimated using the original Cox partial likelihood with deaths counted as censored time to recovery at the last day of follow-up. The ordinal longitudinal model used here has a linear time effect, either a single treatment parameter (log transition odds ratio) or event-specific log odds ratios to relax the PO assumption for treatment, and allows for the time effect to not be in proportional odds. Note that since there are only 3 Y levels, the partial PO model used here is the same as the constrained partial PO model. Partial and constrained partial PO models were developed by [Peterson & Harrell 1990](https://www.jstor.org/stable/2347760?seq=1). Note that the ordinal Markov models below can easily be fitted using a Bayesian approach using the R `rmsb` package. It is also important to note that * most studies are not adequately powered for detecting a pure mortality effect * if no PO assumption is made for treatment, the estimated mortality effect will be disconnected from the clinical recovery effect without any sharing of information, so standard errors will be increased and power reduced * perhaps the most cohesive approach is to put a Bayesian prior probability on the dissimilarity of treatment effects on recovery and death so that some borrowing of information across these outcomes can take place, preventing the required sample size from not being greatly increased * there is some initial indication that when integrative measures are used to assess treatment effectiveness (e.g., mean time dead or not recovered, which for our situation is the mean time with Y>0), the loss of power is less than one would get by estimating separate outcome-specific odds ratios # R Setup ```{r setup} outfmt <- if(knitr::is_html_output ()) 'html' else 'pdf' markup <- if(knitr::is_latex_output()) 'latex' else 'html' require(rms) # automatically engages rms(Hmisc) require(data.table) # used to compute various summary measures require(VGAM) # partial PO model if(outfmt == 'html') require(plotly) fdev <- switch(outfmt, html='png', pdf='pdf') knitrSet(lang='markdown', w=7, h=7, dev=fdev, fig.path=paste0(fdev, '/d2-')) options(prType=markup) ## If producing html, ggplot2 graphics are converted to plotly graphics ## so that hovertext will show extra information ## Assumes aes(..., label=txt) used ggp <- if(outfmt != 'html') function(ggobject, ...) ggobject else Hmisc::ggplotlyr # When outputting to pdf, make LaTeX place figures right at the point # in which they are created; requires float.sty (see yaml header) if(outfmt == 'pdf') knitr::opts_chunk$set(fig.pos = 'H', out.extra='') # Function in Hmisc that will sense chunk has results='asis' and # print in Rmarkdown verbatim mode if needed pr <- markupSpecs$markdown$pr ``` # Dataset 1 In this dataset, both arms have exactly the same recovery times among those who recover, and the arms differ with respect to death times. The treatment arm has all early deaths, and the control arm has late deaths---but the same number of deaths. The simple logrank with Fine-Grey sees zero difference in these arms (pvalue is 1 and HR estimate is 1). `Event=’REC’` means the subject recovered, and `‘DTH’` means they died. ## Cox Model Analysis Run an ordinary Cox model analysis after constructing a variable that censors death at the last follow-up time. ```{r cox1,results='asis'} d <- csv.get('d1.csv') setDT(d) d[, ttr := ifelse(event == 'Rec', t, 30L)] d[, ec := ifelse(event == 'Rec', 1, 0)] cox <- function(fg=FALSE) { # Jon Fintzi of NIAID/NIH provided code to run a Fine-Grey model # This will differ from the ordinary Cox model if there are any censored # observations for reasons other than death. Set fg=TRUE if that pertains. if(fg) { require(survival) require(cmprsk2) } dd <- datadist(d); options(datadist=dd) f <- cph(Surv(ttr, ec) ~ treat, data=d) g <- if(fg) { dc <- d dc[, c('id', 't0', 't1', 'status') := list(1:.N, 0, t, factor(c(Rec=1, Dth=2)[event], 0:2, c('censor', 'recover', 'death')))] dfg <- finegray(Surv(time = t0, time2 = t1, status) ~ ., data = dc, id = id) coxph(Surv(fgstart, fgstop, fgstatus) ~ treat, weight=fgwt, data=dfg, ties="efron", id = id) } z <- cox.zph(f, 'identity') pr('Test of PH', z) plot(z) list(cox=f, fg=g, hr=exp(coef(if(fg) g else f))) } w <- cox() w$cox if(length(w$fg)) pr('Fine-Grey Cox model', w$fg) ``` ## Partial PO Markov Model A tall and thin dataset is constructed with daily Y=0,1,2. A second version of the tall and thin dataset carries recovery and death forward so that proportions of all daily state occupancies can easily be plotted. ```{r m1} revamp <- function() { d[, id := 1:.N] w <- d[, .(day = 1 : t, y = c(rep(1, t - 1), ifelse(event == 'Rec', 0L, 2L)), treat = rep(treat, t)), by=id] wcf <- d[, .(day = 1 : 30, y = c(rep(1L, t - 1), rep(ifelse(event == 'Rec', 0L, 2L), 31L - t)), treat = paste('Treatment', rep(treat, 30))), by=id] xl <- rep('', 30); at <- seq(5, 30, by=5) xl[at] <- format(at) setkey(wcf, treat, day) prop <- wcf[, .(p=mean(y > 0)), by=.(treat, day)] mt <- prop[, .(mtu=sum(p)), by=treat] mtu <- round(mt[, mtu], 2) mtulab <- paste0('Mean time unrecovered for treatments 0, 1: ', mtu[1], ', ', mtu[2]) print(propsPO(y ~ day + treat, data=wcf) + scale_x_discrete(breaks=1:30, labels=xl) + labs(caption='recovered: y=0\nalive, not recovered: y=1\ndead: y=2', title='State occupancy proportions', subtitle=mtulab)) w } w <- revamp() ``` Fit the Markov model, with and without assuming PO for treatment, and with and without including a treatment $\times$ time interaction. ```{r m1b} pvglm <- function(f) { se <- sqrt(diag(vcov(f))) print(cbind(beta=coef(f), se=se, z=coef(f) / se), digits=3) invisible() } # lrtest in VGAM did not use the right d.f. for partial PO models lrt <- function(f1, f2) { chisq <- 2 * (logLik(f2) - logLik(f1)) df <- length(coef(f2)) - length(coef(f1)) w <- data.frame('LR chi-square'=round(chisq, 2), 'd.f.'=df, p=round(1 - pchisq(chisq, df), 4)) print(w, row.names=FALSE) invisible() } # Compute and plot state occupancy probabilities for both treatment groups, and plot # Also compute mean time unrecovered, by treatment sop <- function(fit, desc) { revo <- function(z) { z <- as.factor(z) factor(z, levels=rev(levels(as.factor(z)))) } z <- NULL for(tx in 0 : 1) { s <- soprobMarkovOrdm(fit, data.frame(treat=tx, yprev=NA), times=1 : 30, ylevels=0:2, absorb=c(0, 2), tvarname='day') mtu <- sum(1. - s[, 1]) if(tx == 0) mtu0 <- mtu else mtu1 <- mtu u <- data.frame(day=as.vector(row(s)), y=as.vector(col(s) - 1), p=as.vector(s)) u$treat <- tx z <- rbind(z, u) } mtulab <- paste0('Mean time unrecovered for treatments 0, 1: ', round(mtu0, 2), ', ', round(mtu1, 2)) g <- ggplot(z, aes(x=day, y=p, fill=revo(y))) + facet_wrap(~ paste('Treatment', treat), nrow=1) + geom_col() + ylab('Probability') + guides(fill=guide_legend(title='y')) + labs(caption='recovered: y=0\nalive, not recovered: y=1\ndead: y=2', title=paste('State occupancy proportions', desc), subtitle=mtulab) print(g) } mmod <- function() { # Start with PO for treatment cat('\nAssuming PO for treatment, no treatment x time interaction\n\n') f <- vglm(y ~ day + treat, cumulative(reverse=TRUE, parallel = FALSE ~ day), data=w) pvglm(f) or <- exp(coef(f)['treat']) pr('Transition odds ratio', or) fpoa <- f sop(f, 'for model with PO for tx, not for time, no tx x time interaction') cat('\nAssuming PO for treatment, allowing treatment x time interaction\n\n') f <- vglm(y ~ day * treat, cumulative(reverse=TRUE, parallel = FALSE ~ day), data=w) pvglm(f) k <- coef(f) a <- k['treat'] b <- k['day:treat'] con <- a + b * c(5, 30) or <- exp(coef(f)['treat']) pr('Transition odds ratio at days 5 and 30', exp(con)) fpoia <- f sop(f, 'for model with PO for tx, not for time, with tx x time interaction') cat('\nNot assuming PO for treatment, no treatment x time interaction\n\n') f <- vglm(y ~ day + treat, cumulative(reverse=TRUE, parallel = FALSE ~ day + treat), data=w) pvglm(f) or <- exp(coef(f)[c('treat:1', 'treat:2')]) pr('Transition odds ratios for recovery and for death', or) fppoa <- f sop(f, 'for model with non-PO for tx and time, no tx x time interaction') pr('Test of PO for treatment without time x treatment', lrt(fpoa, fppoa)) cat('\nNot assuming PO for treatment, allowing treatment x time interaction\n\n') f <- vglm(y ~ day * treat, cumulative(reverse=TRUE, parallel = FALSE ~ day * treat), data=w) pvglm(f) k <- coef(f) a <- k[c('treat:1', 'treat:2')] b <- k[c('day:treat:1', 'day:treat:2')] for(i in 1 : 2) { con <- a[i] + b[i] * c(5, 30) pr(paste0('Transition odds ratio for Y>=', i, ' at days 5 and 30'), exp(con)) } fppoia <- f sop(f, 'for model with non-PO for tx and time, with tx x time interaction') pr('Test of PO for treatment with time x treatment', lrt(fpoia, fppoia)) } mmod() ``` We see a fairly strong non-PO for time (day) in both models. Consider the pair of models without treatment $\times$ time interaction, The transition odds ratio in the model that assumes a common effect of treatment on recovery and on death indicates a slight net benefit for treatment 1. When the PO assumption is not made for treatment, the model estimates that treatment 1 improves recovery but hastens death. The regression coefficients are equal but in opposite directions. The statistical evidence for inconsistency of treatment effects from the above likelihood ratio $\chi^2$ test comparing the two models is given by $P=0.06$. When time $\times$ treatment interaction is included, there is strong evidence for such interactions, and in the model that also allowed for a different effect on mortality, there is a huge time $\times$ treatment interaction. Though in the simulated dataset the time-related treatment effect for mortality is very discontinuous, allowing for a linear time interaction much better captures the sudden mortality increase at day 5 for treatment 1 than when the time interaction is excluded altogether. # Dataset 2 The second data set is more realistic. This set up assumes 30 day follow up. The logrank/Fine-Grey (i.e., all deaths are analyzed as if censored at 30 days) for the second data set has a p-value of .0225 and HR estimate of 1.441. Those numbers are not exactly reproduced below, because we have dropped three patients whose endpoint occurred at time 0. ## Cox Model Analysis First we run a Cox model analysis after constructing a variable that censors death at the last follow-up time. ```{r cox2,results='asis'} d <- csv.get('d2.csv') setDT(d) d <- d[t > 0, ] d[, ttr := ifelse(event == 'Rec', t, 30L)] d[, ec := ifelse(event == 'Rec', 1, 0)] w <- cox() w$cox if(length(w$fg)) pr('Fine-Grey Cox model', w$fg) hr <- w$hr ``` The reciprocal of the time to recovery HR is `r round(1/hr, 3)`, indicating that treatment 1 has shorter time to recovery than treatment 0. ## Partial PO Markov Model Fit the Markov model, with and without assuming PO for treatment. ```{r m2b} w <- revamp() mmod() ``` We see less strong non-PO for time (day) in these models. Consider the models excluding treatment $\times$ time interaction. The transition odds ratio in the model that assumes a common effect of treatment on recovery and on death indicates a strong net benefit for treatment 1. When the PO assumption is not made for treatment, the model estimates that treatment 1 improves recovery even more strongly, but it weakly hastens death ($P=0.68$). The statistical evidence for inconsistency of treatment effects from the above likelihood ratio $\chi^2$ test comparing the two models is given by $P=0.003$. There is some evidence for time $\times$ treatment interaction in the model that assumes PO for treatment. This is less pronounced in the more general model. # Bayesian Modeling Using Dataset 2 ## No Time $\times$ Treatment Interaction We repeat the last model fitted in the previous section that excluded time $\times$ treatment iteraction, but now using a Bayesian MCMC procedure. This will allow us to easily get uncertainties about any derived quantities of interest, including the mean number of days unrecovered, i.e., the mean number of days over a 30-day observation period in which a patient is sick or dead. Note that the `blrm` function parameterizes partial PO effects differently than `vglm`. `blrm` terms for departures from PO are increments in the log odds over and above the main effect for the variable in question. In this setting the departure of interest represents the extra treatment effect on death. ```{r m2bayes,results='asis'} stanSet() # use maximum number of CPU cores require(rmsb) set.seed(1) dd <- datadist(w); options(datadist='dd') # The following took 30s bmark2 <- blrm(y ~ day + treat, ~ day + treat, cppo = function(y) y, data=w, file='bmark2.rds') bmark2 ``` There is strong evidence for a different treatment effect on death than for recovery (posterior probability of this effect is 0.9982 under flat priors). There is little evidence for non-PO with respect to time (posterior probability 0.48). Compute the two odds ratios for treatment. The first, for recovery, using a cut of $y \geq 1$ and the second, for mortality, uses a cut of $y \geq 2$ which here is the same as $y=2$. Here we are only looking at posterior mean parameter values. ```{r m2bayes2,results='asis'} summary(bmark2, ycut=1) summary(bmark2, ycut=2) ``` The posterior mean odds ratios for the two events are 0.37 and 1.29, very close to the maximum likelihood estimates in the earlier frequentist analysis. Note the great uncertainty in the treatment effect on mortality (0.95 highest posterior density interval for the OR is 0.63 to 2.89). ## With Time $\times$ Treatment Interaction Now the model is generalized to allow for time $\times$ treatment interaction. ```{r m2bayesia,results='asis'} set.seed(2) # The following took 33s bmark2ia <- blrm(y ~ day * treat, ~ day * treat, cppo = function(y) y, data=w, file='bmark2ia.rds') bmark2ia ``` Since the degree of non-PO for treatment was allowed to vary with time, there is no single posterior probability for the non-PO treatment effect. Let's compute time-specific posterior probabilities of inconsistent recovery and death effects. This is done by computing the proportion of 4000 posterior draws for which the day-specific treatment effect on (non-)recovery (y=1) is less than the effect on death (y=2). ```{r m2bayesia2} k <- contrast(bmark2ia, list(treat=1, day=1:30), list(treat=0, day=1:30), ycut=1)$cdraws - contrast(bmark2ia, list(treat=1, day=1:30), list(treat=0, day=1:30), ycut=2)$cdraws p <- apply(k, 2, function(x) mean(x < 0)) ggplot(data.frame(day=1:30, p=p), aes(x=day, y=p)) + geom_line() + xlab('Day') + ylab('P(treatment effect greater for death than for recovery)') ``` ## Analysis of Mean Time Unrecovered Accounting for Uncertainties A significant advantage of an integrated summary such as mean time unrecovered (time with Y > 0) is that such measures may be interpreted easily no matter how complex the model, and one does not need to choose a time horizon, assuming that all follow-up time intervals are equally important. So if a model contains differential (non-PO) treatment effects for different events and/or contains time $\times$ treatment interactions as our model does, one can still compute the integrated summary. A choice of follow-up time does not need to be made, but one must specify the states whose occupancy probabilities are being summed. Under non-PO for treatment, patterns of mean time unrecovered may be dissimilar to patterms for mean time alive. Compute for each treatment the posterior distribution of the mean time unrecovered. The mean time unrecovered is the sum over the 30 days of P(Y > 0) = 1 - P(Y = 0). Compute this sum for each of the 4000 posterior samples to reflect uncertainties in occupancy probabilities. Note that as long as time intervals are regular, mean time in state has no problem with discrete time, whereas the median because of poor handling of ties requires measurement resolution in hours in order to reliably estimate median in days. ```{r m2bayes3} mtu <- matrix(NA, nrow=4000, ncol=3, dimnames=list(NULL, c('Treatment 0', 'Treatment 1', 'Delta'))) for(tx in 0:1) { # s is a 4000 (posterior draws) x 30 (days) x 3 (y levels) array s <- soprobMarkovOrdm(bmark2ia, data.frame(treat=tx, yprev=0), times=1:30, ylevels=0:2, absorb=c(0,2), tvarname='day') s <- 1. - s[, , 1] # P(Y > 0) 4000 x 30 mtu[, tx + 1] <- rowSums(s) } mtu[, 3] <- mtu[, 2] - mtu[, 1] g <- function(x) c(Mean=mean(x), HPDint(x)) apply(mtu, 2, g) delta <- mtu[, 3] # Compute P(treatment 1 lowers number of unrecovered days) mean(delta < 0) # Compute simulation error in this calculation binconf(sum(delta < 0), length(delta)) # Wilson 0.95 confidence interval # Prob at least two days lower mean(delta < -2) ``` The posterior mean of the difference in expected number of days unrecovered is -4.5 in favor of treatment 1, with 0.95 highest posterior density uncertainty interval of [-7.1, -1.6]. The posterior distribution of the differences is below. The posterior probability that treatment 1 lowers time unrecovered by some amount is 0.99975 with an MCMC simulation error of 0.001, given flat priors. The probability that it lowers time unrecovered by at least 2d is 0.96. ```{r m2bayes4} ggplot(data.frame(x=delta), aes(x)) + geom_density() + xlab('Treatment 1 - Treatment 0 Mean # Days Unrecovered') + ylab('') ``` # Computing Environment `r markupSpecs$html$session()`