Some good general references on longitudinal data analysis are Davis (2002), Pinheiro & Bates (2000), Diggle et al. (2002), Venables & Ripley (2003), Hand & Crowder (1996), Verbeke & Molenberghs (2000), and Lindsey (1997).

7.1 Notation

\(N\) subjects

Subject \(i\) (\(i=1,2,\ldots,N\)) has \(n_{i}\) responses measured at times \(t_{i1}, t_{i2}, \ldots, t_{in_{i}}\)

Response at time \(t\) for subject \(i\): \(Y_{it}\)

Subject \(i\) has baseline covariates \(X_{i}\)

Generally the response measured at time \(t_{i1}=0\) is a covariate in \(X_{i}\) instead of being the first measured response \(Y_{i0}\)

Time trend in response is modeled with \(k\) parameters so that the time “main effect” has \(k\) d.f.

Let the basis functions modeling the time effect be \(g_{1}(t), g_{2}(t), \ldots, g_{k}(t)\)


7.2 Model Specification for Effects on \(E(Y)\)

7.2.1 Common Basis Functions

\(k\) dummy variables for \(k+1\) unique times (assumes no functional form for time but may spend many d.f.)

\(k=1\) for linear time trend, \(g_{1}(t)=t\)

\(k\)th-order polynomial in \(t\)

\(k+1\)–knot restricted cubic spline (one linear term, \(k-1\) nonlinear terms)
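The restricted cubic spline basis can be sketched directly. Below is a minimal numpy version of the standard truncated-power construction (one linear term plus \(k-1\) nonlinear terms for \(k+1\) knots); the knot values are illustrative, not taken from the text.

```python
import numpy as np

# Sketch of a restricted cubic spline basis (Harrell's normalization):
# columns are t, X_1(t), ..., X_{m-2}(t) for m knots; each nonlinear
# column is linear beyond the outer knots.
def rcs_basis(t, knots):
    t = np.asarray(t, dtype=float)
    kn = np.asarray(knots, dtype=float)
    norm = (kn[-1] - kn[0]) ** 2          # normalization constant
    pos3 = lambda u: np.maximum(u, 0.0) ** 3
    cols = [t]                            # the linear term
    for j in range(len(kn) - 2):
        x = (pos3(t - kn[j])
             - pos3(t - kn[-2]) * (kn[-1] - kn[j]) / (kn[-1] - kn[-2])
             + pos3(t - kn[-1]) * (kn[-2] - kn[j]) / (kn[-1] - kn[-2]))
        cols.append(x / norm)
    return np.column_stack(cols)

knots = [2, 4, 8, 12, 16]                 # 5 knots -> 4 d.f. for time
X = rcs_basis(np.linspace(0, 30, 61), knots)
print(X.shape)                            # (61, 4)
```

The restriction to linearity in the tails is what keeps the tail behavior of the fitted time trend stable.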


7.2.2 Model for Mean Profile

A model for mean time-response profile without interactions between time and any \(X\): \(E[Y_{it} | X_{i}] = X_{i}\beta + \gamma_{1}g_{1}(t) + \gamma_{2}g_{2}(t) + \ldots + \gamma_{k}g_{k}(t)\)

Model with interactions between time and some \(X\)’s: add product terms for desired interaction effects

Example: to allow the mean time trend for subjects in group 1 (the reference group) to be arbitrarily different from the time trend for subjects in group 2, include a dummy variable for group 2, a time “main effect” curve with \(k\) d.f., and all \(k\) products of these time components with the group 2 dummy variable
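The design matrix for this example can be sketched with hypothetical data; here \(k=2\) with a polynomial time basis purely for illustration (the text uses general basis functions \(g_{1},\ldots,g_{k}\)).

```python
import numpy as np

# Hypothetical design matrix: a group-2 dummy, k time basis columns,
# and all k products, so group 2's time trend may differ arbitrarily
# from group 1's (the reference group).
t  = np.array([2., 4., 8., 12., 16., 2., 4., 8., 12., 16.])
g2 = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)  # group-2 dummy

G = np.column_stack([t, t**2])            # k = 2 time basis functions

X = np.column_stack([g2, G, G * g2[:, None]])  # dummy, main effects, products
print(X.shape)                            # (10, 5): 1 + k + k columns
```

A test of the \(k\) interaction coefficients being zero is then a test that the two groups share one time-response curve.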

Time should be modeled using indicator variables only when time is really discrete, e.g., when time is in weeks and subjects were followed at exactly the intended weeks. In general time should be modeled continuously (and nonlinearly if there are more than 2 followup times) using actual visit dates instead of intended dates (Donohue et al., n.d.).


7.2.3 Model Specification for Treatment Comparisons

In studies comparing two or more treatments, a response is often measured at baseline (pre-randomization)

Analyst has the option to use this measurement as \(Y_{i0}\) or as part of \(X_{i}\)


Comments from Jim Rochon

For RCTs, I draw a sharp line at the point when the intervention begins. The LHS [left hand side of the model equation] is reserved for something that is a response to treatment. Anything before this point can potentially be included as a covariate in the regression model. This includes the “baseline” value of the outcome variable. Indeed, the best predictor of the outcome at the end of the study is typically where the patient began at the beginning. It drinks up a lot of variability in the outcome; and, the effect of other covariates is typically mediated through this variable.

I treat anything after the intervention begins as an outcome. In the western scientific method, an “effect” must follow the “cause” even if by a split second.

Note that an RCT is different than a cohort study. In a cohort study, “Time 0” is not terribly meaningful. If we want to model, say, the trend over time, it would be legitimate, in my view, to include the “baseline” value on the LHS of that regression model.

Now, even if the intervention, e.g., surgery, has an immediate effect, I would still reserve the LHS for anything that might legitimately be considered as the response to the intervention. So, if we cleared a blocked artery and then measured the MABP, then that would still be included on the LHS.

Now, it could well be that most of the therapeutic effect occurred by the time that the first repeated measure was taken, and then levels off. Then, a plot of the means would essentially be two parallel lines and the treatment effect is the distance between the lines, i.e., the difference in the intercepts.

If the linear trend from baseline to Time 1 continues beyond Time 1, then the lines will have a common intercept but the slopes will diverge. Then, the treatment effect will be the difference in slopes.

One point to remember is that the estimated intercept is the value at time 0 that we predict from the set of repeated measures post randomization. In the first case above, the model will predict different intercepts even though randomization would suggest that they would start from the same place. This is because we were asleep at the switch and didn’t record the “action” from baseline to time 1. In the second case, the model will predict the same intercept values because the linear trend from baseline to time 1 was continued thereafter.

More importantly, there are considerable benefits to including it as a covariate on the RHS. The baseline value tends to be the best predictor of the outcome post-randomization, and this maneuver increases the precision of the estimated treatment effect. Additionally, any other prognostic factors correlated with the outcome variable will also be correlated with the baseline value of that outcome, and this has two important consequences. First, this greatly reduces the need to enter a large number of prognostic factors as covariates in the linear models. Their effect is already mediated through the baseline value of the outcome variable. Secondly, any imbalances across the treatment arms in important prognostic factors will induce an imbalance across the treatment arms in the baseline value of the outcome. Including the baseline value thereby reduces the need to enter these variables as covariates in the linear models.

Senn (2006) states that temporally and logically, a “baseline cannot be a response to treatment”, so baseline and response cannot be modeled in an integrated framework.

… one should focus clearly on ‘outcomes’ as being the only values that can be influenced by treatment and examine critically any schemes that assume that these are linked in some rigid and deterministic view to ‘baseline’ values. An alternative tradition sees a baseline as being merely one of a number of measurements capable of improving predictions of outcomes and models it in this way.

The final reason that baseline cannot be modeled as the response at time zero is that many studies have inclusion/exclusion criteria that include cutoffs on the baseline variable. In other words, the baseline measurement comes from a truncated distribution. In general it is not appropriate to model the baseline with the same distributional shape as the follow-up measurements. Thus the approaches recommended by Liang & Zeger (2000) and Liu et al. (2009) are problematic^{1}.


^{1} In addition to this, one of the paper’s conclusions that analysis of covariance is not appropriate if the population means of the baseline variable are not identical in the treatment groups is not correct (Senn, 2006). See Kenward et al. (2010) for a rebuke of Liu et al. (2009).

7.3 Modeling Within-Subject Dependence

Random effects and mixed effects models have become very popular

Disadvantages:

Induced correlation structure for \(Y\) may be unrealistic

Numerically demanding

Require complex approximations for distributions of test statistics

Conditional random effects vs. (subject-) marginal models:

Random effects are subject-conditional

Random effects models are needed to estimate responses for individual subjects

Models without random effects are marginalized with respect to subject-specific effects

They are natural when the interest is on group-level (i.e., covariate-specific but not patient-specific) parameters (e.g., overall treatment effect)

Random effects are natural when there is clustering at more than the subject level (multi-level models)

Extended linear model (marginal; with no random effects) is a logical extension of the univariate model (e.g., few statisticians use subject random effects for univariate \(Y\))

This was known as growth curve models and generalized least squares (Goldstein, 1989; Potthoff & Roy, 1964) and was developed long before mixed effect models became popular

Pinheiro and Bates (Section~5.1.2) state that “in some applications, one may wish to avoid incorporating random effects in the model to account for dependence among observations, choosing to use the within-group component \(\Lambda_{i}\) to directly model variance-covariance structure of the response.”

We will assume that \(Y_{it} | X_{i}\) has a multivariate normal distribution with mean given above and with variance-covariance matrix \(V_{i}\), an \(n_{i}\times n_{i}\) matrix that is a function of \(t_{i1}, \ldots, t_{in_{i}}\)

We further assume that the diagonals of \(V_{i}\) are all equal

Procedure can be generalized to allow for heteroscedasticity over time or with respect to \(X\) (e.g., males may be allowed to have a different variance than females)

This extended linear model has the following assumptions:

all the assumptions of OLS at a single time point including correct modeling of predictor effects and univariate normality of responses conditional on \(X\)

the distribution of two responses at two different times for the same subject, conditional on \(X\), is bivariate normal with a specified correlation coefficient

the joint distribution of all \(n_{i}\) responses for the \(i^{th}\) subject is multivariate normal with the given correlation pattern (which implies the previous two distributional assumptions)

responses from any times for any two different subjects are uncorrelated


What Methods To Use for Repeated Measurements / Serial Data? ^{2}^{3}

[The comparison table did not survive extraction. It graded candidate methods (e.g., repeated measures ANOVA, GEE, mixed effects models, GLS, LOCF, and summary statistics^{4}) on criteria such as: does not extend to complex settings such as time-dependent covariates and dynamic^{16} models.]

^{2} Thanks to Charles Berry, Brian Cade, Peter Flom, Bert Gunter, and Leena Choi for valuable input.

^{3} GEE: generalized estimating equations; GLS: generalized least squares; LOCF: last observation carried forward.

^{4} E.g., compute within-subject slope, mean, or area under the curve over time. Assumes that the summary measure is an adequate summary of the time profile and assesses the relevant treatment effect.

^{5} Unless one uses the Huynh-Feldt or Greenhouse-Geisser correction

^{6} For full efficiency, if using the working independence model

^{7} Or requires the user to specify one

^{8} For full efficiency of regression coefficient estimates

^{9} Unless the last observation is missing

^{10} The cluster sandwich variance estimator used to estimate SEs in GEE does not perform well in this situation, and neither does the working independence model because it does not weight subjects properly.

^{11} Unless one knows how to properly do a weighted analysis

^{12} Or uses population averages

^{13} Unlike GLS, does not use standard maximum likelihood methods yielding simple likelihood ratio \(\chi^2\) statistics. Requires high-dimensional integration to marginalize random effects, using complex approximations, and if using SAS, unintuitive d.f. for the various tests.

^{14} Because there is no correct formula for SE of effects; ordinary SEs are not penalized for imputation and are too small

^{15} If correction not applied

^{16} E.g., a model with a predictor that is a lagged value of the response variable

Markov models use ordinary univariate software and are very flexible

They apply the same way to binary, ordinal, nominal, and continuous Y

They require post-fitting calculations to get probabilities, means, and quantiles that are not conditional on the previous Y value


Gardiner et al. (2009) compared several longitudinal data models, especially with regard to assumptions and how regression coefficients are estimated. Peters et al. (2012) have an empirical study confirming that the “use all available data” approach of likelihood–based longitudinal models makes imputation of follow-up measurements unnecessary.


7.4 Parameter Estimation Procedure

Generalized least squares

Like weighted least squares but uses a covariance matrix that is not diagonal

Each subject can have her own shape of \(V_{i}\) due to each subject being measured at a different set of times

Maximum likelihood

Newton-Raphson or other trial-and-error methods used for estimating parameters

When imbalances are not severe, OLS fitted ignoring subject identifiers may be efficient

But OLS standard errors will be too small as they don’t take intra-cluster correlation into account

May be rectified by substituting covariance matrix estimated from Huber-White cluster sandwich estimator or from cluster bootstrap

When imbalances are severe and intra-subject correlations are strong, OLS is not expected to be efficient because it gives equal weight to each observation

a subject contributing two distant observations receives \(\frac{1}{5}\) the weight of a subject having 10 tightly-spaced observations
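The cluster sandwich or cluster bootstrap fix mentioned above resamples subjects, not observations, so each subject's serial correlation travels with it. A sketch of the cluster bootstrap on simulated data (all names and numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n_subj, n_obs = 40, 5
subj = np.repeat(np.arange(n_subj), n_obs)
t = np.tile(np.arange(n_obs, dtype=float), n_subj)
b = rng.normal(0.0, 1.0, n_subj)       # subject effects -> intra-cluster corr.
y = 1.0 + 0.5 * t + b[subj] + rng.normal(0.0, 1.0, n_subj * n_obs)
X = np.column_stack([np.ones_like(t), t])

def ols_slope(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Cluster bootstrap: resample whole subjects with replacement, keeping
# each subject's observations together, then recompute the OLS slope
slopes = []
for _ in range(200):
    ids = rng.integers(0, n_subj, n_subj)
    rows = np.concatenate([np.flatnonzero(subj == i) for i in ids])
    slopes.append(ols_slope(X[rows], y[rows]))
print(np.std(slopes))   # cluster-bootstrap SE of the slope
```

The SD of the resampled slopes estimates the standard error while respecting the intra-subject correlation that naive OLS SEs ignore.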


7.5 Common Correlation Structures

Usually restrict ourselves to isotropic correlation structures — correlation between responses within subject at two times depends only on a measure of distance between the two times, not the individual times

We simplify further and assume that \(h\) depends only on \(|t_{1} - t_{2}|\)

Can speak interchangeably of correlations of residuals within subjects or correlations between responses measured at different times on the same subject, conditional on covariates \(X\)

Assume that the correlation coefficient for \(Y_{it_{1}}\) vs. \(Y_{it_{2}}\) conditional on baseline covariates \(X_{i}\) for subject \(i\) is \(h(|t_{1} - t_{2}|, \rho)\), where \(\rho\) is a vector (usually a scalar) set of fundamental correlation parameters

Some commonly used structures when times are continuous and are not equally spaced (Pinheiro & Bates, 2000, sec. 5.3.3) (nlme correlation function names are at the right if the structure is implemented in nlme):


Table 7.1: Some longitudinal data correlation structures

Structure                                                                                     nlme Function
Compound symmetry: \(h = \rho\) if \(t_{1} \neq t_{2}\), 1 if \(t_{1}=t_{2}\)^{17}            corCompSymm
Autoregressive-moving average lag 1: \(h = \rho^{|t_{1} - t_{2}|} = \rho^{s}\) where \(s = |t_{1}-t_{2}|\)    corCAR1
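For a given subject, such a structure turns the subject's own measurement times into a full \(n_{i}\times n_{i}\) matrix \(V_{i}\). A sketch for the continuous-time AR(1) structure (times and \(\rho\) illustrative):

```python
import numpy as np

# Covariance matrix V_i implied by the continuous-time AR(1) structure
# h(s, rho) = rho**s with constant variance sigma2, for one subject's
# set of measurement times
def ar1_cov(times, rho, sigma2=1.0):
    t = np.asarray(times, dtype=float)
    s = np.abs(t[:, None] - t[None, :])   # |t1 - t2| for every pair
    return sigma2 * rho ** s

V = ar1_cov([2, 4, 8, 12, 16], rho=0.8672)
print(np.round(V, 3))
```

Because \(V_{i}\) is built from the subject's actual times, each subject can have a differently shaped matrix without any extra parameters.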

7.6 Checking Model Fit

Estimate correlations of all possible pairs of residuals at different time points

Pool all estimates at same absolute difference in time \(s\)

Variogram is a plot with \(y = 1 - \hat{h}(s, \rho)\) vs. \(s\) on the \(x\)-axis

Superimpose the theoretical variogram assumed by the model
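The steps above can be sketched as follows, using simulated AR(1) residuals (all values hypothetical): for standardized residuals with unit variance, the half mean squared difference \(\frac{1}{2}E[(r(t_{1})-r(t_{2}))^{2}]\) estimates \(1 - h(s, \rho)\).

```python
import numpy as np
from collections import defaultdict

# Empirical variogram: pool half squared residual differences at each
# absolute time gap s, across subjects
def variogram(times, resid):
    half_sq = defaultdict(list)
    for t, r in zip(times, resid):          # one subject's series at a time
        for i in range(len(t)):
            for j in range(i + 1, len(t)):
                half_sq[abs(t[i] - t[j])].append(0.5 * (r[i] - r[j]) ** 2)
    return {s: float(np.mean(v)) for s, v in sorted(half_sq.items())}

# Simulated AR(1)-correlated unit-variance residuals for many subjects
rng = np.random.default_rng(0)
rho = 0.8
t = np.array([0., 1., 2., 3., 4.])
L = np.linalg.cholesky(rho ** np.abs(t[:, None] - t[None, :]))
subjects = [(t, L @ rng.normal(size=5)) for _ in range(2000)]
v = variogram(*zip(*subjects))
print(v)   # should rise toward 1 - rho**s as the gap s grows
</imports>```

Superimposing the theoretical curve \(1 - \rho^{s}\) on this empirical variogram is the model check described above.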


7.7 R Software

Nonlinear mixed effects model package of Pinheiro & Bates

For linear models, fitting functions are

lme for mixed effects models

gls for generalized least squares without random effects

For this version the rms package has Gls so that many features of rms can be used:

anova: all partial Wald tests, test of linearity, pooled tests

summary: effect estimates (differences in \(\hat{Y}\)) and confidence limits, can be plotted

plot, ggplot, plotp: continuous effect plots

nomogram: nomogram

Function: generate R function code for fitted model

latex: representation of fitted model


In addition, Gls has a bootstrap option (hence you do not use rms’s bootcov for Gls fits). To get regular gls functions named anova (for likelihood ratio tests, AIC, etc.) or summary use anova.gls or summary.gls

nlme package has many graphics and fit-checking functions

Several functions will be demonstrated in the case study

7.8 Case Study

Consider the dataset in Table 6.9 of Davis (2002, pp. 161-163) from a multi-center, randomized controlled trial of botulinum toxin type B (BotB) in patients with cervical dystonia from nine U.S. sites.

Randomized to placebo (\(N=36\)), 5000 units of BotB (\(N=36\)), 10,000 units of BotB (\(N=37\))

Response variable: total score on Toronto Western Spasmodic Torticollis Rating Scale (TWSTRS), measuring severity, pain, and disability of cervical dystonia (high scores mean more impairment)

TWSTRS measured at baseline (week 0) and weeks 2, 4, 8, 12, 16 after treatment began

Dataset cdystonia from web site


7.8.1 Graphical Exploration of Data

Code

require(rms)
require(data.table)
options(prType='html')   # for model print, summary, anova, validate
getHdata(cdystonia)
setDT(cdystonia)         # convert to data.table
cdystonia[, uid := paste(site, id)]   # unique subject ID
# Tabulate patterns of subjects' time points
g <- function(w) paste(sort(unique(w)), collapse=' ')
cdystonia[, table(tapply(week, uid, g))]

Vars Obs Unique IDs IDs in #1 IDs not in #1
baseline 7 109 109 NA NA
followup 3 522 108 108 0
Merged 9 523 109 109 0
Number of unique IDs in any data frame : 109
Number of unique IDs in all data frames: 108

Code

# Remove person with no follow-up record
both <- both[!is.na(week)]
dd <- datadist(both)
options(datadist='dd')

7.8.2 Using Generalized Least Squares

We stay with baseline adjustment and use a variety of correlation structures, with constant variance. Time is modeled as a restricted cubic spline with 3 knots, because there are only 3 unique interior values of week.

AIC is computed above so that smaller values are better. By this criterion the continuous-time AR(1) and exponential structures are essentially tied for best. For the remainder of the analysis we use corCAR1, using Gls.

Keselman et al. (1998) did a simulation study of the reliability of AIC for selecting the correct covariance structure in repeated measurement models. In choosing from among 11 structures, AIC selected the correct structure only 47% of the time. Gurka et al. (2011) demonstrated that fixed effects in a mixed effects model can be biased, independent of sample size, when the specified covariance matrix is more restricted than the true one.

\(\hat{\rho} = 0.8672\), the estimate of the correlation between two measurements taken one week apart on the same subject. The estimated correlation for measurements 10 weeks apart is \(0.8672^{10} = 0.24\).
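The implied decay of within-subject correlation with increasing time gap is a quick arithmetic check:

```python
# Under the continuous-time AR(1) structure, the estimated correlation
# rho for a 1-week gap implies rho**s for an s-week gap
rho = 0.8672
for s in (1, 5, 10):
    print(s, round(rho ** s, 2))
```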


Code

v <- Variogram(a, form=~ week | uid)
plot(v)

Check constant variance and normality assumptions:

# To get results for week 8 for a different reference group
# for treatment, use e.g. summary(a, week=4, treat='Placebo')
# Compare low dose with placebo, separately at each time
k1 <- contrast(a, list(week=c(2,4,8,12,16), treat='5000U'),
               list(week=c(2,4,8,12,16), treat='Placebo'))
options(width=80)
print(k1, digits=3)

week twstrs0 age sex Contrast S.E. Lower Upper Z Pr(>|z|)
1 2 46 56 F -6.31 2.10 -10.43 -2.186 -3.00 0.0027
2 4 46 56 F -5.91 1.82 -9.47 -2.349 -3.25 0.0011
3 8 46 56 F -4.90 2.01 -8.85 -0.953 -2.43 0.0150
4* 12 46 56 F -3.07 1.75 -6.49 0.361 -1.75 0.0795
5* 16 46 56 F -1.02 2.10 -5.14 3.092 -0.49 0.6260
Redundant contrasts are denoted by *
Confidence intervals are 0.95 individual intervals

Code

# Compare high dose with placebo
k2 <- contrast(a, list(week=c(2,4,8,12,16), treat='10000U'),
               list(week=c(2,4,8,12,16), treat='Placebo'))
print(k2, digits=3)

week twstrs0 age sex Contrast S.E. Lower Upper Z Pr(>|z|)
1 2 46 56 F -6.89 2.07 -10.96 -2.83 -3.32 0.0009
2 4 46 56 F -6.64 1.79 -10.15 -3.13 -3.70 0.0002
3 8 46 56 F -5.49 2.00 -9.42 -1.56 -2.74 0.0061
4* 12 46 56 F -1.76 1.74 -5.17 1.65 -1.01 0.3109
5* 16 46 56 F 2.62 2.09 -1.47 6.71 1.25 0.2099
Redundant contrasts are denoted by *
Confidence intervals are 0.95 individual intervals

Although multiple d.f. tests such as total treatment effects or treatment \(\times\) time interaction tests are comprehensive, their increased degrees of freedom can dilute power. In a treatment comparison, treatment contrasts at the last time point (single d.f. tests) are often of major interest. Such contrasts are informed by all the measurements made by all subjects (up until dropout times) when a smooth time trend is assumed.

Running MCMC with 4 chains, at most 11 in parallel...
Chain 1 finished in 5.4 seconds.
Chain 4 finished in 5.7 seconds.
Chain 3 finished in 5.8 seconds.
Chain 2 finished in 6.0 seconds.
All 4 chains finished successfully.
Mean chain execution time: 5.7 seconds.
Total execution time: 6.0 seconds.

Code

# file= means that after the first time the model is run, it will not
# be re-run unless the data, fitting options, or underlying Stan code change
stanDx(bpo)

For each posterior draw, compute the difference in means and obtain exact (to within simulation error) 0.95 highest posterior density intervals for these differences.

Code

M <- Mean(bpo)   # create R function that computes mean Y from X*beta
k <- contrast(bpo, list(week=wks, treat='10000U'),
              list(week=wks, treat='Placebo'),
              fun=M, cnames=paste('Week', wks))
plot(k, which='diff') + theme(legend.position='bottom')

Serial correlation induced by Markov model is similar to AR(1) which we already know fits these data

Markov model is more likely to fit the data than the random effects model, which induces a compound symmetry correlation structure

Models state transitions

PO model at each visit, with Y from previous visit conditioned upon just like any covariate

Need to uncondition (marginalize) on previous Y to get the time-response profile we usually need

Semiparametric model is especially attractive because one can easily “uncondition” a discrete Y model, and the distribution of Y for control subjects can be any shape

Let measurement times be \(t_{1}, t_{2}, \dots, t_{m}\), and the measurement for a subject at time \(t\) be denoted \(Y(t)\)

\(g\) involves any number of regression coefficients for a main effect of \(t\), the main effect of time gap \(t_{i} - t_{i-1}\) if this is not collinear with absolute time, a main effect of the previous state, and interactions between these

Examples of how the previous state may be modeled in \(g\):

linear in numeric codes for \(Y\)

spline function in same

discontinuous bi-linear relationship where there is a slope for in-hospital outcome severity, a separate slope for outpatient outcome severity, and an intercept jump at the transition from inpatient to outpatient (or vice versa)

Markov model is quite flexible in handling time trends and serial correlation patterns

# Create a new variable to hold previous value of Y for the subject
# For week 2, previous value is the baseline value
setDT(both, key=c('uid', 'week'))
both[, ptwstrs := shift(twstrs), by=uid]
both[week == 2, ptwstrs := twstrs0]
dd <- datadist(both)
bmark <- blrm(twstrs ~ treat * rcs(week, 3) + rcs(ptwstrs, 4) +
              rcs(age, 4) * sex,
              data=both, file='bmark.rds')

Initial log joint probability = -2296.4
Iter log prob ||dx|| ||grad|| alpha alpha0 # evals Notes
68 -1503.27 0.00294964 0.00888675 1 1 72
Optimization terminated normally:
Convergence detected: relative gradient magnitude is below tolerance
Finished in 0.1 seconds.
Running MCMC with 4 chains, at most 11 in parallel...
Chain 1 finished in 3.0 seconds.
Chain 2 finished in 3.1 seconds.
Chain 3 finished in 3.0 seconds.
Chain 4 finished in 3.0 seconds.
All 4 chains finished successfully.
Mean chain execution time: 3.0 seconds.
Total execution time: 3.1 seconds.

Code

# When adding partial PO terms for week and ptwstrs, z=-1.8, 5.04
stanDx(bmark)

Frequencies of Missing Values Due to Each Variable

twstrs treat week ptwstrs age sex
0 0 0 5 0 0

Obs 517   Draws 4000   Chains 4   Time 3.6s   p 18

Mixed Calibration/Discrimination Indexes: LOO log L -1786.11±22.27; LOO IC 3572.22±44.54; Effective p 89.99±4.7; B 0.117 [0.113, 0.12]

Discrimination Indexes: g 3.266 [2.961, 3.575]; g_{p} 0.416 [0.4, 0.428]; EV 0.532 [0.485, 0.569]; v 8.397 [6.918, 10.005]; vp 0.133 [0.122, 0.143]

Rank Discrim. Indexes: C 0.828 [0.825, 0.831]; D_{xy} 0.656 [0.65, 0.661]

                         Mode β   Mean β  Median β    S.E.    Lower    Upper  Pr(β>0)  Symmetry
treat=5000U              0.2210   0.2221   0.2194   0.5808  -0.8371   1.4448   0.6435   1.01
treat=Placebo            1.8315   1.8437   1.8480   0.5861   0.6460   2.9468   0.9998   1.01
week                     0.4865   0.4891   0.4882   0.0862   0.3094   0.6498   1.0000   1.01
week'                   -0.2878  -0.2899  -0.2897   0.0914  -0.4751  -0.1177   0.0008   0.99
ptwstrs                  0.1998   0.2014   0.2014   0.0268   0.1481   0.2520   1.0000   1.03
ptwstrs'                -0.0621  -0.0652  -0.0655   0.0650  -0.1969   0.0587   0.1578   0.98
ptwstrs''                0.5329   0.5449   0.5440   0.2598   0.0345   1.0414   0.9825   1.01
age                     -0.0295  -0.0286  -0.0287   0.0317  -0.0903   0.0318   0.1880   1.00
age'                     0.1237   0.1208   0.1209   0.0873  -0.0556   0.2800   0.9128   1.02
age''                   -0.5070  -0.4968  -0.4948   0.3416  -1.1775   0.1460   0.0717   1.00
sex=M                   -0.4644  -0.4281  -0.4606   2.4046  -5.0914   4.1596   0.4260   1.00
treat=5000U × week      -0.0341  -0.0342  -0.0340   0.1126  -0.2570   0.1762   0.3845   0.97
treat=Placebo × week    -0.2719  -0.2744  -0.2754   0.1170  -0.4856  -0.0293   0.0105   0.94
treat=5000U × week'     -0.0341  -0.0341  -0.0342   0.1201  -0.2612   0.2017   0.3872   1.04
treat=Placebo × week'    0.1197   0.1218   0.1200   0.1268  -0.1224   0.3655   0.8310   1.03
age × sex=M              0.0112   0.0103   0.0112   0.0577  -0.1025   0.1225   0.5720   0.99
age' × sex=M            -0.0511  -0.0476  -0.0490   0.1608  -0.3618   0.2725   0.3735   1.01
age'' × sex=M            0.2618   0.2488   0.2520   0.6210  -0.9583   1.4411   0.6640   1.00

Code

a <- anova(bpo)
a

Relative Explained Variation for twstrs. Approximate total model Wald χ^{2} used in denominators of REV: 247.7 [204.9, 339.1].

                                              REV    Lower   Upper   d.f.
treat (Factor+Higher Order Factors)          0.137   0.077   0.234     6
 All Interactions                            0.096   0.043   0.184     4
week (Factor+Higher Order Factors)           0.588   0.454   0.688     6
 All Interactions                            0.096   0.043   0.184     4
 Nonlinear (Factor+Higher Order Factors)     0.022   0.001   0.073     3
twstrs0                                      0.666   0.527   0.747     2
 Nonlinear                                   0.016   0.000   0.050     1
age (Factor+Higher Order Factors)            0.027   0.008   0.092     6
 All Interactions                            0.016   0.000   0.057     3
 Nonlinear (Factor+Higher Order Factors)     0.023   0.003   0.077     4
sex (Factor+Higher Order Factors)            0.019   0.000   0.070     4
 All Interactions                            0.016   0.000   0.057     3
treat × week (Factor+Higher Order Factors)   0.096   0.043   0.184     4
 Nonlinear                                   0.009   0.000   0.042     2
 Nonlinear Interaction : f(A,B) vs. AB       0.009   0.000   0.042     2
age × sex (Factor+Higher Order Factors)      0.016   0.000   0.057     3
 Nonlinear                                   0.014   0.000   0.050     2
 Nonlinear Interaction : f(A,B) vs. AB       0.014   0.000   0.050     2
TOTAL NONLINEAR                              0.058   0.029   0.154     8
TOTAL INTERACTION                            0.110   0.054   0.203     7
TOTAL NONLINEAR + INTERACTION                0.143   0.092   0.264    11
TOTAL                                        1.000   1.000   1.000    17

Code

plot(a)

Let’s add subject-level random effects to the model. Smallness of the standard deviation of the random effects provides support for the assumption of conditional independence that we like to make for Markov models and allows us to simplify the model by omitting random effects.

Running MCMC with 4 chains, at most 11 in parallel...
Chain 2 finished in 3.7 seconds.
Chain 3 finished in 3.8 seconds.
Chain 1 finished in 4.2 seconds.
Chain 4 finished in 4.4 seconds.
All 4 chains finished successfully.
Mean chain execution time: 4.1 seconds.
Total execution time: 4.6 seconds.

Frequencies of Missing Values Due to Each Variable

twstrs treat week ptwstrs age sex
0 0 0 5 0 0
cluster(uid)
0

Obs 517   Draws 4000   Chains 4   Time 5.3s   p 18   Cluster on uid   Clusters 108   σ_{γ} 0.1155 [1e-04, 0.3447]

Mixed Calibration/Discrimination Indexes: LOO log L -1787.2±22.47; LOO IC 3574.41±44.95; Effective p 94.26±5; B 0.117 [0.113, 0.121]

Discrimination Indexes: g 3.247 [2.934, 3.545]; g_{p} 0.415 [0.401, 0.428]; EV 0.53 [0.489, 0.566]; v 8.309 [6.734, 9.831]; vp 0.132 [0.122, 0.141]

Rank Discrim. Indexes: C 0.828 [0.825, 0.83]; D_{xy} 0.655 [0.65, 0.661]

                        Mean β  Median β    S.E.    Lower    Upper  Pr(β>0)  Symmetry
treat=5000U             0.2185   0.2254   0.5671  -0.8612   1.3535   0.6558   1.02
treat=Placebo           1.8307   1.8308   0.5757   0.7178   2.9904   0.9990   1.01
week                    0.4856   0.4856   0.0831   0.3293   0.6514   1.0000   0.99
week'                  -0.2844  -0.2853   0.0872  -0.4510  -0.1142   0.0005   1.02
ptwstrs                 0.2004   0.2003   0.0273   0.1473   0.2546   1.0000   0.98
ptwstrs'               -0.0651  -0.0663   0.0638  -0.1920   0.0549   0.1560   1.02
ptwstrs''               0.5448   0.5502   0.2531   0.0180   1.0140   0.9855   0.98
age                    -0.0291  -0.0293   0.0319  -0.0872   0.0354   0.1830   1.00
age'                    0.1239   0.1244   0.0888  -0.0442   0.3022   0.9228   1.01
age''                  -0.5112  -0.5101   0.3514  -1.2099   0.1686   0.0698   0.99
sex=M                  -0.4264  -0.4685   2.4366  -5.2520   4.2666   0.4232   1.00
treat=5000U × week     -0.0326  -0.0318   0.1079  -0.2334   0.1817   0.3855   1.00
treat=Placebo × week   -0.2705  -0.2712   0.1118  -0.4773  -0.0403   0.0073   0.98
treat=5000U × week'    -0.0367  -0.0379   0.1157  -0.2693   0.1862   0.3748   1.01
treat=Placebo × week'   0.1170   0.1160   0.1206  -0.1094   0.3625   0.8352   1.00
age × sex=M             0.0102   0.0119   0.0587  -0.1055   0.1237   0.5715   1.01
age' × sex=M           -0.0490  -0.0506   0.1663  -0.3579   0.2901   0.3777   1.01
age'' × sex=M           0.2585   0.2616   0.6442  -0.9888   1.5106   0.6600   0.98

The random effects SD is only 0.11 on the logit scale. Also, the standard deviations of all the regression parameter posterior distributions are virtually unchanged with the addition of random effects:

Code

plot(sqrt(diag(vcov(bmark))), sqrt(diag(vcov(bmarkre))),
     xlab='Posterior SDs in Conditional Independence Markov Model',
     ylab='Posterior SDs in Random Effects Markov Model')
abline(a=0, b=1, col=gray(0.85))

So we will use the model omitting random effects.

Show the partial effects of all the predictors, including the effect of the previous measurement of TWSTRS. Also compute high dose:placebo treatment contrasts on these conditional estimates.

Code

ggplot(Predict(bmark))

Code

ggplot(Predict(bmark, week, treat))

Code

k <- contrast(bmark, list(week=wks, treat='10000U'),
              list(week=wks, treat='Placebo'),
              cnames=paste('Week', wks))
k

week Contrast S.E. Lower Upper Pr(Contrast>0)
1 Week 2 2 -1.2949780 0.3894362 -2.0466496 -0.5127775 0.0000
2 Week 4 4 -0.7462096 0.2633752 -1.2528826 -0.2128366 0.0022
3 Week 8 8 0.2294987 0.3714486 -0.4870849 0.9864448 0.7312
4* Week 12 12 0.7178925 0.2645893 0.1918738 1.2360777 0.9975
5* Week 16 16 1.0844575 0.3890726 0.3315238 1.8575108 0.9972
Redundant contrasts are denoted by *
Intervals are 0.95 highest posterior density intervals
Contrast is the posterior mean

Using posterior means for parameter values, compute the probability that at a given week twstrs will be \(\geq 40\) when at the previous visit it was 40. Also show the conditional mean twstrs when it was 40 at the previous visit.

Turn to marginalized (unconditional on previous twstrs) quantities

Capitalize on PO model being a multinomial model, just with PO restrictions

Manipulations of conditional probabilities to get the unconditional probability that twstrs=y doesn’t need to know about PO

Compute all cell probabilities and use the law of total probability recursively \[\Pr(Y_{t} = y | X) = \sum_{j=1}^{k} \Pr(Y_{t} = y | X, Y_{t-1} = j) \Pr(Y_{t-1} = j | X)\]

predict.blrm method with type='fitted.ind' computes the needed conditional cell probabilities, optionally for all posterior draws at once

Easy to get highest posterior density intervals for derived parameters such as unconditional probabilities or unconditional means

Hmisc package soprobMarkovOrdm function (in version 4.6) computes an array of all the state occupancy probabilities for all the posterior draws
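The recursion in the displayed equation can be sketched with a small hypothetical three-state transition model; soprobMarkovOrdm performs the analogous computation for each posterior draw.

```python
import numpy as np

# Law-of-total-probability recursion: occ[t] holds the unconditional
# state occupancy probabilities Pr(Y_t = y | X), given per-time transition
# matrices P[j, y] = Pr(Y_t = y | Y_{t-1} = j, X). The 3-state transition
# matrix below is hypothetical.
def occupancy(p0, P_list):
    occ = [np.asarray(p0, dtype=float)]
    for P in P_list:                 # marginalize over the previous state
        occ.append(occ[-1] @ P)
    return occ

p0 = [1.0, 0.0, 0.0]                 # everyone starts in state 1
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.5, 0.2],
              [0.1, 0.3, 0.6]])
for t, p in enumerate(occupancy(p0, [P, P, P])):
    print(t, np.round(p, 3))
```

Note that nothing in this marginalization step depends on the proportional odds assumption; PO only constrains how the conditional cell probabilities are produced.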


Code

# Baseline twstrs set to 42 in d
# For each dose, get all the posterior draws for all state occupancy
# probabilities for all visits
ylev <- sort(unique(both$twstrs))
tlev <- c('Placebo', '10000U')
R <- list()
for(trt in tlev) {    # separately by treatment
  d$treat <- trt
  u <- soprobMarkovOrdm(bmark, d, wks, ylev,
                        tvarname='week', pvarname='ptwstrs')
  R[[trt]] <- u
}
dim(R[[1]])    # posterior draws x times x distinct twstrs values

[1] 4000 5 62

Code

# For each posterior draw, treatment, and week compute the mean TWSTRS
# Then compute posterior mean of means, and HPD interval
Rmean <- Rmeans <- list()
for(trt in tlev) {
  r <- R[[trt]]
  # Mean Y at each week and posterior draw (mean from a discrete distribution)
  m <- apply(r, 1:2, function(x) sum(ylev * x))
  Rmeans[[trt]] <- m
  # Posterior mean and median and HPD interval over draws
  u <- apply(m, 2, f)    # f defined above
  u <- rbind(week=as.numeric(colnames(u)), u)
  Rmean[[trt]] <- u
}
r <- lapply(Rmean, function(x) as.data.frame(t(x)))
for(trt in tlev) r[[trt]]$treat <- trt
r <- do.call(rbind, r)
ggplot(r, aes(x=week, y=Mean, color=treat)) + geom_line() +
  geom_ribbon(aes(ymin=Lower, ymax=Upper), alpha=0.2, linetype=0)

Use the same posterior draws of unconditional probabilities of all values of TWSTRS to get the posterior distribution of differences in mean TWSTRS between high dose and placebo

Get the posterior mean of all cell probability estimates at week 12

Distribution of TWSTRS conditional on high dose, median age, and modal sex

Not conditional on the week 8 value


```r
p     <- R$`10000U`[, '12', ]   # 4000 x 62
pmean <- apply(p, 2, mean)
yvals <- as.numeric(names(pmean))
plot(yvals, pmean, type='l', xlab='TWSTRS', ylab='')
```
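Averaging cell probabilities over posterior draws is just a column mean, and the result is itself a probability distribution. A minimal Python sketch with fabricated numbers (the 2-draw, 3-level array is purely illustrative):

```python
import numpy as np

# Hypothetical: 2 posterior draws x 3 TWSTRS levels of cell probabilities
p = np.array([[0.2, 0.5, 0.3],
              [0.4, 0.4, 0.2]])
pmean = p.mean(axis=0)   # posterior mean of each cell probability
# Because each draw's probabilities sum to 1, so does their average,
# so pmean can be plotted directly as an estimated distribution
```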

7.9 Study Questions

Section 7.2

When should one model the time-response profile using discrete time?

Section 7.3

What makes generalized least squares and mixed effect models relatively robust to non-completely-random dropouts?

What does the last observation carried forward method always violate?

Section 7.4

Which correlation structure do you expect to fit the data when there are rapid repetitions over a short time span? When the follow-up time span is very long?

Section 7.8

What can go wrong if many correlation structures are tested in one dataset?

In a longitudinal intervention study, what is the most typical comparison of interest? Is it best to borrow information in estimating this contrast?

Davis, C. S. (2002). Statistical Methods for the Analysis of Repeated Measurements. Springer.

Diggle, P. J., Heagerty, P., Liang, K.-Y., & Zeger, S. L. (2002). Analysis of Longitudinal Data (2nd ed.). Oxford University Press.

Donohue, M. C., Langford, O., Insel, P. S., van Dyck, C. H., Petersen, R. C., Craft, S., Sethuraman, G., Raman, R., Aisen, P. S., & for the Alzheimer’s Disease Neuroimaging Initiative. (n.d.). Natural cubic splines for the analysis of Alzheimer’s clinical trials. Pharmaceutical Statistics. https://doi.org/10.1002/pst.2285

Gardiner, J. C., Luo, Z., & Roman, L. A. (2009). Fixed effects, random effects and GEE: What are the differences? Stat Med, 28, 221–239.

nice comparison of models; econometrics; different use of the term "fixed effects model"

Gurka, M. J., Edwards, L. J., & Muller, K. E. (2011). Avoiding bias in mixed model inference for fixed effects. Stat Med, 30(22), 2696–2707. https://doi.org/10.1002/sim.4293

Hand, D., & Crowder, M. (1996). Practical Longitudinal Data Analysis. Chapman & Hall.

Kenward, M. G., White, I. R., & Carpenter, J. R. (2010). Should baseline be a covariate or dependent variable in analyses of change from baseline in clinical trials? (Letter to the editor). Stat Med, 29, 1455–1456.

sharp rebuke of Liu et al. (2009)

Keselman, H. J., Algina, J., Kowalchuk, R. K., & Wolfinger, R. D. (1998). A comparison of two approaches for selecting covariance structures in the analysis of repeated measurements. Comm Stat - Sim Comp, 27, 591–604.

use of AIC and BIC for selecting the covariance structure in repeated measurements; serial data; longitudinal data; when choosing from 11 covariance patterns, AIC selected the correct structure 0.47 of the time and BIC 0.35 of the time

Liang, K.-Y., & Zeger, S. L. (2000). Longitudinal data analysis of continuous and discrete responses for pre-post designs. Sankhyā, 62, 134–148.

makes an error in assuming the baseline variable will have the same univariate distribution as the response except for a shift;baseline may have for example a truncated distribution based on a trial’s inclusion criteria;if correlation between baseline and response is zero, ANCOVA will be twice as efficient as simple analysis of change scores;if correlation is one they may be equally efficient

Lindsey, J. K. (1997). Models for Repeated Measurements. Clarendon Press.

Liu, G. F., Lu, K., Mogg, R., Mallick, M., & Mehrotra, D. V. (2009). Should baseline be a covariate or dependent variable in analyses of change from baseline in clinical trials? Stat Med, 28, 2509–2530.

seems to miss several important points, such as the fact that the baseline variable is often part of the inclusion/exclusion criteria and so has a truncated distribution that is different from that of the follow-up measurements; sharp rebuke in Kenward et al. (2010)

Peters, S. A., Bots, M. L., den Ruijter, H. M., Palmer, M. K., Grobbee, D. E., Crouse, J. R., O’Leary, D. H., Evans, G. W., Raichlen, J. S., Moons, K. G., Koffijberg, H., & METEOR study group. (2012). Multiple imputation of missing repeated outcome measurements did not add to linear mixed-effects models. J Clin Epi, 65(6), 686–695. https://doi.org/10.1016/j.jclinepi.2011.11.012

Pinheiro, J. C., & Bates, D. M. (2000). Mixed-Effects Models in S and S-PLUS. Springer.

Potthoff, R. F., & Roy, S. N. (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika, 51, 313–326.

included an AR1 example

Senn, S. (2006). Change from baseline and analysis of covariance revisited. Stat Med, 25, 4334–4344.

shows that, contrary to some claims, ANCOVA in a 2-arm study does not require the population means at baseline to be identical; refutes some claims of Liang & Zeger (2000); problems with counterfactuals; temporal additivity ("amounts to supposing that despite the fact that groups are different at baseline they would show the same evolution over time"); causal additivity; it is difficult to design trials for which simple analysis of change scores is unbiased, ANCOVA is biased, and a causal interpretation can be given; temporally and logically, a "baseline cannot be a _response_ to treatment", so baseline and response cannot be modeled in an integrated framework as Laird and Ware’s model has been used; "one should focus clearly on 'outcomes' as being the only values that can be influenced by treatment and examine critically any schemes that assume that these are linked in some rigid and deterministic view to 'baseline' values. An alternative tradition sees a baseline as being merely one of a number of measurements capable of improving predictions of outcomes and models it in this way."; "You cannot establish necessary conditions for an estimator to be valid by nominating a model and seeing what the model implies unless the model is universally agreed to be impeccable. On the contrary it is appropriate to start with the estimator and see what assumptions are implied by valid conclusions."; this is in distinction to Liang & Zeger (2000)

Simpson, S. L., Edwards, L. J., Muller, K. E., Sen, P. K., & Styner, M. A. (2010). A linear exponent AR(1) family of correlation structures. Stat Med, 29, 1825–1838.

Venables, W. N., & Ripley, B. D. (2003). Modern Applied Statistics with S (4th ed.). Springer-Verlag.

Verbeke, G., & Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer.

```{r include=FALSE}require(Hmisc)options(qproject='rms', prType='html')require(qreport)getRs('qbookfun.r')hookaddcap()knitr::set_alias(w = 'fig.width', h = 'fig.height', cap = 'fig.cap', scap ='fig.scap')knitr::read_chunk('long.R')```# Modeling Longitudinal Responses using Generalized Least Squares {#sec-long}Some good general references on longitudinal data analysis are @davis-repmeas, @pinheiro-bates, @diggle-longit, @VR, @hand-crowder, @ver00lin, @lin97mod## Notation`r mrg(sound("gls-1"))`* $N$ subjects `r ipacue()`* Subject $i$ ($i=1,2,\ldots,N$) has $n_{i}$ responses measured at times $t_{i1}, t_{i2}, \ldots, t_{in_{i}}$* Response at time $t$ for subject $i$: $Y_{it}$* Subject $i$ has baseline covariates $X_{i}$* Generally the response measured at time $t_{i1}=0$ is a covariate in $X_{i}$ instead of being the first measured response $Y_{i0}$* Time trend in response is modeled with $k$ parameters so that the time "main effect" has $k$ d.f.* Let the basis functions modeling the time effect be $g_{1}(t), g_{2}(t), \ldots, g_{k}(t)$## Model Specification for Effects on $E(Y)$`r mrg(sound("gls-2"))`### Common Basis Functions* $k$ dummy variables for $k+1$ unique times (assumes no `r ipacue()` functional form for time but may spend many d.f.)* $k=1$ for linear time trend, $g_{1}(t)=t$* $k$--order polynomial in $t$* $k+1$--knot restricted cubic spline (one linear term, $k-1$ nonlinear terms)### Model for Mean Profile* A model for mean time-response profile without interactions between time `r ipacue()` and any $X$: <br> $E[Y_{it} | X_{i}] = X_{i}\beta + \gamma_{1}g_{1}(t) + \gamma_{2}g_{2}(t) + \ldots + \gamma_{k}g_{k}(t)$* Model with interactions between time and some $X$'s: add product terms for desired interaction effects* Example: To allow the mean time trend for subjects in group 1 (reference group) to be arbitrarily different from time trend for subjects in group 2, have a dummy variable for group 2, a time "main effect" curve with $k$ d.f. 
and all $k$ products of these time components with the dummy variable for group 2* Time should be modeled using indicator variables only when time is really discrete, e.g., when time is in weeks and subjects were followed at exactly the intended weeks. In general time should be modeled continuously (and nonlinearly if there are more than 2 followup times) using actual visit dates instead of intended dates [@donnat].### Model Specification for Treatment Comparisons`r mrg(sound("gls-3"))`* In studies comparing two or more treatments, a response is often `r ipacue()` measured at baseline (pre-randomization)* Analyst has the option to use this measurement as $Y_{i0}$ or as part of $X_{i}$::: {.callout-note collapse="true"}# Comments from Jim RochonFor RCTs, I draw a sharp line at the point when the intervention begins. The LHS [left hand side of the model equation] is reserved for something that is a response to treatment. Anything before this point can potentially be included as a covariate in the regression model. This includes the "baseline" value of the outcome variable. Indeed, the best predictor of the outcome at the end of the study is typically where the patient began at the beginning. It drinks up a lot of variability in the outcome; and, the effect of other covariates is typically mediated through this variable. I treat anything after the intervention begins as an outcome. In the western scientific method, an "effect" must follow the "cause" even if by a split second. Note that an RCT is different than a cohort study. In a cohort study, "Time 0" is not terribly meaningful. If we want to model, say, the trend over time, it would be legitimate, in my view, to include the "baseline" value on the LHS of that regression model. Now, even if the intervention, e.g., surgery, has an immediate effect, I would include still reserve the LHS for anything that might legitimately be considered as the response to the intervention. 
So, if we cleared a blocked artery and then measured the MABP, then that would still be included on the LHS. Now, it could well be that most of the therapeutic effect occurred by the time that the first repeated measure was taken, and then levels off. Then, a plot of the means would essentially be two parallel lines and the treatment effect is the distance between the lines, i.e., the difference in the intercepts. If the linear trend from baseline to Time 1 continues beyond Time 1, then the lines will have a common intercept but the slopes will diverge. Then, the treatment effect will the difference in slopes. One point to remember is that the estimated intercept is the value at time 0 that we predict from the set of repeated measures post randomization. In the first case above, the model will predict different intercepts even though randomization would suggest that they would start from the same place. This is because we were asleep at the switch and didn't record the "action" from baseline to time 1. In the second case, the model will predict the same intercept values because the linear trend from baseline to time 1 was continued thereafter. More importantly, there are considerable benefits to including it as a covariate on the RHS. The baseline value tends to be the best predictor of the outcome post-randomization, and this maneuver increases the precision of the estimated treatment effect. Additionally, any other prognostic factors correlated with the outcome variable will also be correlated with the baseline value of that outcome, and this has two important consequences. First, this greatly reduces the need to enter a large number of prognostic factors as covariates in the linear models. Their effect is already mediated through the baseline value of the outcome variable. Secondly, any imbalances across the treatment arms in important prognostic factors will induce an imbalance across the treatment arms in the baseline value of the outcome. 
Including the baseline value thereby reduces the need to enter these variables as covariates in the linear models.:::@sen06cha states that temporally and logically, a"baseline cannot be a _response_ to treatment", so baseline andresponse cannot be modeled in an integrated framework.`r quoteit('... one should focus clearly on \'outcomes\' as being the only values that can be influenced by treatment and examine critically any schemes that assume that these are linked in some rigid and deterministic view to \'baseline\' values. An alternative tradition sees a baseline as being merely one of a number of measurements capable of improving predictions of outcomes and models it in this way.')`The final reason that baseline cannot be modeled as the response at `r ipacue()`time zero is that many studies have inclusion/exclusion criteria thatinclude cutoffs on the baseline variable. In other words, thebaseline measurement comes from a truncated distribution. In generalit is not appropriate to model the baseline with the samedistributional shape as the follow-up measurements. Thus the approachesrecommended by @lia00lon and@liu09sho are problematic^[In addition to this, one of the paper's conclusions that analysis of covariance is not appropriate if the population means of the baseline variable are not identical in the treatment groups is not correct [@sen06cha]. See @ken10sho for a rebuke of @liu09sho.].## Modeling Within-Subject Dependence`r mrg(sound("gls-4"))`* Random effects and mixed effects models have become very popular `r ipacue()`* Disadvantages: + Induced correlation structure for $Y$ may be unrealistic + Numerically demanding + Require complex approximations for distributions of test statistics* Conditional random effects vs. 
(subject-) marginal models: + Random effects are subject-conditional + Random effects models are needed to estimate responses for individual subjects + Models without random effects are marginalized with respect to subject-specific effects + They are natural when the interest is on group-level (i.e., covariate-specific but not patient-specific) parameters (e.g., overall treatment effect) + Random effects are natural when there is clustering at more than the subject level (multi-level models)* Extended linear model (marginal; with no random effects) is a logical extension of the univariate model (e.g., few statisticians use subject random effects for univariate $Y$)* This was known as growth curve models and generalized least squares [@pot64gen; @gol89res] and was developed long before mixed effect models became popular* Pinheiro and Bates (Section~5.1.2) state that "in some applications,one may wish to avoid incorporating random effects in the model toaccount for dependence among observations, choosing to use thewithin-group component $\Lambda_{i}$ to directly modelvariance-covariance structure of the response."* We will assume that $Y_{it} | X_{i}$ has a multivariate normal `r ipacue()` distribution with mean given above and with variance-covariance matrix $V_{i}$, an $n_{i}\times n_{i}$ matrix that is a function of $t_{i1}, \ldots, t_{in_{i}}$* We further assume that the diagonals of $V_{i}$ are all equal* Procedure can be generalized to allow for heteroscedasticity over time or with respect to $X$ (e.g., males may be allowed to have a different variance than females)* This _extended linear model_ has the following assumptions: `r ipacue()` + all the assumptions of OLS at a single time point including correct modeling of predictor effects and univariate normality of responses conditional on $X$ + the distribution of two responses at two different times for the same subject, conditional on $X$, is bivariate normal with a specified correlation coefficient + the 
joint distribution of all $n_{i}$ responses for the $i^{th}$ subject is multivariate normal with the given correlation pattern (which implies the previous two distributional assumptions) + responses from any times for any two different subjects are uncorrelated| | Repeated Measures ANOVA | GEE | Mixed Effects Models | GLS | Markov | LOCF | Summary Statistic^[E.g., compute within-subject slope, mean, or area under the curve over time. Assumes that the summary measure is an adequate summary of the time profile and assesses the relevant treatment effect.] ||-----------------|--|--|--|--|--|--|--|| Assumes normality | × | | × | × | | | || Assumes independence of measurements within subject | ×^[Unless one uses the Huynh-Feldt or Greenhouse-Geisser correction] | ×^[For full efficiency, if using the working independence model] | | | | | || Assumes a correlation structure^[Or requires the user to specify one] | × | ×^[For full efficiency of regression coefficient estimates]| × | × | × | | || Requires same measurement times for all subjects | × | | | | | ? | || Does not allow smooth modeling of time to save d.f. | × | | | | | | || Does not allow adjustment for baseline covariates | × | | | | | | || Does not easily extend to non-continuous $Y$ | × | | | × | | | || Loses information by not using intermediate measurements | | | | | | ×^[Unless the last observation is missing] | × || Does not allow widely varying # observations per subject | × | ×^[The cluster sandwich variance estimator used to estimate SEs in GEE does not perform well in this situation, and neither does the working independence model because it does not weight subjects properly.] | | | | × | ×^[Unless one knows how to properly do a weighted analysis] || Does not allow for subjects to have distinct trajectories^[Or users population averages] | × | × | | × | × | × | || Assumes subject-specific effects are Gaussian | | | × | | | | || Badly biased if non-random dropouts | ? 
| × | | | | × | || Biased in general | | | | | | × | || Harder to get tests & CLs | | | ×^[Unlike GLS, does not use standard maximum likelihood methods yielding simple likelihood ratio $\chi^2$ statistics. Requires high-dimensional integration to marginalize random effects, using complex approximations, and if using SAS, unintuitive d.f. for the various tests.]| | |×^[Because there is no correct formula for SE of effects; ordinary SEs are not penalized for imputation and are too small] | || Requires large # subjects/clusters | | × | | | | | || SEs are wrong | ×^[If correction not applied] | | | | | × | || Assumptions are not verifiable in small samples | × | N/A | × | × | | × | || Does not extend to complex settings such as time-dependent covariates and dynamic ^[E.g., a model with a predictor that is a lagged value of the response variable] models| × | | × | × | | × | ? |: What Methods To Use for Repeated Measurements / Serial Data? ^[Thanks to Charles Berry, Brian Cade, Peter Flom, Bert Gunter, and Leena Choi for valuable input.] ^[GEE: generalized estimating equations; GLS: generalized least squares; LOCF: last observation carried forward.]* Markov models use ordinary univariate software and are very flexible `r ipacue()`* They apply the same way to binary, ordinal, nominal, and continuous Y* They require post-fitting calculations to get probabilities, means, and quantiles that are not conditional on the previous Y value@gar09fix compared several longitudinal data `r ipacue()`models, especially with regard to assumptions and how regressioncoefficients are estimated. 
@pet12mul have anempirical study confirming that the "use all available data"approach of likelihood--based longitudinal models makes imputation offollow-up measurements unnecessary.## Parameter Estimation Procedure`r mrg(sound("gls-5"))`* Generalized least squares `r ipacue()`* Like weighted least squares but uses a covariance matrix that is not diagonal* Each subject can have her own shape of $V_{i}$ due to each subject being measured at a different set of times* Maximum likelihood* Newton-Raphson or other trial-and-error methods used for estimating parameters* For small number of subjects, advantages in using REML (restricted maximum likelihood) instead of ordinary MLE [@diggle-longit, Section~5.3],[@pinheiro-bates, Chapter~5], @gol89res (esp. to get more unbiased estimate of the covariance matrix)* When imbalances are not severe, OLS fitted ignoring subject `r ipacue()` identifiers may be efficient + But OLS standard errors will be too small as they don't take intra-cluster correlation into account + May be rectified by substituting covariance matrix estimated from Huber-White cluster sandwich estimator or from cluster bootstrap* When imbalances are severe and intra-subject correlations are `r ipacue()` strong, OLS is not expected to be efficient because it gives equal weight to each observation + a subject contributing two distant observations receives $\frac{1}{5}$ the weight of a subject having 10 tightly-spaced observations## Common Correlation Structures`r mrg(sound("gls-6"))`* Usually restrict ourselves to _isotropic_ correlation structures `r ipacue()`--- correlation between responses within subject at two times dependsonly on a measure of distance between the two times, not theindividual times* We simplify further and assume depends on $|t_{1} - t_{2}|$* Can speak interchangeably of correlations of residuals within subjects or correlations between responses measured at different times on the same subject, conditional on covariates $X$* Assume that the 
correlation coefficient for $Y_{it_{1}}$ vs. $Y_{it_{2}}$ conditional on baseline covariates $X_{i}$ for subject $i$ is $h(|t_{1} - t_{2}|, \rho)$, where $\rho$ is a vector (usually a scalar) set of fundamental correlation parameters* Some commonly used structures when times are continuous and are `r ipacue()` not equally spaced [@pinheiro-bates, Section 5.3.3] (`nlme` correlation function names are at the right if the structure is implemented in `nlme`):| Structure | `nlme` Function ||------------------------------------|-----|| **Compound symmetry**: $h = \rho$ if $t_{1} \neq t_{2}$, 1 if $t_{1}=t_{2}$ ^[Essentially what two-way ANOVA assumes] | `corCompSymm` || **Autoregressive-moving average lag 1**: $h = \rho^{|t_{1} - t_{2}|} = \rho^s$ where $s = |t_{1}-t_{2}|$ | `corCAR1` || **Exponential**: $h = \exp(-s/\rho)$ | `corExp` || **Gaussian**: $h = \exp[-(s/\rho)^2]$ | `corGaus` || **Linear**: $h = (1 - s/\rho)[s < \rho]$ | `corLin` || **Rational quadratic**: $h = 1 - (s/\rho)^{2}/[1+(s/\rho)^{2}]$ | `corRatio` || **Spherical**: $h = [1-1.5(s/\rho)+0.5(s/\rho)^{3}][s < \rho]$ | `corSpher` || **Linear exponent AR(1)**: $h = \rho^{d_{min} + \delta\frac{s - d_{min}}{d_{max} - d_{min}}}$, 1 if $t_{1}=t_{2}$ | @sim10lin |: Some longitudinal data correlation structures {#tbl-long-structures}The structures 3-7 use $\rho$ as a scaling parameter, not assomething restricted to be in $[0,1]$## Checking Model Fit`r mrg(sound("gls-7"))`* Constant variance assumption: usual residual plots `r ipacue()`* Normality assumption: usual qq residual plots* Correlation pattern: **Variogram** + Estimate correlations of all possible pairs of residuals at different time points + Pool all estimates at same absolute difference in time $s$ + Variogram is a plot with $y = 1 - \hat{h}(s, \rho)$ vs. 
$s$ on the $x$-axis + Superimpose the theoretical variogram assumed by the model## `R` Software`r mrg(sound("gls-8"))`* Nonlinear mixed effects model package of Pinheiro \& Bates `r ipacue()`* For linear models, fitting functions are + `lme` for mixed effects models + `gls` for generalized least squares without random effects* For this version the rms package has `Gls` so that many features of rms can be used: + **`anova`**: all partial Wald tests, test of linearity, pooled tests + **`summary`**: effect estimates (differences in $\hat{Y}$) and confidence limits, can be plotted + **`plot, ggplot, plotp`**: continuous effect plots + **`nomogram`**: nomogram + **`Function`**: generate `R` function code for fitted model + **`latex`**: \LaTeX\ representation of fitted modelIn addition, `Gls` has a bootstrap option (hence you do not use rms's `bootcov` for `Gls` fits).<br>To get regular `gls` functions named `anova` (for likelihood ratio tests, AIC, etc.) or `summary` use`anova.gls` or `summary.gls`* `nlme` package has many graphics and fit-checking functions* Several functions will be demonstrated in the case study## Case Study`r mrg(sound("gls-9"))`Consider the dataset in Table~6.9 ofDavis[davis-repmeas, pp. 161-163] from a multi-center, randomizedcontrolled trial of botulinum toxin type B (BotB) in patients with cervicaldystonia from nine U.S. 
sites.* Randomized to placebo ($N=36$), 5000 units of BotB ($N=36$), `r ipacue()` 10,000 units of BotB ($N=37$)* Response variable: total score on Toronto Western Spasmodic Torticollis Rating Scale (TWSTRS), measuring severity, pain, and disability of cervical dystonia (high scores mean more impairment)* TWSTRS measured at baseline (week 0) and weeks 2, 4, 8, 12, 16 after treatment began* Dataset `cdystonia` from web site### Graphical Exploration of Data```{r spaghetti,h=5,w=7,cap='Time profiles for individual subjects, stratified by study site and dose'}#| label: fig-long-spaghettirequire(rms)require(data.table)options(prType='html') # for model print, summary, anova, validategetHdata(cdystonia)setDT(cdystonia) # convert to data.tablecdystonia[, uid := paste(site, id)] # unique subject ID# Tabulate patterns of subjects' time pointsg <- function(w) paste(sort(unique(w)), collapse=' ')cdystonia[, table(tapply(week, uid, g))]# Plot raw data, superposing subjectsxl <- xlab('Week'); yl <- ylab('TWSTRS-total score')ggplot(cdystonia, aes(x=week, y=twstrs, color=factor(id))) + geom_line() + xl + yl + facet_grid(treat ~ site) + guides(color=FALSE)``````{r quartiles,cap='Quartiles of `TWSTRS` stratified by dose',w=5,h=4}#| label: fig-long-quartiles# Show quartilesg <- function(x) { k <- as.list(quantile(x, (1 : 3) / 4, na.rm=TRUE)) names(k) <- .q(Q1, Q2, Q3) k}cdys <- cdystonia[, g(twstrs), by=.(treat, week)]ggplot(cdys, aes(x=week, y=Q2)) + xl + yl + ylim(0, 70) + geom_line() + facet_wrap(~ treat, nrow=2) + geom_ribbon(aes(ymin=Q1, ymax=Q3), alpha=0.2)``````{r bootcls,cap='Mean responses and nonparametric bootstrap 0.95 confidence limits for population means, stratified by dose',w=5,h=4}#| label: fig-long-bootcls# Show means with bootstrap nonparametric CLscdys <- cdystonia[, as.list(smean.cl.boot(twstrs)), by = list(treat, week)]ggplot(cdys, aes(x=week, y=Mean)) + xl + yl + ylim(0, 70) + geom_line() + facet_wrap(~ treat, nrow=2) + geom_ribbon(aes(x=week, ymin=Lower, 
ymax=Upper), alpha=0.2)```#### Model with $Y_{i0}$ as Baseline Covariate```{r}baseline <- cdystonia[week ==0]baseline[, week :=NULL]setnames(baseline, 'twstrs', 'twstrs0')followup <- cdystonia[week >0, .(uid, week, twstrs)]setkey(baseline, uid)setkey(followup, uid, week)both <-Merge(baseline, followup, id =~ uid)# Remove person with no follow-up recordboth <- both[!is.na(week)]dd <-datadist(both)options(datadist='dd')```### Using Generalized Least Squares`r mrg(sound("gls-10"))`We stay with baseline adjustment and use a variety of correlation `r ipacue()`structures, with constant variance. Time is modeled as a restrictedcubic spline with 3 knots, because there are only 3 unique interiorvalues of `week`.```{r k}```AIC computed above is set up so that smaller values are best. Fromthis the continuous-time AR1 and exponential structures are tied forthe best. For the remainder of the analysis use `corCAR1`,using `Gls`. [ @kes98com did a simulation study to study the reliability of AIC for selecting the correct covariance structure in repeated measurement models. In choosing from among 11 structures, AIC selected the correct structure 47% of the time. @gur11avo demonstrated that fixed effects in a mixed effects model can be biased, independent of sample size, when the specified covariate matrix is more restricted than the true one.]{.aside}```{r l}```$\hat{\rho} = 0.8672$, the estimate of the correlation between two `r ipacue()`measurements taken one week apart on the same subject. The estimatedcorrelation for measurements 10 weeks apart is $0.8672^{10} = 0.24$.```{r fig-long-variogram,fig.cap='Variogram, with assumed correlation pattern superimposed',h=3.75,w=4.25}#| label: fig-long-variogram```Check constant variance and normality assumptions: `r ipacue()````{r fig-long-resid,h=6,w=7.5,cap='Three residual plots to check for absence of trends in central tendency and in variability. Upper right panel shows the baseline score on the $x$-axis. 
Bottom left panel shows the mean $\\pm 2\\times$ SD. Bottom right panel is the QQ plot for checking normality of residuals from the GLS fit.'}
#| label: fig-long-resid
```

Now get hypothesis tests and estimates, and graphically interpret the
model. `r mrg(sound("gls-11"))`

```{r m}
```

```{r fig-long-anova,cap='Results of `anova.rms` from generalized least squares fit with continuous time AR1 correlation structure',w=5,h=4}
#| label: fig-long-anova
```

```{r fig-long-pleffects,h=5.5,w=7,cap='Estimated effects of time, baseline `TWSTRS`, age, and sex'}
#| label: fig-long-pleffects
```

```{r o}
```

```{r p}
```

```{r q}
```

```{r fig-long-contrasts,h=4,w=6,cap='Contrasts and 0.95 confidence limits from GLS fit'}
#| label: fig-long-contrasts
```

Although multiple d.f. tests such as total treatment effects or `r ipacue()`
treatment $\times$ time interaction tests are comprehensive, their
increased degrees of freedom can dilute power. In a treatment
comparison, treatment contrasts at the last time point (single
d.f. tests) are often of major interest. Such contrasts are informed
by all the measurements made by all subjects (up until dropout times)
when a smooth time trend is assumed.

```{r nomogram,h=6.5,w=7.5,cap='Nomogram from GLS fit. Second axis is the baseline score.'}
#| label: fig-long-nomogram
n <- nomogram(a, age=c(seq(20, 80, by=10), 85))
plot(n, cex.axis=.55, cex.var=.8, lmgp=.25)  # Figure (*\ref{fig:longit-nomogram}*)
```

### Bayesian Proportional Odds Random Effects Model {#sec-long-bayes-re}

* Develop a $y$-transformation-invariant longitudinal model `r ipacue()`
* Proportional odds model with no grouping of TWSTRS scores
* Bayesian random effects model
* Random effects are Gaussian, with an exponential prior distribution (mean 1.0) for their SD
* Compound symmetry correlation structure
* Demonstrates a large amount of patient-to-patient intercept variability

```{r bayesfit}
```

```{r bayesfit2}
```

* Show the final graphic (high dose:placebo contrast as a function of time) `r ipacue()`
* Intervals are 0.95 highest posterior density intervals
* $y$-axis: log odds ratio

```{r bayesfit3}
```

```{r bayesfit4}
```

For each posterior draw compute the difference in means, and get an
exact (to within simulation error) 0.95 highest posterior density
interval for these differences.

```{r bayesfit5,w=7,h=3.75}
```

```{r bayesfit6}
```

### Bayesian Markov Semiparametric Model {#sec-long-bayes-markov}

* First-order Markov model `r ipacue()`
* Serial correlation induced by the Markov model is similar to AR(1), which we already know fits these data
* Markov model is more likely to fit the data than the random effects model, which induces a compound symmetry correlation structure
* Models state transitions
* PO model at each visit, with Y from the previous visit conditioned upon just like any covariate
* Need to uncondition (marginalize) on previous Y to get the time-response profile we usually need
* Semiparametric model is especially attractive because one can easily "uncondition" a discrete Y model, and the distribution of Y for control subjects can be any shape
* Let measurement times be $t_{1}, t_{2}, \dots, t_{m}$, and the measurement for a subject at time $t$ be denoted $Y(t)$
* First-order Markov model:

\begin{array}{ccc}
\Pr(Y(t_{i}) \geq y | X, Y(t_{i-1})) &=& \mathrm{expit}(\alpha_{y} + X\beta \\
&+& g(Y(t_{i-1}), t_{i}, t_{i} - t_{i-1}))
\end{array}

* $g$ involves any number of regression coefficients for a main effect of $t$, a main effect of the time gap $t_{i} - t_{i-1}$ if this is not collinear with absolute time, a main effect of the previous state, and interactions between these
* Examples of how the previous state may be modeled in $g$:
    + linear in numeric codes for $Y$
    + spline function in same
    + discontinuous bi-linear relationship where there is a slope for in-hospital outcome severity, a separate slope for outpatient outcome severity, and an intercept jump at the transition from inpatient to outpatient (or _vice versa_)
* Markov model is quite flexible in handling time trends and serial correlation patterns
* Can allow for irregular measurement times:<br> [hbiostat.org/stat/irreg.html](https://hbiostat.org/stat/irreg.html)

Fit the model and run standard Stan diagnostics.

```{r mark1,h=6,w=7.5}
```

Note that posterior sampling is much more efficient without random effects.

```{r mark2}
```

Let's add subject-level random effects to the model. Smallness of the
standard deviation of the random effects provides support for the
assumption of conditional independence that we like to make for Markov
models, and allows us to simplify the model by omitting random effects.

```{r mark3}
```

```{r mark4}
```

The random effects SD is only 0.11 on the logit scale. Also, the
standard deviations of all the regression parameter posterior
distributions are virtually unchanged by the addition of random
effects:

```{r mark4r,w=7,h=7}
```

So we will use the model omitting random effects.

Show the partial effects of all the predictors, including the effect
of the previous measurement of TWSTRS.
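To make the Markov "unconditioning" concrete before examining the fitted model, here is a minimal self-contained R sketch. The 3-state transition matrix and baseline state are hypothetical values chosen only for illustration (they are not derived from the TWSTRS fit); the sketch just applies the law of total probability recursively to turn one-step transition probabilities into state occupancy probabilities.

```r
# Hypothetical one-step transition probabilities for 3 ordinal states:
# P[j, y] = Pr(Y(t) = y | Y(t-1) = j); each row sums to 1 (assumed values)
P <- rbind(c(.7, .2, .1),
           c(.3, .5, .2),
           c(.1, .3, .6))
p   <- c(1, 0, 0)     # assume every subject starts in state 1 at baseline
occ <- matrix(NA, nrow=4, ncol=3,
              dimnames=list(paste0('t', 1:4), paste0('y=', 1:3)))
for(i in 1:4) {
  # Law of total probability:
  # Pr(Y_t = y) = sum_j Pr(Y_t = y | Y_{t-1} = j) Pr(Y_{t-1} = j)
  p        <- as.vector(p %*% P)
  occ[i, ] <- p       # state occupancy probabilities at time t_i
}
round(occ, 3)
```

In the real analysis the transition probabilities depend on covariates and time and come from the posterior draws of the fitted PO model; the `Hmisc` `soprobMarkovOrdm` function mentioned below carries out this same recursion for every posterior draw.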
Also compute high dose:placebo treatment contrasts on these
conditional estimates.

```{r mark5}
```

```{r mark5b}
```

Using posterior means for parameter values, compute the probability
that at a given week `twstrs` will be $\geq 40$ when at the previous
visit it was 40. Also show the conditional mean `twstrs` when it was
40 at the previous visit.

```{r mark6}
```

```{r mark6b}
```

* Semiparametric models provide not only estimates of tendencies of Y but also estimates of the whole distribution of Y `r ipacue()`
* Estimate the entire conditional distribution of Y at week 12 for high-dose patients having `TWSTRS`=42 at week 8
* Other covariates set to median/mode
* Use the posterior mean of all the cell probabilities
* Also show pointwise 0.95 highest posterior density intervals
* To roughly approximate simultaneous confidence bands, make the pointwise limits sum to 1 as the posterior means do

```{r mark6c}
```

* Repeat this, showing the variation over 5 posterior draws `r ipacue()`

```{r mark6d}
```

* Turn to marginalized (unconditional on previous `twstrs`) quantities `r ipacue()`
* Capitalize on the PO model being a multinomial model, just with PO restrictions
* Manipulation of conditional probabilities to get the unconditional probability that `twstrs`=y doesn't need to know about PO
* Compute all cell probabilities and use the law of total probability recursively
$$\Pr(Y_{t} = y | X) = \sum_{j=1}^{k} \Pr(Y_{t} = y | X, Y_{t-1} = j) \Pr(Y_{t-1} = j | X)$$
* `predict.blrm` method with `type='fitted.ind'` computes the needed conditional cell probabilities, optionally for all posterior draws at once
* Easy to get highest posterior density intervals for derived parameters such as unconditional probabilities or unconditional means
* `Hmisc` package `soprobMarkovOrdm` function (in version 4.6) computes an array of all the state occupancy probabilities for all the posterior draws

```{r mark7}
```

* Use the same posterior draws of unconditional probabilities of all values of TWSTRS to get the posterior distribution of differences in mean TWSTRS between high and low dose `r ipacue()`

```{r mark8}
```

* Get posterior mean estimates of all cell probabilities at week 12 `r ipacue()`
* Distribution of TWSTRS conditional on high dose, median age, mode sex
* Not conditional on week 8 value

```{r mark9}
```

## Study Questions

**Section 7.2**

1. When should one model the time-response profile using discrete time?

**Section 7.3**

1. What makes generalized least squares and mixed effects models relatively robust to non-completely-random dropouts?
1. What does the last observation carried forward method always violate?

**Section 7.4**

1. Which correlation structure do you expect to fit the data when there are rapid repetitions over a short time span? When the follow-up time span is very long?

**Section 7.8**

1. What can go wrong if many correlation structures are tested in one dataset?
1. In a longitudinal intervention study, what is the most typical comparison of interest? Is it best to borrow information in estimating this contrast?

```{r echo=FALSE}
saveCap('07')
```