17  Introduction to Survival Analysis

17.1 Background

  • Use when time to occurrence of event is important
  • Don’t just count events; event at 6m worse than event at 9y
  • Response called failure time, survival time, event time
  • Ex: time until CV death, light bulb failure, pregnancy, ECG abnormality during exercise
  • Allow for censoring
  • Ex: 5y f/u study; subject still alive at 5y has failure time \(5+\)
  • Length of f/u can vary
  • Even in a well-designed randomized clinical trial, survival modeling can allow one to
  1. Test for and describe interactions with treatment. Subgroup analyses can easily generate spurious results and they do not consider interacting factors in a dose-response manner. Once interactions are modeled, relative treatment benefits can be estimated (e.g., hazard ratios), and analyses can be done to determine if some patients are too sick or too well to have even a relative benefit.
  2. Understand prognostic factors (strength and shape).
  3. Model absolute clinical benefit. First, a model for the probability of surviving past time \(t\) is developed. Then differences in survival probabilities for patients on treatments A and B can be estimated. The differences will be due primarily to sickness (overall risk) of the patient and to treatment interactions.
  4. Understand time course of treatment effect. The period of maximum effect or period of any substantial effect can be estimated from a plot of relative effects of treatment over time.
  5. Gain power for testing treatment effects.
  6. Adjust for imbalances in treatment allocation.

17.2 Censoring, Delayed Entry, and Truncation

  • Left-censoring
  • Interval censoring
  • Left-truncation (unknown subset of subjects who failed before qualifying for the study)
  • Delayed entry (exposure after varying periods of survival)
  • Choice of time zero important
  • Take into account waiting time bias
  • Usually have random type I censoring (on duration, not # events)
  • Must usually have non-informative censoring:
    censoring independent of impending failure
  • Intention-to-treat analysis is a preventive measure against informative censoring

17.3 Notation, Survival, and Hazard Functions

  • \(T\): response variable
  • Survival function:
\[ S(t) = \Pr(T > t) = 1 - F(t) \]
Code
require(ggplot2)   # assumed needed for the plots below if not already attached
tt <- c(seq(.0001,.002,by=.001),seq(.002,.02,by=.001),
        seq(.02,1,by=.01))
# Note: bb,dd not stated in usual Weibull form
aa <- .5
bb <- -.5
cc <- 10
dd <- 4
cumhaz <- (aa/(bb+1))*tt^(bb + 1) + (cc/(dd+1))*tt^(dd + 1)
survival <- exp(-cumhaz)
hazard <- ifelse(tt>.001, aa*tt^bb + cc*tt^dd, NA)
ggplot(mapping=aes(x=tt, y=survival)) + geom_line() +
  xlab(expression(t)) + ylab('Survival Function')
Figure 17.1: Survival function
Code
ggplot(mapping=aes(x=tt, y=cumhaz)) + geom_line() +
  xlab(expression(t)) + ylab('Cumulative Hazard Function')
Figure 17.2: Cumulative hazard function
  • Hazard function (force of mortality; instantaneous event rate)
  • \(\approx \Pr(\)event will occur in a small interval around \(t\) given it has not occurred before \(t\))
  • Very useful for learning about mechanisms and forces of risk over time
Code
ggplot(mapping=aes(x=tt, y=hazard)) + geom_line() +
  xlab(expression(t)) + ylab('Hazard Function')
Figure 17.3: Hazard function

\[ \lambda(t) = \lim_{u \rightarrow 0} \frac{\Pr(t < T \leq t+u | T>t)}{u}, \] which, using the law of conditional probability, becomes

\[\begin{array}{ccc} \lambda (t) &=& \lim_{u \rightarrow 0} \frac{\Pr(t < T \leq t+u) / \Pr(T>t)}{u} \nonumber \\ &=& \lim_{u \rightarrow 0} \frac{[F(t+u)-F(t)]/u}{S(t)} \nonumber \\ &=& \frac{\partial F(t)/\partial t}{S(t)} \\ &=& \frac{f(t)}{S(t)} , \nonumber \end{array}\]

Since
\[ \frac{\partial \log S(t)}{\partial t} = \frac{\partial S(t)/\partial t}{S(t)} = -\frac{f(t)}{S(t)}, \]
the hazard can be written as
\[ \lambda (t) = -\frac{\partial \log S(t)}{\partial t}, \]
so that
\[ \int_{0}^{t}\lambda(v)dv = -\log S(t) . \]
The cumulative hazard function is
\[ \Lambda(t) = -\log S(t) , \]
giving
\[ S(t) = \exp[-\Lambda(t)] . \]
  • Expected value of \(\Lambda(T)\) is 1 (\(\Lambda(T)\) has a unit exponential distribution)
  • \(q\)th quantile of the survival time distribution:
\[ T_{q} = S^{-1}(1-q) . \]
  • Median survival time:
\[ T_{0.5} = S^{-1}(0.5) . \]

\[\begin{array}{ccc} T_{q} &=& \Lambda^{-1}[-\log(1-q)] ~\text{and as a special case,} \nonumber \\ T_{.5} &=& \Lambda^{-1}(\log 2) . \end{array}\]

  • Mean survival time:
\[ \mu = \int_{0}^{\infty}S(v)dv \]
  • Event time for subject \(i\): \(T_{i}\)
  • Censoring time: \(C_{i}\)
  • Event indicator:

\[\begin{array}{ccc} e_{i} &=& 1~~\text{if the event was observed}~~(T_{i} \leq C_{i}) , \nonumber \\ &=& 0~~\text{if the response was censored}~~(T_{i} > C_{i}) \end{array}\]
  • The observed response is \[ Y_{i} = \min(T_{i}, C_{i}) , \]
Figure 17.4: Some censored data. Circles denote events.

17.4 Homogeneous Failure Time Distributions

  • Exponential distribution: constant hazard
\[\begin{array}{ccc} \Lambda(t) &=& \lambda t {\rm \ \ and} \nonumber \\ S(t) &=& \exp(-\Lambda(t)) = \exp(-\lambda t) \end{array}\]

\[ T_{0.5} = \log(2)/\lambda \]
  • Weibull distribution

\[\begin{array}{ccc} \lambda(t) &=& \alpha \gamma t^{\gamma -1} \nonumber \\ \Lambda(t) &=& \alpha t^{\gamma} \\ S(t) &=& \exp(-\alpha t^{\gamma}) \nonumber \end{array}\]
Code
tt <- seq(1e-3, 1.2, length=100)
w <- NULL
a <- 1
for(b in c(.5, 1, 2, 4))
    w <- rbind(w, data.frame(gamma=b, tt=tt, y= a * b * tt ^ (b-1)))
ggplot(w, aes(x=tt, y=y, color=factor(gamma))) + geom_line() +
  xlab(expression(t)) + ylab('Hazard Function') +
    guides(color=guide_legend(title=expression(gamma))) +
  theme(legend.position='bottom')
Figure 17.5: Some Weibull hazard functions with \(\alpha=1\) and various values of \(\gamma\)

\[ T_{0.5} =[(\log 2)/\alpha]^{1/\gamma} \]
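As a check on the median formula (not in the original notes), the parameterization above, \(\Lambda(t) = \alpha t^{\gamma}\), corresponds to R's built-in Weibull functions with shape \(\gamma\) and scale \(\alpha^{-1/\gamma}\); the values of \(\alpha\) and \(\gamma\) below are arbitrary.

Code
# Closed-form median under Lambda(t) = alpha * t^gamma vs. R's Weibull
# quantile function (shape = gamma, scale = alpha^(-1/gamma))
alpha <- 2; gamma <- 1.5                               # arbitrary values
(log(2) / alpha) ^ (1 / gamma)                         # formula above
qweibull(0.5, shape=gamma, scale=alpha ^ (-1/gamma))   # should agree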

  • The restricted cubic spline hazard model with \(k\) knots is \[ \lambda_{k}(t) = a + bt + \sum_{j=1}^{k-2} \gamma_{j} w_{j}(t) \]

17.5 Nonparametric Estimation of \(S\) and \(\Lambda\)

17.5.1 Kaplan–Meier Estimator

  • No censoring \(\rightarrow\) \[ S_{n}(t) =[{\rm number\ of\ } T_{i} > t]/n \]
  • Kaplan–Meier (product-limit) estimator
Day   No. Subjects at Risk   Deaths   Censored   Cumulative Survival
 12           100                1        0      \(99/100=.99\)
 30            99                2        1      \(97/99 \times 99/100 = .97\)
 60            96                0        3      \(96/96 \times .97 = .97\)
 72            93                3        0      \(90/93 \times .97 = .94\)
  ⋮             ⋮                ⋮        ⋮             ⋮

\[ S_{{\rm KM}}(t) = \prod_{i:t_{i}<t}(1-d_{i}/n_{i}) \]
  • The Kaplan–Meier estimator of \(\Lambda(t)\) is \(\Lambda_\text{KM}(t) = -\log S_{{\rm KM}}(t)\).
  • Simple example:

\[\begin{array}{c} 1 \ \ 3\ \ 3\ \ 6^{+}\ \ 8^{+}\ \ 9\ \ 10^{+} . \nonumber \end{array}\]
\(i\)   \(t_{i}\)   \(n_{i}\)   \(d_{i}\)   \((n_{i}-d_{i})/n_{i}\)
  1         1           7           1              6/7
  2         3           6           2              4/6
  3         9           2           1              1/2
\[\begin{array}{ccc} S_{\text{KM}}(t) &=& 1, ~~~~0 \le t<1 \nonumber \\ &=& 6/7=.86,~~1 \le t<3 \nonumber \\ &=& (6/7)(4/6)=.57,~~ 3 \le t<9 \\ &=& (6/7)(4/6)(1/2)=.29,~~9 \le t<10 . \nonumber \end{array}\]
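The piecewise estimates above can be verified with a few lines of base R; this is only a by-hand check using the same seven observations.

Code
tt   <- c(1, 3, 3, 6, 8, 9, 10)
stat <- c(1, 1, 1, 0, 0, 1, 0)          # 1=event, 0=censored
etimes <- sort(unique(tt[stat == 1]))   # distinct event times 1, 3, 9
n.risk <- sapply(etimes, function(u) sum(tt >= u))
d      <- sapply(etimes, function(u) sum(tt == u & stat == 1))
data.frame(t=etimes, n=n.risk, d=d,
           S.KM=cumprod((n.risk - d) / n.risk))   # 6/7, 4/7, 2/7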
Code
require(rms)
spar(bty='l')
tt <- c(1,3,3,6,8,9,10)
stat <- c(1,1,1,0,0,1,0)
S <- Surv(tt, stat)
survplot(npsurv(S ~ 1), conf="bands", n.risk=TRUE, xlab="t")
survplot(npsurv(S ~ 1, type="fleming-harrington", conf.int=FALSE),
         add=TRUE, lty=3)
Figure 17.6: Kaplan–Meier product-limit estimator with 0.95 confidence bands. The Altschuler–Nelson–Aalen–Fleming–Harrington estimator is depicted with the dashed lines.

\[ \text{Var}(\log \Lambda_{{\rm KM}}(t)) = \frac{\sum_{i:t_{i} < t} d_{i} / [n_{i}(n_{i}-d_{i})]} {\{\sum_{i:t_{i}<t} \log[(n_{i}-d_{i})/n_{i}]\}^{2}} \]
  • Letting \(s\) denote the square root of this variance and \(z\) the standard normal critical value (1.96 for 0.95 confidence), approximate confidence limits for \(S(t)\) are
\[ S_{\text{KM}}(t)^{\exp(\pm zs)} \]
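As an illustration (not part of the original notes), the limits at the first event time of the small example above can be computed directly; they should agree with survfit using conf.type="log-log", which uses the same Greenwood-based variance.

Code
require(survival)
tt   <- c(1, 3, 3, 6, 8, 9, 10)
stat <- c(1, 1, 1, 0, 0, 1, 0)
# By hand at t = 1: n = 7 at risk, d = 1 event
n <- 7; d <- 1
S  <- (n - d) / n
s2 <- (d / (n * (n - d))) / (log((n - d) / n)) ^ 2   # Var(log Lambda_KM)
S ^ exp(c(1, -1) * qnorm(.975) * sqrt(s2))           # lower, upper 0.95 limits
summary(survfit(Surv(tt, stat) ~ 1, conf.type='log-log'))  # compare first row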

17.5.2 Altschuler–Nelson Estimator

\[\begin{array}{ccc} \hat{\Lambda}(t) &=& \sum_{i:t_{i}<t}\frac{d_{i}}{n_{i}} \nonumber \\ S_{\Lambda}(t) &=& \exp(-\hat{\Lambda}(t)) \end{array}\]
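For the same small example, a by-hand sketch of this estimator (small differences from survfit's fleming-harrington option can arise from tie handling):

Code
tt   <- c(1, 3, 3, 6, 8, 9, 10)
stat <- c(1, 1, 1, 0, 0, 1, 0)
etimes <- sort(unique(tt[stat == 1]))
n.risk <- sapply(etimes, function(u) sum(tt >= u))
d      <- sapply(etimes, function(u) sum(tt == u & stat == 1))
Lambda <- cumsum(d / n.risk)   # Nelson-Aalen cumulative hazard estimate
exp(-Lambda)                   # corresponding survival estimate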

17.6 Analysis of Multiple Endpoints

  • Cancer trial: recurrence of disease or death
  • CV trial: nonfatal MI or death
  • Analyze usual way but watch out for differing risk factors
  • Analyzing multiple causes of terminating event \(\rightarrow\)
    • Cause-specific hazards, censor on cause not currently analyzed
    • Not assume mechanism for cause removal or correlations of causes
    • Problem if apply to a setting where causes are removed differently
  • More complex if explicitly handle mixture of nonfatal outcomes with fatal outcome

17.6.1 Competing Risks

  • Events independent \(\rightarrow\) analyze separately, censoring others
  • \(\rightarrow\) unbiased estimate of cause-specific \(\lambda(t)\) or \(S(t)\) since censoring non-informative
  • \(1 - S_{\text{KM}}(t)\) estimates Pr(failing from the event in absence of other events)
  • Method can work with dependent causes but interpretation difficult
  • See Larson & Dinse (1985) for joint model of time until any failure and the type of failure
  • See Duc & Wolbers (n.d.) for an interesting smooth semi-nonparametric cumulative incidence estimator under competing risks, incorporating a joint distribution of event time and type

17.6.2 Competing Dependent Risks

  • Ordinary K-M estimator biased
  • Suppose cause \(m\) is of interest
  • Cause-specific hazard function:

\[\lambda_{m}(t) = \lim_{u \rightarrow 0} \frac{\Pr(\text{fail from cause}~m~ \text{in}~ [t, t+u) | \text{alive at} ~t)}{u}\]

  • The cumulative incidence function or probability of failure from cause \(m\) by time \(t\) is given by

\[ F_{m}(t) = \int_{0}^{t} \lambda_{m}(u)S(u)du \]

where \(S(u)\) is the probability of surviving (ignoring cause of death), which equals \(\exp(-\int_{0}^{u}(\sum \lambda_{m}(x))dx)\)

  • Note that interpretation of cumulative incidence in the presence of competing risks can be difficult, e.g., probability of having a heart attack that precedes death
  • \(1-F_{m}(t) = \exp(-\int_{0}^{t}\lambda_{m}(u)du)\) only if failures due to other causes are eliminated and if the cause-specific hazard of interest remains unchanged in doing so.
  • Nonparametric estimator of \(F_{m}(t)\):

\[ \hat{F}_{m}(t) = \sum_{i:t_{i}\leq t}\frac{d_{mi}}{n_{i}} S_{\text{KM}}(t_{i-1}) \]

where \(d_{mi}\) is the number of failures of type \(m\) at time \(t_i\) and \(n_i\) is the number of subjects at risk of failure at \(t_i\).
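A sketch of this estimator with hypothetical data for two causes of failure, compared against the multi-state (Aalen–Johansen) estimate from survfit; the times, causes, and labels below are made up for illustration.

Code
require(survival)
# Hypothetical data: 0=censored, 1=cause of interest, 2=competing cause
tt    <- c(2, 3, 5, 5, 7, 8, 10, 12, 14, 15)
cause <- c(1, 0, 2, 1, 1, 0,  2,  1,  0,  2)
etimes <- sort(unique(tt[cause != 0]))
n.risk <- sapply(etimes, function(u) sum(tt >= u))
d1     <- sapply(etimes, function(u) sum(tt == u & cause == 1))
dall   <- sapply(etimes, function(u) sum(tt == u & cause != 0))
Skm    <- cumprod(1 - dall / n.risk)    # all-cause Kaplan-Meier S(t_i)
Sprev  <- c(1, head(Skm, -1))           # S_KM at the previous event time
F1     <- cumsum(d1 / n.risk * Sprev)   # cumulative incidence of cause 1
data.frame(t=etimes, F1=F1)
# Multi-state estimate (first factor level treated as censoring)
f <- survfit(Surv(tt, factor(cause, 0:2, c('censor','cause1','cause2'))) ~ 1)
summary(f)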

  • Pepe (1991), Pepe et al. (1991), and Pepe & Mori (1993) show how to use combinations of K-M estimators to estimate Pr(free of event 1 by time \(t\) given event 2 has not occurred by \(t\))

  • Suppose event 1 is not a terminating event (e.g., death); even after event 1, subjects are followed to find occurrences of event 2

  • \(\Pr(T_{1} > t | T_{2} > t)\):

\[\begin{array}{ccc} \Pr(T_{1} > t | T_{2} > t) &=& \frac{\Pr(T_{1} > t ~\text{and}~ T_{2} > t)}{\Pr(T_{2} > t)} \nonumber \\ &=& \frac{S_{12}(t)}{S_{2}(t)}, \end{array}\]

where \(S_{12}(t)\) is the survival function for \(\min(T_{1},T_{2})\)

  • Estimate of Pr(subject still alive at \(t\) will be free of MI at \(t\)):
    \(S_{\text{KM}_{12}}(t) / S_{\text{KM}_{2}}(t)\) (see the code sketch at the end of this section)

  • Can also easily compute Pr(event 1 occurs by time \(t\) and that event 2 has not occurred by \(t\)) \(\rightarrow\)
    \(S_{2}(t) - S_{12}(t) = [1 - S_{12}(t)] - [1 - S_{2}(t)]\)

  • Crude survival functions come from marginal distributions, i.e., \(\Pr(T_{1} > t)\) whether or not event 2 occurs

\[\begin{array}{ccc} S_{c}(t) &=& 1 - F_{1}(t) \nonumber \\ F_{1}(t) &=& \Pr(T_{1} \leq t) \end{array}\]
  • \(F_{1}(t)\) is crude incidence function
  • \(T_{1} < t \rightarrow\) occurrence of event 1 is part of the prob. being computed
  • Event 2 terminating \(\rightarrow\) some subjects can never suffer event 1 \(\rightarrow\)
    crude survival fctn. for \(T_{1}\) never drops to zero
  • Crude survival fctn interpreted as surv. dist. of \(W\) where \(W=T_{1}\) if \(T_{1} < T_{2}\) and \(W=\infty\) otherwise
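The ratio and difference quantities above can be computed from two ordinary Kaplan–Meier fits. A minimal sketch with hypothetical data: y2 and e2 are the follow-up time and indicator for death, and y12 and e12 are the time and indicator for the earlier of MI or death.

Code
require(survival)
# Hypothetical data (times in months)
y2  <- c(5, 12, 12, 8, 12, 10, 12, 12); e2  <- c(1, 0, 0, 1, 0, 1, 0, 0)
y12 <- c(2, 12,  4, 8,  6, 10,  3, 12); e12 <- c(1, 0, 1, 1, 1, 1, 1, 0)
f12 <- survfit(Surv(y12, e12) ~ 1)   # survival of min(T1, T2)
f2  <- survfit(Surv(y2,  e2 ) ~ 1)   # survival of T2 alone
times <- c(3, 6, 9, 12)
S12 <- summary(f12, times=times)$surv
S2  <- summary(f2,  times=times)$surv
S12 / S2    # Pr(free of MI at t | alive at t)
S2 - S12    # Pr(MI by t and alive at t)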

17.6.3 State Transitions and Multiple Types of Nonfatal Events

Multiple live states, one absorbing state (all-cause mortality)

  • Strauss & Shavelle (1998) extended Kaplan–Meier estimator
  • Estimate \(\pi_{ij}(t_{1}, t_{2})\) that subject in state \(i\) at time \(t_1\) is in state \(j\) \(t_2\) time units later
  • Define \(S_{KM}^{i}(t | t_{1})\) = Kaplan–Meier estimate of Pr(surviving \(t\) additional years for cohort of subjects beginning follow-up at time \(t_{1}\) in state \(i\))
  • Then

\[ \pi_{ij}(t_{1},t_{2}) = \frac{n_{ij}(t_{1},t_{2})}{n_{i}(t_{1},t_{2})} S_{KM}^{i}(t_{2} | t_{1}) \]

where \(n_{i}(t_{1},t_{2})\) is the number of subjects in live state \(i\) at time \(t_{1}\) who are alive and uncensored \(t_{2}\) time units later, and \(n_{ij}(t_{1},t_{2})\) is the number of such subjects in state \(j\) \(t_{2}\) time units beyond \(t_{1}\).

17.6.4 Joint Analysis of Time and Severity of an Event

  • Can give more weight to an event that occurs earlier or is more severe
  • Berridge & Whitehead (1991)
  • Ordinal scale for severity
  • Severity measured at time of (single) event
  • Ex: time until first headache / severity of headache
  • “joint hazard function” for ordinal severity category \(j\):

\[ \lambda_{j}(t) = \lambda(t) \pi_{j}(t), \]

where \(\pi_{j}(t)\) is the probability that an event occurring at time \(t\) falls in severity category \(j\)

  • Allows shift in dist. of response severity as \(t \uparrow\)

17.6.5 Analysis of Multiple Events

  • Ex: MI, ulcer, pregnancy, infection
  • Analysis of time to first event may lose info
  • Specialized multivariate failure time methods exist
  • Simpler to model marginal distributions
  • One record per event per subject
  • Can have # previous events as covariable
  • Correct variances for intra-subject correlation using clustered sandwich estimator
  • Method can handle multiple events of differing types
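A minimal sketch of the one-record-per-event setup with a marginal Cox model and a clustered sandwich variance estimator; the data frame and its values are hypothetical, with intervals in counting-process (start, stop] form.

Code
require(survival)
d <- data.frame(
  id     = c(1, 1, 2, 3, 3, 3),       # subject identifiers
  start  = c(0, 4, 0, 0, 2, 5),       # interval start
  stop   = c(4, 9, 7, 2, 5, 8),       # interval end (event or censoring time)
  status = c(1, 0, 1, 1, 1, 0),       # 1=event at stop, 0=censored
  nprev  = c(0, 1, 0, 0, 1, 2))       # number of previous events (covariable)
# cluster(id) requests a robust sandwich variance accounting for
# intra-subject correlation across a subject's multiple records
f <- coxph(Surv(start, stop, status) ~ nprev + cluster(id), data=d)
summary(f)   # the 'robust se' column reflects the clustered estimator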

17.6.6 Generalization

Longitudinal ordinal models based on a discrete time state transition Markov process provide a unified framework for most of the contexts discussed in this chapter. See the last chapter of these RMS course notes.

17.7 R Functions

  • event.chart in Hmisc draws a variety of charts for displaying raw survival time data, for both single and multiple events per subject (see also event.history, event.chart2)
  • Analyses in this chapter can be done as special cases of the Cox model
  • Particular functions for this chapter (no covariables) from Therneau:
  • Surv function: Combines time to event variable and event/censoring indicator into a single survival time matrix object
  • Right censoring: Surv(y, event); event is event/censoring indicator, usually coded TRUE/FALSE, 0=censored 1=event or 1=censored 2=event. If the event status variable has other coding, e.g., 3 means death, use Surv(y, s==3).
  • survfit: Kaplan–Meier and other nonparametric survival curves using the survival package
  • npsurv: rms package wrapper for survfit
Code
units(y) <- "Month"  # Default is "Day" - used for axis labels, etc.
survfit(Surv(y, event) ~ svar1 + svar2 + ... , data, subset,
        na.action=na.delete,
        type=c("kaplan-meier", "fleming-harrington", "fh2"),
        error=c("greenwood", "tsiatis"), se.fit=TRUE,
        conf.int=.95,
        conf.type=c("log", "plain", "log-log"))

If there are no stratification variables (svar1, …), use ~ 1 as the right-hand side of the formula. To print a table of estimates, use

Code
f <- survfit(...)
print(f)    # print brief summary of f
summary(f, times, censored=FALSE)

For failure times stored in days, use

Code
f <- survfit(Surv(futime, event) ~ sex)
summary(f, seq(30,180,by=30))

to print monthly estimates.

To plot the object returned by survfit, use

Code
plot(f, conf.int=TRUE, mark.time=TRUE, mark=3, col=1, lty=1,
     lwd=1, cex=1, log=FALSE, xscale=1, yscale=1,
     xlab="", ylab="", xaxs="S", ...)

This invokes plot.survfit. For more options, including automatic curve labeling and display of the number of subjects at risk at selected times, use npsurv and survplot in the rms package. See the code for Figure 17.6 above.

Stratified estimates, with four treatments distinguished by line type and curve labels, could be drawn by

Code
require(rms)
units(y) <- "Year"
f <- npsurv(Surv(y, stat) ~ treatment)
survplot(f, ylab="Fraction Pain-Free")
  • bootkm function in Hmisc bootstraps Kaplan–Meier survival estimates or Kaplan–Meier estimates of quantiles of the survival time distribution. It is easy to use bootkm to compute for example a nonparametric confidence interval for the ratio of median survival times for two groups.
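A sketch of that calculation with simulated (hypothetical) data; bootkm returns a vector of bootstrap estimates of the requested quantile, so a percentile interval for the ratio of group medians follows directly.

Code
require(Hmisc)
require(survival)
set.seed(1)
# Simulated two-group data for illustration only
tx   <- rep(c('A', 'B'), each=100)
ft   <- c(rexp(100, rate=.15), rexp(100, rate=.25))   # failure times
cens <- runif(200, 0, 15)                             # censoring times
S    <- Surv(pmin(ft, cens), ft <= cens)
medA <- bootkm(S[tx == 'A', ], q=.5, B=500)   # bootstrap medians, group A
medB <- bootkm(S[tx == 'B', ], q=.5, B=500)   # bootstrap medians, group B
quantile(medB / medA, c(.025, .975), na.rm=TRUE)   # 0.95 CI for ratio of medians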