DSMB Report for EXAMPLE Trial
knitrSet(lang='markdown', fig.path='figure/')
options(prType='html')
mu <- markupSpecs$html # in Hmisc - HTML markups
frac <- mu$frac
## Generate test data
set.seed(1)
n <- 500
d <- data.frame(country=sample(c('US', 'Canada', 'Spain', 'France',
'Germany'), n, TRUE),
site=sample(1:10, n, TRUE))
d$site <- paste(substring(d$country, 1, 2), d$site, sep='')
d$region <- factor(ifelse(d$country %in% c('US', 'Canada'),
'North America', 'Europe'))
d <- upData(d, edate = as.Date('2005-01-01') +
round(rgamma(n, 2, .01)) - 600 * (country == 'US'),
rdate = edate + round(runif(n, 1, 30)), print=FALSE)
d$rdate[runif(nrow(d)) < 0.5] <- NA # non-randomized subjects )
# with(d, table(region, country))
# For US manually compute # randomized per month
us <- subset(d, country == 'US')
site <- us$site
ed <- us$edate
rd <- us$rdate
months <- difftime(as.Date('2007-12-31'), ed, units='days') /
(365.25 / 12)
m <- max(months)
a <- sum(!is.na(rd)) / as.numeric(m) # .8545774 (agrees with chart)
# Compute maximum months elapsed for each site then sum over sites
maxpersite <- tapply(months, site, max)
b <- sum(!is.na(rd)) / sum(maxpersite)
## 0.0864429 = 47 / 543.6715 chart: .08645 (rounded)
## Suppose there are more subjects enrolled and randomized than really
## made their way into the dataset
denom <- c(enrolled=nrow(d) * 1.1,
randomized=sum(!is.na(d$rdate)) + 10)
sethreportOption(tx.var='treat', denom=denom)
## Initialize file to hold appendix information such as subject IDs
## so all later writing to this file can use append=TRUE
Introduction
Interactive Graphs
Most of the graphs produced here are semi-interactive. One can hover over elements of graphs with the mouse to have detailed information pop up.
Survival Curves
Graphs containing pairs of Kaplan-Meier survival curves show a shaded region centered at the midpoint of the two survival estimates and having a height equal to the half-width of the approximate 0.95 pointwise confidence interval for the difference of the two survival probabilities. Time points at which the two survival estimates do not touch the shaded region denote approximately significantly different survival estimates, without any multiplicity correction.
Accrual
accrualReport(enroll(edate) + randomize(rdate) ~
region(region) + country(country) + site(site),
data=d,
dateRange=c('2005-01-01', '2007-12-31'),
targetN=
data.frame(edate=c(500, 1000), rdate=c(250, 500)),
targetDate=c('2006-01-01', '2007-12-31'),
closeDate='2007-12-31')
Number | Category |
---|---|
5 | Countries |
50 | Sites |
500 | Participants enrolled |
227 | Participants randomized |
4.5 | Participants per site |
50 | Sites randomizing |
4.5 | Subjects randomized per randomizing site |
54.6 | Months from first subject randomized (2003-06-12) to 2007-12-31 |
1838.5 | Site-months for sites randomizing |
36.8 | Average months since a site first randomized |
0.12 | Participants randomized per site per month |
15.4 | Mean days from enrollment to randomization |
16 | Median days from enrollment to randomization |
∟ Participants enrolled over time
The blue line depicts the cumulative frequency. The thick grayscale line represent targets. |
|
∟ Participants randomized over time
The blue line depicts the cumulative frequency. The thick grayscale line represent targets. |
|
∟ Days from enrollment to randomization
Quartiles and mean number of days by region and country |
|
∟ Number of sites × number of participants
Number of sites having the given number of participants |
|
∟ Participants enrolled by region and country
∟ Participants randomized by region and country
∟ Sites that enrolled by region and country
∟ Sites that randomized by region and country
∟ Fraction of enrolled participants randomized by region and country
∟ Participants randomized per month by region and country
∟ Partipants randomized per site per month by region and country
Exclusions
f <- 0.05
d <- upData(d,
subjid = 1 : n,
pend = rbinom(n, 1, .1),
e1 = rbinom(n, 1, f),
e2 = rbinom(n, 1, f),
e3 = rbinom(n, 1, f),
e4 = ifelse(runif(n) < 0.25, NA, rbinom(n, 1, .10)),
tested = rbinom(n, 1, .75),
e5 = ifelse(tested, rbinom(n, 1, .04), NA),
e6 = rbinom(n, 1, f),
e7 = rbinom(n, 1, f),
rndz = rbinom(n, 1, .75),
labels=c(e1='Prior MI', e2='History of Asthma',
e3='History of Upper GI Bleeding',
e4='No Significant CAD', e5='Inadequate Renal Function',
e6='Pneumonia within 6 weeks', e7='Prior cardiac surgery'),
print=FALSE)
erd <- data.frame(subjid = 1 : 50,
loc = sample(c('gastric', 'lung', 'trachea'), 50, TRUE))
# To check warning messages, greportOption denom does not match pend, e1-e7
options(dumpfile='/tmp/z') # for seeing frequencies of all combinations
exReport(~ pending(pend) + e1 + e2 + e3 + e4 + e5 + e6 + e7 +
randomized(rndz) + id(subjid) + cond(e5, 'Tested', tested),
erdata = erd,
whenapp= c(e4='CCTA done'), data=d) #, hc=3.75, h=4)
∟ All combinations of exclusions of non-randomized participants
All combinations of exclusions occurring in non-randomized participants. 9 pending participants were excluded from numerators.∟ Cumulative exclusions
Cumulative number of exclusions (\(y\)-axis) and number of additional exclusions after exclusions placed higher, for participants not actually randomized. Exclusions are sorted by descending number of incremental exclusions. 550 participants were enrolled, 9 non-excluded participants are pending randomization, and 39 participants were excluded. 367 participants were randomized. Note: Number of observations (500) does not equal number officially enrolled (550). Note: Number of enrolled (491) minus number excluded (39) does not match official number randomized (237).◫ Exclusions
Incremental exclusions are those in addition to exclusions in earlier rows. Marginal exclusions are numbers of participants excluded for the indicated reason whether or not she was excluded for other reasons. The three Fractions are based on incremental exclusions.Exclusions |
Incremental Exclusions |
Marginal Exclusions |
Fraction of Enrolled |
Fraction of Exclusions |
Fraction Remaining |
---|---|---|---|---|---|
No Significant CAD (CCTA done, n=383) | 11 | 11 | 0.022 | 0.282 | 0.978 |
Pneumonia within 6 Weeks | 8 | 8 | 0.016 | 0.205 | 0.961 |
History of Upper GI Bleeding | 7 | 8 | 0.014 | 0.179 | 0.947 |
Prior Cardiac Surgery | 5 | 7 | 0.010 | 0.128 | 0.937 |
Prior MI | 4 | 4 | 0.008 | 0.103 | 0.929 |
History of Asthma | 4 | 4 | 0.008 | 0.103 | 0.921 |
Total | 39 | 0.079 | 1.000 | 0.921 |
◫ Exclusions in randomized participants
Frequency of exclusions for participants marked as randomizedExclusion | Frequency |
---|---|
Prior MI | 19 |
History of Asthma | 25 |
History of Upper GI Bleeding | 23 |
No Significant CAD | 26 |
Inadequate Renal Function | 10 |
Pneumonia within 6 Weeks | 20 |
Prior Cardiac Surgery | 11 |
Total Partcipants with Any Exclusion | 114 |
Click to show participant IDs
Exclusion
IDs
Prior MI
12, 27, 29, 33, 50, 86, 92, 111, 142, 160, 178, 213, 227, 294, 307, 385, 427, 482, 489
History of Asthma
1, 3, 29, 32, 73, 86, 92, 101, 130, 148, 167, 198, 204, 250, 272, 275, 282, 295, 332, 371, 414, 427, 432, 466, 483
History of Upper GI Bleeding
10, 14, 36, 65, 72, 86, 108, 110, 121, 176, 246, 266, 317, 328, 360, 363, 374, 385, 424, 444, 472, 484, 492
No Significant CAD
1, 8, 10, 32, 81, 85, 88, 178, 182, 184, 187, 211, 247, 258, 292, 305, 311, 313, 314, 328, 339, 353, 383, 412, 419, 426
Inadequate Renal Function
22, 47, 82, 131, 187, 195, 385, 432, 438, 444
Pneumonia within 6 Weeks
150, 156, 165, 222, 235, 270, 299, 301, 342, 352, 353, 387, 393, 439, 442, 445, 447, 468, 473, 477
Prior Cardiac Surgery
1, 21, 36, 97, 280, 294, 303, 385, 400, 411, 475
Click to see more information about those participants
subjid
loc
1
lung
3
lung
8
lung
10
gastric
12
gastric
14
trachea
21
trachea
22
gastric
27
lung
29
lung
32
trachea
33
trachea
36
lung
47
trachea
50
trachea
Baseline Variables
n <- 100
f <- function(na=FALSE) {
x <- sample(c('N', 'Y'), n, TRUE)
if(na) x[runif(100) < .1] <- NA
x
}
set.seed(1)
outs <- c('home', 'hospitalized', 'organ support', 'dead')
d <- data.frame(x1=f(), x2=f(), x3=f(), x4=f(), x5=f(), x6=f(),
x7=f(TRUE),
age=rnorm(n, 50, 10),
sbp=rnorm(n, 120, 7),
dbp=rnorm(n, 80, 6),
days=sample(1:n, n, TRUE),
race=sample(c('Asian', 'Black/AA', 'White'), n, TRUE),
sex=sample(c('Female', 'Male'), n, TRUE),
treat=sample(c('A', 'B'), n, TRUE),
region=sample(c('North America','Europe'), n, TRUE),
meda=sample(0:1, n, TRUE),
medb=sample(0:1, n, TRUE),
outcome=factor(sample(1:4, n, TRUE), 1:4, outs),
subjid=1:n)
d$days[1] <- NA
d$race[1:10] <- NA
d <- upData(d, labels=c(x1='MI', x2='Stroke', x3='AKI', x4='Migraines',
x5='Pregnant', x6='Other event', x7='MD withdrawal',
race='Race', sex='Sex', treat='treatment',
sbp='Systolic BP', days='Time Since Randomization',
meda='Medication A', medb='Medication B',
outcome='Ordinal Outcome'),
units=c(sbp='mmHg', dbp='mmHg', age='years', days='days'),
print=FALSE)
dasna <- subset(d, region=='North America')
# with(dasna, table(race, treat))
den <- c(enrolled=n + 50, randomized=n, table(d$treat))
sethreportOption(denom=den, tx.var='treat')
dReport(race + sex +
ynbind(x1, x2, x3, x4, x5, x6, x7, label='Exclusions') ~ 1,
head='Overall frequencies of categorical demographic variables and exclusions',
data=d, w=4, h=4.5)
∟ Overall frequencies of categorical demographic variables and exclusions
Overall frequencies of categorical demographic variables and exclusions. N=90 to 100 |
|
|
dReport(race + sex ~ treat,
groups='treat',
head='Categorical demographic variables',
data=d, w=4, h=4.5)
∟ Categorical demographic variables stratified by treatment
Categorical demographic variables stratified by treatment. N=90 to 100 |
|
|
∟ Demographics stratified by region
Demographics stratified by region. N=90 to 100 |
|
|
dReport(race + sex ~ region + treat, data=addMarginal(d, region),
groups='treat',
w=4.75, h=3.75,
head='Demographics')
∟ Demographics stratified by region and treatment
Demographics stratified by region and treatment. N=90 to 100 |
|
|
## Add a new block of variables that apply only to males
dReport(race + sex +
pBlock(race, subset=sex=='Male', label='Race: Males') ~ region,
data=d, groups='region',
head='Demographics with race for males')
∟ Demographics with race for males stratified by region
Demographics with race for males stratified by region. N=39 to 100 |
|
|
# Show raw data and smoothed relationship between age and sbp, stratified.
dReport(sbp ~ age + treat, groups='treat', data=d,
popts=list(fitter=loess, showpts=TRUE), what='xy')
∟ Systolic BP vs. age stratified by treatment
Systolic BP vs. age stratified by treatment. N=100 |
|
|
# Show ECDFs of sbp and age stratified by treatment+reference lines at quartiles
dReport(sbp + age ~ treat, groups='treat', what='ecdf', data=d,
sopts=list(q=(1:3)/4, ncols=2, width=750))
∟ Empirical cumultive distribution functions for systolic BP and age stratified by treatment
Empirical cumultive distribution functions for systolic BP and age stratified by treatment. N=100 |
|
|
# Show means and confidence intervals by treatment as dot chart
f <- function(x) {
x <- x[! is.na(x)]
c(smean.cl.normal(x, na.rm=FALSE), n=length(x))
}
dReport(sbp ~ treat, data=d,
fun = f, head='Mean and confidence limits')
Ordinal Outcome Summary
∟ Stacked bar chart of proportions for ordinal outcome stratified by treatment and sex
Stacked bar chart of proportions for ordinal outcome stratified by treatment and sex. N=100 |
|
|
Medication Usage Over Time
dReport(meda + medb ~ days + treat, what='xy',
groups='treat', data=d,
popts=list(fitter=loess, xlim=c(0, 130), ylim=c(0, 1), width=750),
head='Medication usage',
tail='Tick marks indicate mean measurement times within intervals.')
∟ Medication usage stratified by treatment
Medication usage stratified by treatment. N=100. Tick marks indicate mean measurement times within intervals. |
|
|
# Show number being followed as days since randomization gets larger
# make sure nriskReport doesn't get fooled by duplicate data
d2 <- rbind(d, d)
require(data.table)
nriskReport(days ~ region + id(subjid),
data=addMarginal(d2, region),
head='Number of subjects followed for medication usage')
∟ Number of subjects followed for medication usage stratified by region
Number of subjects followed for medication usage stratified by region |
|
|
# Make up some new visits to have more than 1/subj.
# Make up a new definition of time zero
d2$days[(n + 1) : (2 * n)] <- sample(1 : n, n, TRUE)
nriskReport(days ~ treat + id(subjid), data=d2, time0='PCI')
∟ Number of participants followed at least x days from PCI stratified by treatment
Number of participants followed at least x days from PCI stratified by treatment |
|
|
∟ Number of participants followed at least x days from randomization stratified by treatment and region
Number of participants followed at least x days from randomization stratified by treatment and region |
|
|
# Make a dataset with many records per subject
d <- data.frame(days = sample(30:365, 1000, replace=TRUE),
subjid = sample(1:100, 1000, replace=TRUE))
# Get a richer set of output when there is not stratification
nriskReport(days ~ id(subjid), data=d)
∟ Number of participants followed at least x days from randomization
Number of participants followed at least x days from randomization |
|
|
∟ Distributions of follow-up contacts, with times in days
Distributions of follow-up contacts, with times in days. Top left panel is a histogram showing the distribution of the number of contacts per participant. Top right panel is a histogram showing the distribution of time from randomization to all 1000 contacts. Bottom left panel is a histogram showing the distribution of the longest time gap between contacts per participant. Bottom right panel shows the relationship between the time lapse between randomization and last contact per participant and the average number of contacts for the participant. |
|
|
Time to Hospitalization and Surgery
set.seed(1)
n <- 400
dat <- data.frame(t1=runif(n, 2, 5), t2=runif(n, 2, 5),
e1=rbinom(n, 1, .5), e2=rbinom(n, 1, .5),
cr1=factor(sample(c('cancer','heart','censor'), n, TRUE),
c('censor', 'cancer', 'heart')),
cr2=factor(sample(c('gastric','diabetic','trauma', 'censor'),
n, TRUE),
c('censor', 'diabetic', 'gastric', 'trauma')),
treat=sample(c('a','b'), n, TRUE))
dat$t1[1:7] <- NA
dat <- upData(dat,
labels=c(t1='Time to operation',
t2='Time to rehospitalization',
e1='Operation', e2='Hospitalization',
treat='Treatment'),
units=c(t1='Year', t2='Year'), print=FALSE)
denom <- c(enrolled=n + 40, randomized=400, a=sum(dat$treat=='a'),
b=sum(dat$treat=='b'))
sethreportOption(denom=denom, tx.var='treat')
survReport(Surv(t1, e1) + Surv(t2, e2) ~ treat, data=dat)
∟ Kaplan-Meier cumulative incidence estimates for operation stratified by treatment
Kaplan-Meier cumulative incidence estimates for operation stratified by treatment, along with half-height of 0.95 confidence limits for differences centered at estimate midpoints. \(N\)=393. |
|
|
∟ Kaplan-Meier cumulative incidence estimates for hospitalization stratified by treatment
Kaplan-Meier cumulative incidence estimates for hospitalization stratified by treatment, along with half-height of 0.95 confidence limits for differences centered at estimate midpoints. \(N\)=400. |
|
|
# Show estimates combining treatments
survReport(Surv(t1, e1) + Surv(t2, e2) ~ 1, data=dat,
what='S', times=3, ylim=c(.1, 1))
∟ Kaplan-Meier estimates for operation
Kaplan-Meier estimates for operation, along with pointwise 0.95 confidence bands. \(N\)=393. |
|
|
∟ Kaplan-Meier estimates for hospitalization
Kaplan-Meier estimates for hospitalization, along with pointwise 0.95 confidence bands. \(N\)=400. |
|
|
# Cumulative incidence under completing risks
survReport(Surv(t1, cr1) ~ treat, data=dat, cause=c('cancer', 'heart'))
∟ Cumulative incidence of cancer with competing event heart stratified by treatment
Cumulative incidence of cancer with competing event heart stratified by treatment, along with half-height of 0.95 confidence limits for differences centered at estimate midpoints. \(N\)=393. |
|
|
∟ Cumulative incidence of heart with competing event cancer stratified by treatment
Cumulative incidence of heart with competing event cancer stratified by treatment, along with half-height of 0.95 confidence limits for differences centered at estimate midpoints. \(N\)=393. |
|
|
Adverse Events
For this example, the denominators for the two treatments in the pop-up needles will be incorrect because the dataset did not have subject IDs.
# Original source of aeanonym: HH package
# aeanonym <- read.table(hh("datasets/aedotplot.dat"), header=TRUE, sep=",")
# Modified to remove denominators from data and to generate raw data
# (one record per event per subject)
ae <-
structure(list(RAND = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("a",
"b"), class = "factor"), PREF = structure(c(12L, 12L,
18L, 18L, 26L, 26L, 33L, 33L, 5L, 5L, 27L, 27L, 6L, 6L, 15L,
15L, 22L, 22L, 23L, 23L, 31L, 31L, 17L, 17L, 2L, 2L, 3L, 3L,
13L, 13L, 25L, 25L, 28L, 28L, 14L, 14L, 4L, 4L, 8L, 8L, 19L,
19L, 21L, 21L, 29L, 29L, 10L, 10L, 20L, 20L, 16L, 16L, 32L, 32L,
11L, 11L, 1L, 1L, 30L, 30L, 24L, 24L, 9L, 9L, 7L, 7L),
.Label = tolower(c("ABDOMINAL PAIN",
"ANOREXIA", "ARTHRALGIA", "BACK PAIN", "BRONCHITIS", "CHEST PAIN",
"CHRONIC OBSTRUCTIVE AIRWAY", "COUGHING", "DIARRHEA", "DIZZINESS",
"DYSPEPSIA", "DYSPNEA", "FATIGUE", "FLATULENCE", "GASTROESOPHAGEAL REFLUX",
"HEADACHE", "HEMATURIA", "HYPERKALEMIA", "INFECTION VIRAL", "INJURY",
"INSOMNIA", "MELENA", "MYALGIA", "NAUSEA", "PAIN", "RASH", "RESPIRATORY DISORDER",
"RHINITIS", "SINUSITIS", "UPPER RESP TRACT INFECTION", "URINARY TRACT INFECTION",
"VOMITING", "WEIGHT DECREASE")), class = "factor"), SAE = c(15L,
9L, 4L, 9L, 4L, 9L, 2L, 9L, 8L, 11L, 4L, 11L, 9L, 12L, 5L, 12L,
7L, 12L, 6L, 12L, 6L, 12L, 2L, 14L, 2L, 15L, 1L, 15L, 4L, 16L,
4L, 17L, 11L, 17L, 6L, 20L, 10L, 23L, 13L, 26L, 12L, 26L, 4L,
26L, 13L, 28L, 9L, 29L, 12L, 30L, 14L, 36L, 6L, 37L, 8L, 42L,
20L, 61L, 33L, 68L, 10L, 82L, 23L, 90L, 76L, 95L)), .Names = c("RAND",
"PREF", "SAE"), class = "data.frame", row.names = c(NA,
-66L))
subs <- rep(1 : nrow(ae), ae$SAE)
ae <- ae[subs, c('RAND', 'PREF')]
names(ae) <- c('treat', 'event')
label(ae$treat) <- 'Treatment'
denom <- c(enrolled=1000,
randomized=400,
a=212, b=188)
sethreportOption(tx.var='treat', denom=denom)
eReport(event ~ treat, data=ae, minincidence=.05)
∟ Proportion of adverse events by Treatment
Proportion of adverse events by Treatment sorted by descending risk difference. 3 events with less than 0.05 incidence in all groups are not shown ( :Hyperkalemia, :Rash, :Weight Decrease). |
|
To show hazard rate estimates from exponential distributions (events per person time of exposure), make up total exposure time for patients on each treatment, in person-years.
∟ Rate of adverse events by Treatment
Rate of adverse events by Treatment sorted by descending hazard rate ratio |
|