Analysis Methods in Small Populations

Frank Harrell

Department of Biostatistics, Vanderbilt University

Expert Biostatistics Advisor, FDA CDER

`hbiostat.org`

Discussion Board: `datamethods.org`

2023-05-03

- Rare disease perspective
- Making learning process more formal with regular updating
- knew this \(\rightarrow\) saw this \(\rightarrow\) now know this \(\rightarrow\) …

- Bayes is **the** approach to making the integration of pre-study knowledge and new data transparent
- Straightforward statements of evidence from (study + pre-study)
- Static vs. dynamic borrowing through different designs/models/priors
- instead of “dynamic” perhaps “just works better”
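One way "dynamic" borrowing is often operationalized is a robust two-component mixture prior, where the weight on the informative (historical) component is updated by how well each component predicts the new data. A minimal stdlib sketch for a normal mean with known sampling SD; all numbers (prior centers, SDs, weights) are hypothetical:

```python
# Hypothetical sketch of dynamic borrowing via a two-component mixture prior
# for a normal mean with known sampling SD (not any specific trial's model).
import math

def post_normal(mu0, tau0, ybar, se):
    """Conjugate update: prior N(mu0, tau0^2), likelihood N(ybar, se^2)."""
    prec = 1 / tau0**2 + 1 / se**2
    return (mu0 / tau0**2 + ybar / se**2) / prec, math.sqrt(1 / prec)

def marg_lik(mu0, tau0, ybar, se):
    """Marginal density of ybar under N(mu0, tau0^2 + se^2)."""
    v = tau0**2 + se**2
    return math.exp(-(ybar - mu0)**2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def mixture_posterior(w, informative, vague, ybar, se):
    """Borrowing is 'dynamic': the informative component's weight w is
    re-weighted by how well it predicts the observed data."""
    m1 = marg_lik(*informative, ybar, se)
    m2 = marg_lik(*vague, ybar, se)
    w_post = w * m1 / (w * m1 + (1 - w) * m2)
    mean1, _ = post_normal(*informative, ybar, se)
    mean2, _ = post_normal(*vague, ybar, se)
    return w_post, w_post * mean1 + (1 - w_post) * mean2

informative, vague = (1.0, 0.3), (0.0, 10.0)  # historical vs. vague component
# Data agreeing with history keep most weight on the informative component;
# conflicting data shift the weight to the vague component (less borrowing).
print(mixture_posterior(0.5, informative, vague, ybar=0.9, se=0.4))
print(mixture_posterior(0.5, informative, vague, ybar=-1.5, se=0.4))
```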

- Listing of data sources
- Overview of knowledge/belief elicitation
- Intuitive ways to quantify/select borrowing
- Impact of borrowing on results
- Bayesian decision rule for single-arm study
- choice of response probability threshold seems arbitrary
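A typical single-arm Bayesian decision rule declares the treatment promising when the posterior probability that the response probability exceeds some reference value \(p_0\) crosses a cutoff. A stdlib-only Beta-Binomial sketch; the prior, \(p_0 = 0.25\), and the 0.95 cutoff are illustrative choices, which is exactly the arbitrariness noted above:

```python
# Hypothetical single-arm rule: promising if Pr(theta > p0 | data) > cutoff,
# with a Beta(a, b) prior on the response probability theta.
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def post_prob_exceeds(y, n, p0, a=1.0, b=1.0, grid=20000):
    """Pr(theta > p0 | y responses in n); posterior is Beta(a+y, b+n-y).
    Midpoint-rule integration keeps this stdlib-only."""
    a_post, b_post = a + y, b + n - y
    lb = log_beta(a_post, b_post)
    h, total = (1.0 - p0) / grid, 0.0
    for i in range(grid):
        t = p0 + (i + 0.5) * h
        total += math.exp((a_post - 1) * math.log(t)
                          + (b_post - 1) * math.log(1 - t) - lb)
    return total * h

# e.g. 12 responses in 30 patients against a reference rate of 0.25
pp = post_prob_exceeds(y=12, n=30, p0=0.25)
print(round(pp, 3), pp > 0.95)
```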

- Use of predictive probabilities to enable continuous learning
- assumes the \(N\) goal is “magic”
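At an interim look, the predictive probability of eventual success is the probability, averaged over the posterior, that the remaining patients will push the trial over its success criterion. A stdlib sketch under an assumed simple rule (at least `r_min` total responses among `n_total` patients, Beta prior); future responses then follow a beta-binomial distribution:

```python
# Sketch: predictive probability of eventual "success" at an interim look.
# The success rule (>= r_min responses in n_total) and all numbers are
# hypothetical illustrations, not a recommended design.
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binom_pmf(k, m, a, b):
    """Pr(k responses in m future patients | current posterior Beta(a, b))."""
    return math.exp(math.lgamma(m + 1) - math.lgamma(k + 1)
                    - math.lgamma(m - k + 1)
                    + log_beta(a + k, b + m - k) - log_beta(a, b))

def pred_prob_success(y, n_obs, n_total, r_min, a=1.0, b=1.0):
    a_post, b_post = a + y, b + n_obs - y   # posterior after interim data
    m = n_total - n_obs                     # patients still to be observed
    need = max(0, r_min - y)                # future responses still required
    return sum(beta_binom_pmf(k, m, a_post, b_post) for k in range(need, m + 1))

# Interim: 8 responses in the first 20 pts; success needs >= 15 of 40 overall
print(round(pred_prob_success(y=8, n_obs=20, n_total=40, r_min=15), 3))
```

Because this quantity can be recomputed after every patient, \(N\) need not be treated as a fixed "magic" goal; it becomes one of several quantities monitored continuously.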

- Master protocols, peds example
- Mild disagreement: type I assertion probability \(\alpha\) is often counter-productive to making best decisions

- Bias and precision
- Novel adaptive designs, borrowing & combining info
- Information about unknowns as \(N \uparrow\)
- Advantages of Bayesian paradigm
- use of more information, better address uncertainty
- use all available info
- more frequent data looks

- Move away from discrete phases
- Adaptive w/early stopping for toxicity, futility, efficacy; dropping arms
- Handle complex endpoints
- State-of-the-art free software
- Special applications to dose-finding, toxicity limitations
- Minimizing expected \(N\)
- Platforms, master protocols, basket trials
- Avoiding noisy subgroup analysis through borrowing
- Continuous learning, flexibility
- “There is no free lunch. But there are lunch specials.”

- Value of within-subject comparisons
- Observational longitudinal data are valuable but watch for serious confounding/selection biases
- Increase in effective sample size, reduce confounding in obs. studies
- unit of analysis: pt response at one day

- Crossover and \(N\)-of-1
- Sequential randomization (stepped wedge, rand. withdrawal, delayed tx)
- Target trial approach to obs. longitudinal study
- not convinced about propensity methods
- have seen some researchers forget about confounding

- Obs. database fitness-for-purpose
- Raised several time-related considerations
- Introduction to self-controlled studies

- Borrowing from historical data requires **extreme** diligence to avoid cherry picking, and requires **raw** source data
- Borrowing without covariate adjustment is very problematic
- Less need for validation study if not borrowing from historical data

- Biggest bangs for the buck:
- High-resolution outcome with high test-retest reliability
- Doing away with sample size calculations
- Bayesian sequential designs
- emulate physicists: experiment until you have enough evidence for either superiority, inferiority, similarity, or toxicity
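The "experiment until you have enough evidence" idea can be sketched as a Bayesian sequential design: look after every patient and stop as soon as the posterior probability of benefit crosses a cutoff in either direction. The cutoffs, reference rate, and true response rate below are hypothetical:

```python
# Hypothetical Bayesian sequential design: continuous posterior monitoring
# of a binary response with early stopping for efficacy or futility.
import math, random

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def pr_theta_gt(p0, a, b, grid=4000):
    """Pr(theta > p0) for a Beta(a, b) posterior, midpoint rule."""
    lb, h, tot = log_beta(a, b), (1 - p0) / grid, 0.0
    for i in range(grid):
        t = p0 + (i + 0.5) * h
        tot += math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - lb)
    return tot * h

def run_trial(theta_true, p0=0.3, hi=0.95, lo=0.05, n_max=100, seed=0):
    rng = random.Random(seed)
    y = n = 0
    for _ in range(n_max):            # a data look after every patient
        n += 1
        y += rng.random() < theta_true
        pp = pr_theta_gt(p0, 1 + y, 1 + n - y)
        if pp > hi:
            return "efficacy", n
        if pp < lo:
            return "futility", n
    return "inconclusive", n

print(run_trial(theta_true=0.6))   # should tend to stop early for efficacy
print(run_trial(theta_true=0.1))   # should tend to stop early for futility
```

The expected \(N\) under such a rule is data-dependent, which is the sense in which sequential Bayesian designs do away with a fixed sample size calculation.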

- Small \(N\)? \(\uparrow\) outcome information per pt
- Response information per day \(\times\) number of serial measurements/pt
- ARDS example: daily ordinal pt outcome for 28d; each 6d of data contained information about the tx effect equivalent to a new pt

- Idealized example: bone mineral density from DEXA scans at 6m, 7m, 8m, …, 18m after randomization
- 60-level validated patient-oriented outcome scale with clinical event overrides
- Ordinal outcomes assume severities of outcomes can be ordered
- Don’t assume anything about spacings of outcome categories
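The "no spacing assumption" point can be checked directly: rank-based (and proportional-odds) analyses depend only on the ordering of outcome values, so any monotone relabeling of the category codes leaves the statistic unchanged. A small stdlib check with made-up values:

```python
# Demonstration that a rank statistic is invariant to monotone re-spacing
# of ordinal category codes. The two arms' values are made up.
def rank_sum(group, other):
    """Wilcoxon rank-sum statistic for `group` vs `other` (midranks for ties)."""
    pooled = sorted(group + other)
    def midrank(v):
        lo = pooled.index(v) + 1
        hi = lo + pooled.count(v) - 1
        return (lo + hi) / 2
    return sum(midrank(v) for v in group)

a = [0, 1, 1, 2, 4]                              # ordinal codes, one arm
b = [1, 2, 3, 3, 4]                              # the other arm
relabel = lambda xs: [x ** 3 + 10 for x in xs]   # monotone re-spacing
print(rank_sum(a, b), rank_sum(relabel(a), relabel(b)))  # identical: 22.0 22.0
```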

- What may be the most important thing to borrow with a Bayesian design?
- Similarity of tx effect on “hard” clinical events to tx effect on pt functional status/disease severity/QOL

- Worst possible analysis: “responder” analysis
- Rare disease study with \(N=80\) and established ordinal outcome scale
- Reduced outcome to a binary yes/no: whether or not the ordinal scale was reduced \(x\) units below the baseline value
- Effective sample size \(80 \rightarrow 30\)
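The efficiency loss from dichotomization is easy to see by simulation. A hedged stdlib sketch, not a reanalysis of the study above: a hypothetical 10-level ordinal outcome (lower = better), a treatment that shifts scores down one level, and "responder" defined as a final score of 3 or less; the full-scale analysis rejects far more often than the responder analysis at the same \(N\):

```python
# Monte Carlo sketch of power lost by "responder" dichotomization.
# All design choices (10 levels, shift of 1, cutoff of 3, 40/arm) are
# hypothetical illustrations.
import random, statistics

def one_trial(rng, n_per_arm=40, cutoff=3):
    ctl = [rng.randrange(10) for _ in range(n_per_arm)]
    trt = [max(0, rng.randrange(10) - 1) for _ in range(n_per_arm)]
    # z-test on ordinal means (a crude stand-in for a proper ordinal model)
    diff = statistics.mean(ctl) - statistics.mean(trt)
    sd = statistics.pstdev(ctl + trt)
    z_ord = diff / (sd * (2 / n_per_arm) ** 0.5)
    # z-test on "responder" proportions after dichotomizing at the cutoff
    p1 = sum(x <= cutoff for x in ctl) / n_per_arm
    p2 = sum(x <= cutoff for x in trt) / n_per_arm
    pbar = (p1 + p2) / 2
    se = (2 * pbar * (1 - pbar) / n_per_arm) ** 0.5
    z_bin = (p2 - p1) / se if se > 0 else 0.0
    return abs(z_ord) > 1.96, abs(z_bin) > 1.96

rng = random.Random(42)
sims = [one_trial(rng) for _ in range(1000)]
power_ord = sum(s[0] for s in sims) / len(sims)
power_bin = sum(s[1] for s in sims) / len(sims)
print(power_ord, power_bin)   # ordinal analysis should reject more often
```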

- Studies of common diseases tend not to use high-resolution endpoints (and are long and expensive) because
- They can get away with it
- Regulators have not gotten used to borrowing of information across multiple outcomes