---
theme: gaia
_class: lead
size: 16:9
style: |
  .small-text {
    font-size: 0.75rem;
  }
  /* Use this if a logo is wanted, and comment out the next section
      width and height match the natural aspect ratio ~ 11:4
  section::before {
    content: '';
    position: absolute;
    bottom: 0px;
    left: 0px;
    width: 225px; 
    height: 82px;
    background-image: url('https://hbiostat.org/img/vumc-logo.png');
    background-size: contain;
    background-repeat: no-repeat;
    background-position: center;
    z-index: 999;
  }  */
  
  section::before {
    content: 'Department of Biostatistics\AVanderbilt University School of Medicine';
    white-space: pre;
    position: absolute;
    bottom: 0px;
    left: 0px;
    font-size: 0.90rem;
    /* font-weight: bold; */
    color: blue;
    z-index: 999;
  }

  section.lead::before,
  section.nologo::before {
    display: none;
  }
  section.fullimage::before {
    display: none;   /* suppresses logo */
  }
  section.fullimage footer {
    display: none;   /* suppresses footer */
  }
paginate: true
backgroundColor: #fff
_paginate: false
marp: true
---

<!-- Usage: marp --html rct-questions.md

       To suppress the logo on one slide: put the following inside a standard
       HTML comment on its own line separated by blank lines:  _class: nologo
       
       To include a graphic on a slide by itself and allow it to take up the
       full space, put the following on a line by itself inside a comment and
       sep. by blank lines:  _class: fullimage
       Then include the image using e.g. ![bg fit](fig.svg) surrounded by blank lines.
       
-->


# Questions We Forget To Ask When Designing an RCT

Frank Harrell

<p class="small-text">Department of Biostatistics<br>
Vanderbilt University School of Medicine<br><br>
DIDACT Symposium
2026-04-16</p>

---

## Are You Sure Hypothesis Testing Is the Best Framework?

* Aren't questions more useful than hypotheses?
* Isn't estimating the **amount** of effectiveness the most relevant goal?
* What about basing $N$ and the statistical design on precision?
   + Stay tuned for Emily's presentation
* Or using a Bayesian design to compute P(benefit $> \epsilon$)?
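
---

## Sketch: Computing P(benefit > $\epsilon$)

A minimal sketch of the Bayesian quantity above, assuming a normal posterior for the treatment effect; the posterior mean, SD, and $\epsilon$ are made-up numbers for illustration, not from any real trial.

```python
# Sketch only: assumed normal posterior for the treatment effect;
# post_mean, post_sd, and eps are illustrative numbers, not real data
from scipy.stats import norm

post_mean, post_sd = 0.4, 0.2  # assumed posterior for the effect
eps = 0.1                      # threshold for a non-trivial effect

# Posterior probability that the effect exceeds eps
p_benefit = 1 - norm.cdf(eps, loc=post_mean, scale=post_sd)
print(round(p_benefit, 3))     # → 0.933
```

With a full Bayesian model the posterior would come from MCMC draws rather than a closed-form normal, but the decision quantity is the same one-liner.
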

---

## Do You Need to Demonstrate a Benefit ≥ MCID?

* Observed benefit will have to be non-trivially > MCID to declare success
* Will you consider instead demonstration of a benefit > $\epsilon$?
   + $\epsilon$ = threshold for trivial treatment effect or minimum observable treatment effect, e.g., $\frac{\text{MCID}}{2}$

---

## Are You Aware That Most Fixed N Designs End Equivocally?

* p > 0.05
  + Failed to generate sufficient evidence at the current N to refute the supposition that the treatment is ignorable, at the completely arbitrary $\alpha=0.05$ level
  + Wide confidence interval $\rightarrow$ we know no more than before the study
  + We mainly know the money was spent
  
---

## Equivocal Results, _continued_

* Equivocal results are the $2^\text{nd}$ most common RCT result
* What if randomizing 40 more patients would have resulted in definitive evidence?
* **Avoid** getting to planned study end without reaching a conclusion

---

## Do You Really Need a Fixed Sample Size?

* Will a sequential design work instead?
   + Does the disease/treatment lend itself to sequential trials?
   + Kelley Kidwell will be taking this a major step forward with SMART designs
* Frequentist group sequential design
   + Limited number of looks, fixed maximum $N$
* Bayesian sequential design
   + Unlimited looks, no fixed maximum $N$

---

## Do You Want to Possibly Stop Early for Futility?

* Fixed N designs ending with p > 0.05 at max $N$ typically could have stopped around $\frac{N}{3}$ with the same result
* More general to think of stopping early for inefficacy
* Inefficacy = effect $< \epsilon$, $\epsilon=$ trivial effect threshold
* Stopping for harm, zero benefit, or less than trivial benefit
* Much earlier stopping than using effect $< 0$
* See [this](https://hbiostat.org/bayes/design)
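
---

## Sketch: A Bayesian Inefficacy Stopping Check

The inefficacy rule above can be sketched in a few lines, assuming a normal interim posterior; the posterior numbers and the 0.9 decision threshold are illustrative assumptions, not a recommended rule.

```python
# Illustrative Bayesian inefficacy check: stop when the posterior
# probability that the effect is below the trivial threshold eps is
# high; the interim posterior numbers here are assumptions
from scipy.stats import norm

eps = 0.1                         # trivial-effect threshold
post_mean, post_sd = -0.1, 0.1    # assumed interim posterior

# P(effect < eps): covers harm, zero benefit, and trivial benefit
p_inefficacy = norm.cdf(eps, loc=post_mean, scale=post_sd)
stop_for_inefficacy = p_inefficacy > 0.9   # example decision rule
```

Because inefficacy is effect $< \epsilon$ rather than effect $< 0$, this probability crosses the threshold much earlier in a trial headed nowhere.
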

---

## Quiz

* How far along in an RCT can you have a $N(0,1)$ $z$ statistic $= 1$ and still have a good chance of ultimate success?
* Answer: $\frac{4}{10}$
* How far along in an RCT can you have treatment outcomes in the wrong direction ($z < 0$) and still have a good chance of ultimate success?
* Answer: $< \frac{1}{10}$
* See [Spiegelhalter 1993](https://hbiostat.org/bayes/bet/design#sequential-monitoring-and-futility-analysis)
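
---

## Sketch: Quiz Intuition via Conditional Power

A simplified way to get the flavor of the quiz: model the $z$ statistic as Brownian motion on the score scale and fix the drift at its interim estimate (a "current trend" conditional power calculation; Spiegelhalter's predictive approach averages over a prior instead, so these numbers are rougher than his).

```python
# Sketch: 'current trend' conditional power for final z > 1.96, given
# interim z at information fraction t; a simplification of the
# predictive calculations in Spiegelhalter 1993, not the exact method
import numpy as np
from scipy.stats import norm

def conditional_success(z_interim, t, zcrit=1.96):
    # B(t) = z_interim * sqrt(t) is the score-scale Brownian motion;
    # fix the drift at its interim estimate theta_hat = z_interim/sqrt(t)
    b_t = z_interim * np.sqrt(t)
    theta_hat = z_interim / np.sqrt(t)
    mean_b1 = b_t + theta_hat * (1 - t)   # E[B(1) | B(t), drift fixed]
    sd_b1 = np.sqrt(1 - t)
    return 1 - norm.cdf(zcrit, loc=mean_b1, scale=sd_b1)

# z = 1 at 40% information: success is still quite plausible
print(round(conditional_success(1, 0.4), 2))   # → 0.31
```

The same $z=1$ observed later in the trial leaves much less room for recovery, which is the point of the quiz.
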

---

## How Many Follow-Ups Can You Afford?

* What is the maximum number of follow-ups you can afford and patients will tolerate?
* Are you aware that longitudinal data make each patient contribute more than one patient's worth of information?
   + More dense longitudinal data → higher power
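
---

## Sketch: Information From Repeated Measurements

A back-of-envelope way to quantify the point above, assuming equally correlated within-patient measurements (compound symmetry with correlation $\rho$); the formula is a standard effective-sample-size approximation, and the numbers below are illustrative.

```python
# Effective number of independent observations one patient contributes
# with m equally correlated measurements (compound-symmetry rho);
# an assumed working model, for intuition rather than design
def effective_obs(m, rho):
    return m / (1 + (m - 1) * rho)

print(effective_obs(4, 0.5))   # 4 follow-ups, rho = 0.5 → 1.6
```

So even with strong within-patient correlation, denser follow-up buys real information, though with diminishing returns as $m$ grows.
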

---

## Primary Endpoint Considerations

* If there is only one clear primary endpoint, do you have a solid MCID for it?
* If you don't have a solid single MCID it's best to have an uncertainty distribution for MCID
* $\rightarrow$ Bayesian power / _assurance_
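
---

## Sketch: Assurance by Averaging Power

A hypothetical sketch of assurance: average the frequentist power of a two-sample $z$ test over an uncertainty distribution for the true effect, instead of assuming one fixed MCID. The sample size, test, and effect distribution below are all illustrative assumptions.

```python
# Assurance (Bayesian expected power): power averaged over an
# uncertainty distribution on the true standardized effect;
# all numbers here are illustrative, not from any real design
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n_per_arm = 100
zcrit = norm.ppf(0.975)          # two-sided alpha = 0.05
se = np.sqrt(2 / n_per_arm)      # SE of the difference, sigma = 1

# Uncertainty distribution for the true effect (MCID not pinned down)
effects = rng.normal(loc=0.3, scale=0.1, size=100_000)

power = 1 - norm.cdf(zcrit - effects / se)   # power at each drawn effect
assurance = power.mean()
```

Assurance is typically lower than the power computed at a single optimistic effect size, which is exactly the honesty it buys.
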

---

## Are You Aware That Binary Outcomes Have Minimum Information?

* RCTs with binary Y are larger and **still have lower power** than RCTs with continuous Y
  + See [van Zwet, Harrell, Senn 2026 Stat in Med](https://onlinelibrary.wiley.com/doi/10.1002/sim.70402)
* Better information, power, and interpretability come from [breaking ties in Y](https://fharrell.com/post/ordinal-info)
  + Time to first event violates PH and hides mixtures of event types
  + Quit ignoring deaths that occur after a first nonfatal event
  
---

## Multiple Important Outcomes: Key Questions

* What is your MCID for each outcome, for power calculations and interpretation?
* If different outcomes move in different directions, how do you know which treatment results in patients faring better?
* Can you translate "which treatment improves patient outcomes" to a solid analysis plan?

---

## Multiple Important Outcomes: Analysis Approaches

* Rank order severity of outcomes as of a given day — ask which treatment yields more days in better outcome states for patients
* If you can't rank outcomes, would you consider a general effectiveness assessment?
   + E.g. Bayesian P(treatment benefit on ≥ 2 outcomes out of 5) > 0.95
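
---

## Sketch: P(benefit on ≥ 2 of 5 Outcomes)

A toy version of the general effectiveness assessment above: given posterior draws of the treatment effects on 5 outcomes (simulated here from assumed normal posteriors, not a real analysis), the joint probability is a simple count over draws.

```python
# Toy sketch: P(benefit on >= 2 of 5 outcomes), benefit = effect > 0;
# posterior draws are simulated from assumed normal posteriors, so the
# means/SDs below are illustrative, not estimates from real data
import numpy as np

rng = np.random.default_rng(0)
n_draws = 20_000

# Assumed posterior means and SDs for the 5 outcome effects
means = np.array([0.3, 0.1, -0.05, 0.2, 0.0])
sds   = np.array([0.15, 0.1, 0.1, 0.2, 0.1])

draws = rng.normal(means, sds, size=(n_draws, 5))
p_ge2 = np.mean((draws > 0).sum(axis=1) >= 2)
```

In a real trial the draws would come from the joint posterior of one multivariate model, so between-outcome correlation is handled automatically.
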
   
---

## Are You Aware of Alternatives to Multiplicity Adjustments?

* Prioritization of hypotheses, pre-specification of reporting order: [Cook and Farewell 1996 JRSSA](https://www.jstor.org/stable/2983471)
* Raise the bar for assertions
   + Bayesian P(benefit on ≥ 2 outcomes)
   + Evidence for > 20% benefit on at least one outcome
   
---

## Summary

* Many RCTs are designed on a wing and a prayer and don't consider many uncertainties
* Resource waste is not envisioned at the start but is lamented at the end
* Avoid the frequentist multiplicity mess
* Trialists are averse to change; statisticians need to show leadership
* If you are content with the status quo, don't ask too many questions


