Implications of the Draft FDA Bayesian Guidance

Frank Harrell

Department of Biostatistics
Vanderbilt University School of Medicine
Nashville, Tennessee, USA

AstraZeneca Global Statistical Forum
Washington Statistical Society

2026-05-21

FDA Draft Bayesian Guidance

  • Use of Bayesian Methodology in Clinical Trials of Drug and Biological Products Guidance for Industry
  • Released January 2026
  • Longstanding efforts of many biostatisticians in FDA CDER & CBER

Immediate Benefits

  • Satisfies a long-time need at FDA
  • CDRH published a Bayesian guidance 20 years ago under the leadership of Greg Campbell
  • Drugs and Biologics have never had one
  • Further legitimizes Bayes in general, not just for special cases
  • Provides a pathway for Bayesian designs and analyses
  • Communicates to industry that Bayesian proposals are welcomed
  • Increases awareness of Bayes and Bayes training at pharma & FDA (statisticians + reviewers)

What Problems Needed To Be Addressed?

  • Reluctance of sponsors to use Bayesian methods in Phase III trials
  • Greater flexibility of RCT designs
  • Better actionability of efficacy evidence measures
  • Formal incorporation of extra-study information
  • Pathway to less arbitrary handling of multiplicities
  • Paradigm simplification

Paradigm Simplification

  • All applications of Bayes were previously required to compute α
  • α requires complex simulations that violated the likelihood principle
  • Had to factor in intentions

Example Need for Paradigm Simplification

  • Asthmatx Alair bronchial thermoplasty for refractory asthma
  • Placebo bronchial ablation - strong control
  • Bayesian decision criterion, flat prior:
    • P(benefit) > 0.99 at interim look
    • P(benefit) > 0.964 at final look
  • Controls overall α at 0.05

Example, continued

  • Enrollment far exceeded expectations
  • No time for interim look
  • Final posterior prob. 0.96
  • Failure of primary endpoint, which required 0.964
  • Failed because of an intended look that never happened
  • Hybrid Bayesian/frequentist procedures create complexity and confusion and violate the likelihood principle (data that could be obtained but were not are irrelevant)

fharrell.com/post/hybrid
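
How a planned look raises the final threshold can be sketched with a small simulation. This is a minimal sketch under stated assumptions, not the trial's actual design: a normally distributed test statistic, a flat prior (so that P(benefit | data) = Φ(z)), and an interim look at half the information. The function name `assertion_frequency` and the 0.5 information fraction are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def posterior_prob_benefit(z):
    # Flat prior + normal likelihood: P(effect > 0 | data) = Phi(z)
    return STD_NORMAL.cdf(z)

def assertion_frequency(thresholds, fractions, n_sim=20000, seed=1):
    """Share of simulated no-effect trials in which any look crosses its threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        score, t_prev = 0.0, 0.0
        for threshold, t in zip(thresholds, fractions):
            # Independent increment of the score process between looks
            score += rng.gauss(0.0, (t - t_prev) ** 0.5)
            t_prev = t
            z = score / t ** 0.5
            if posterior_prob_benefit(z) > threshold:
                hits += 1
                break
    return hits / n_sim

# Planned design: interim look (ASSUMED at half the information) plus final look
two_looks = assertion_frequency([0.99, 0.964], [0.5, 1.0])
# What actually happened: one final look, where 0.95 gives one-sided alpha = 0.05
one_look = assertion_frequency([0.95], [1.0])

observed = 0.96  # final posterior probability reported on the slide
print(f"two looks: {two_looks:.3f}, one look: {one_look:.3f}")
print("meets 0.964:", observed > 0.964, "| meets 0.95:", observed > 0.95)
```

The observed 0.96 clears the single-look threshold of 0.95 but not the 0.964 threshold that was raised to pay for a look that never occurred.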

Evidence Quantification and Actionability

  • Longstanding problem in statistics: α was never an error probability
  • α = probability of making an assertion
  • Cannot call something an error by assuming the very fact that would make it one
  • Translating P(getting data even more impressive if treatment doesn't work) to P(treatment works) is a leap
  • Regulator's regret cannot be computed from α or p-values
  • It is 1 - P(efficacy > 0 | data, prior)
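
The regret quantity can be illustrated with a conjugate normal model. This is a minimal sketch, assuming a normal prior on the treatment effect and a normally distributed effect estimate with known standard error. The numbers and the function names `normal_posterior` and `regret` are hypothetical.

```python
from statistics import NormalDist

def normal_posterior(prior_mean, prior_sd, estimate, se):
    """Conjugate normal update: precision-weighted average of prior and data."""
    w_prior, w_data = 1 / prior_sd**2, 1 / se**2
    post_var = 1 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * estimate)
    return post_mean, post_var**0.5

def regret(prior_mean, prior_sd, estimate, se):
    """1 - P(efficacy > 0 | data, prior), i.e. posterior P(efficacy <= 0)."""
    post_mean, post_sd = normal_posterior(prior_mean, prior_sd, estimate, se)
    return NormalDist(post_mean, post_sd).cdf(0.0)

estimate, se = 0.30, 0.15  # hypothetical effect estimate and standard error
p_one_sided = 1 - NormalDist().cdf(estimate / se)
regret_skeptical = regret(0.0, 0.20, estimate, se)   # skeptical prior
regret_vague = regret(0.0, 1000.0, estimate, se)     # nearly flat prior

print(f"one-sided p-value:        {p_one_sided:.4f}")
print(f"regret, skeptical prior:  {regret_skeptical:.4f}")
print(f"regret, nearly flat prior:{regret_vague:.4f}")
```

In this simple setting the nearly flat prior gives a regret that numerically matches the one-sided p-value, but only the posterior quantity is a statement about the treatment. The skeptical prior gives a larger regret.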

Asking a Bayesian to Compute α is Like ...

  • telling a patient the specificity of the test he already underwent (P(test - | no disease))
  • asking a poker player winning $10M/year to justify his ranking by how often he places bets in games he didn't win
  • judging a politician by how often he speaks when he's honest instead of how often he lies when he speaks

Historical Note: Illusion of Objectivity

Fisher touted his approach as being objective by

  • sneakily changing the question (does the treatment work?)
  • to a different question that could be answered without computers and without any prior knowledge:
    • If the treatment doesn't work, how surprised should we be by data more impressive than ours?

Paradigm Contrast in a Nutshell

  • Do you want the probability of a positive result when there is nothing?
  • Or the probability that a positive result turns out to be nothing?

Most Important Aspects of the Guidance

  • Documents that Bayes has its own operating characteristics
  • These OCs have nothing to do with α
  • They are P(correct decisions from data)
  • Bayesian OCs can only be poor when study generating prior (simulation/sampling prior) is at odds with analysis prior

OC Simulations of Interest

  • Simulate P(correct decision) under mismatch of the data-generating and analysis priors
  • Example decision rule:
  • Primary OC is
  • Simulate 1000 RCTs with a universe of effects =
  • Find the subset of the 1000 for which the decision rule is triggered (under the analysis prior)
  • Accuracy = proportion of this subset for which generating the data (from ) was actually

hbiostat.org/bayes/design
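
A simulation of this kind can be sketched as follows. This is a minimal sketch, not the simulation at the link above. It assumes normal priors centered at zero, a normal effect estimate with known standard error, a decision rule P(effect > 0 | data) > 0.95, and "correct" meaning that the true effect is positive. All of these, along with the function name `decision_accuracy`, are assumptions made for illustration.

```python
import random
from statistics import NormalDist

def decision_accuracy(gen_sd, analysis_sd, se=0.15, cutoff=0.95,
                      n_sim=1000, seed=2):
    """Among simulated trials that trigger the rule, share with a true effect > 0."""
    rng = random.Random(seed)
    triggered = correct = 0
    for _ in range(n_sim):
        theta = rng.gauss(0.0, gen_sd)    # true effect drawn from generating prior
        estimate = rng.gauss(theta, se)   # observed trial estimate
        # Posterior under the analysis prior N(0, analysis_sd^2)
        w_prior, w_data = 1 / analysis_sd**2, 1 / se**2
        post_var = 1 / (w_prior + w_data)
        post_mean = post_var * w_data * estimate
        p_benefit = 1 - NormalDist(post_mean, post_var**0.5).cdf(0.0)
        if p_benefit > cutoff:            # ASSUMED decision rule
            triggered += 1
            correct += theta > 0
    accuracy = correct / triggered if triggered else float("nan")
    return accuracy, triggered

# Generating prior and analysis prior agree
print("matched priors:   ", decision_accuracy(0.25, 0.25))
# Effects are truly small, but the analysis prior is nearly flat
print("mismatched priors:", decision_accuracy(0.05, 10.0))
```

Varying `gen_sd` against `analysis_sd` shows how far the two priors can disagree before the accuracy of triggered decisions degrades.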

Guidance Provides Two Paths

  • Not thinking hard about the prior
    • Noninformative prior (allows for huge treatment effects)
    • Demonstrate α is conserved
  • Thinking hard about the prior
    • Demonstrate reliability of the efficacy decision rule
    • Leads to excellent pure Bayesian OCs

Multiplicity

  • Frequentist: Multiplicity comes from chances you give data to be extreme (e.g., multiple data looks)
    • And from detaching clinical and statistical significance
  • Bayesian: No multiplicity from sequential looks
    • New looks merely make previous looks obsolete
    • No other real multiplicities, just assertions of varying stringency

fharrell.com/post/nfl
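
The claim that a new look simply replaces the previous one can be checked in a conjugate normal model, where updating look by look gives the same posterior as one analysis of all the data. This is a minimal sketch with hypothetical numbers. It assumes independent batches with known variances, and the helper name `update` is not from the source.

```python
def update(prior_mean, prior_var, data_mean, data_var):
    """Conjugate normal update; data_var is the known variance of data_mean."""
    post_var = 1 / (1 / prior_var + 1 / data_var)
    post_mean = post_var * (prior_mean / prior_var + data_mean / data_var)
    return post_mean, post_var

prior = (0.0, 1.0)    # hypothetical analysis prior: mean, variance
look1 = (0.40, 0.04)  # estimate and its variance from the first batch
look2 = (0.20, 0.04)  # estimate and its variance from the second batch

# Sequential: the posterior after look 1 becomes the prior for look 2
sequential = update(*update(*prior, *look1), *look2)

# One analysis of all data: pool the batches by inverse-variance weighting
pooled_var = 1 / (1 / look1[1] + 1 / look2[1])
pooled_mean = pooled_var * (look1[0] / look1[1] + look2[0] / look2[1])
batch = update(*prior, pooled_mean, pooled_var)

print("sequential:", sequential)
print("batch:     ", batch)
```

The two posteriors agree up to floating-point rounding, so the number of looks taken along the way leaves no trace in the final posterior.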

Challenges of Multiplicity Adjustments

  • Not seen as logical in fields outside statistics
    • E.g. why should evidence for A vs B be discounted because we also examined C vs D?
  • Arbitrary with no general principle to dictate choices
  • Done on the randomness scale; should be on the clinical scale

Bayes Provides Logical Clinical Ways to Handle Multiplicities

  • Multiplicity has been miscast as a randomness issue all along
  • Multiplicity = setting the bar too low in the assertions
  • E.g. easy for P(treatment benefit on any of the endpoints) to be high
  • Intersection assertions: Asserting that 4 things are simultaneously true is a high bar

Dealing with Union Assertions

  • Make the bar high enough by demanding evidence for more than trivial effects
  • E.g. P(any mortality effect or 10% reduction in nonfatal endpoint) > 0.95

fharrell.com/bayes/bet/multiplicity
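
Probabilities of union and intersection assertions are simple to compute from posterior draws. This is a minimal sketch using hypothetical draws of log ratios for mortality and for a nonfatal endpoint. The draws are generated independently from normal distributions purely for illustration, whereas a real analysis would use joint draws from the fitted model. A 10% reduction corresponds to a ratio below 0.9.

```python
import math
import random

rng = random.Random(3)
n_draws = 10000
# HYPOTHETICAL posterior draws of log ratios (treatment vs. control)
log_hr_death = [rng.gauss(-0.05, 0.10) for _ in range(n_draws)]
log_rr_nonfatal = [rng.gauss(-0.08, 0.06) for _ in range(n_draws)]
pairs = list(zip(log_hr_death, log_rr_nonfatal))

def prob(flags):
    """Proportion of posterior draws in which the assertion holds."""
    flags = list(flags)
    return sum(flags) / len(flags)

cut = math.log(0.9)  # at least a 10% reduction in the nonfatal endpoint

# Low bar: any benefit on either endpoint
p_trivial_union = prob(d < 0 or n < 0 for d, n in pairs)
# Higher bar: any mortality benefit, or >= 10% reduction in the nonfatal endpoint
p_stringent_union = prob(d < 0 or n < cut for d, n in pairs)
# Highest bar: both assertions true simultaneously
p_intersection = prob(d < 0 and n < cut for d, n in pairs)

print(f"trivial union:   {p_trivial_union:.3f}")
print(f"stringent union: {p_stringent_union:.3f}")
print(f"intersection:    {p_intersection:.3f}")
```

The ordering intersection ≤ stringent union ≤ trivial union holds for any set of draws, which is the sense in which the stringency of the assertion, not a randomness adjustment, sets the bar.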

Where Do Frequentists Need to Spend Time?

  • data model specification
  • approximations to complex sampling distributions and use of the delta method (with poor confidence coverage)
  • asymptotics
  • sufficient statistics and ancillarity
  • simulating α accounting for intentions
  • multiplicity adjustments
  • gauging evidence for compound assertions
  • ...

Frequentists Don't Spend Enough Time on Sampling Distributions

  • Aside from simple cases, exact p-values and CLs are seldom available
  • When interim analyses are done
    • sampling distributions are extremely complex (can be bimodal)
    • no RCT publications include correct point estimates and CLs
  • Even commonly used models such as binary logistic regression and nonlinear mixed effect models have p-values and CLs of uncertain accuracy

Where Do Bayesians Need to Spend Time?

  • data model specification
  • choice of prior
  • compute time
  • diagnostics for convergence of posterior samples

Summary

The FDA draft Bayesian guidance

  • establishes Bayesian methods as viable options in all contexts
  • demonstrates the practicality of Bayes
  • establishes that Bayes has its own OCs that are distinct from α
  • justifies use of pure Bayesian OCs if the prior is well justified or simulations show that reasonable conflicts between the data-generating and analysis priors still result in excellent decision accuracy

Summary, continued

The guidance

  • aids in understanding that α is not an error probability
  • outlines approaches for prior specification
  • generates more exposure, interest, and demand for training in Bayesian methods

More Information

  • hbiostat.org/bayes
  • Lee, Harrell, LaVange, Spiegelhalter, JAMA 2026
  • hbiostat.org/bayes/design - simulation of Bayesian sequential trial and OCs
  • fharrell.com/post/bthink - Bayesian thinking
  • fharrell.com/post/journal - My Bayesian journey
  • hbiostat.org/bbr/alpha - α vs. decision errors
  • fda.gov/media/190505/download
