# Introduction to Bayes for Evaluating Treatments

# Preface

P-values and the p < 0.05 rule of thumb came into use before the computing revolution. Assuming the null hypothesis is true greatly simplified the model, often requiring only manual calculations. But the traditional straw-man null hypothesis testing approach to establishing statistical evidence about efficacy or safety of a drug has a number of deficiencies, many of them caused by the indirectness of the approach, including the use of probabilities of events that already occurred conditional on facts that are unknown. P-values and type I errors are often assumed to provide the error probabilities regulators need, but the chance of approving an ineffective drug given the data is actually the direct Bayesian posterior probability that efficacy falls below an acceptable level. Furthermore, the rules of logic supporting proof by contradiction, based on certainties and not probabilities, don’t apply to the uncertainties of traditional null hypothesis testing. Just as in medical diagnosis, forward probabilistic thinking leads to optimum decisions. The Bayesian approach involves direct estimation of time-forward probabilities of clinical interest and does not need to concern itself with long-run operating characteristics such as the number of false positives from a large set of imagined exact replications of exactly null clinical trials. Instead, Bayesian methods aim to maximize the probability of making the best decision about drug efficacy and safety for the single problem at hand. The Bayesian approach applies to complex study designs, incorporates applicable prior information, results in cleaner interpretations on a clinical as opposed to randomness scale, and provides a fully self-contained model-based approach to inference needing no after-the-fact adjustments for context/multiplicities. In some ways frequentist hypothesis testing involves modeling noise while Bayesian inference involves modeling signal.
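The contrast between the two kinds of probability can be written side by side. The notation here is an assumption made for illustration and does not come from the text: $\theta$ stands for the unknown treatment effect and $\epsilon$ for the smallest effect considered acceptable.

$$
\begin{aligned}
\text{p-value:} \quad & \Pr(\text{data as or more extreme than observed} \mid \theta = 0) \\
\text{posterior probability of inefficacy:} \quad & \Pr(\theta \le \epsilon \mid \text{observed data})
\end{aligned}
$$

The first conditions on an unknown (the null value of $\theta$) and assigns probability to data. The second conditions on what is known (the data) and assigns probability to the unknown.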

The Bayesian approach is outlined and demonstrated through relatively simple simulations. By focusing on an extreme example in which one analyzes the data up to 500 times for 500 subjects, the advantages of Bayes for evidence generation and saving sample size by earlier stopping are shown. A simulated parallel-group randomized clinical trial with two efficacy endpoints is used to demonstrate how Bayes is used to quantify evidence for efficacy with joint probabilities involving both outcomes, something impossible to calculate in the frequentist paradigm.
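A rough idea of what repeated analysis looks like can be sketched in a few lines of Python. This is not the simulation used in this text but a minimal stand-in under stated assumptions: each subject contributes one normally distributed response with known standard deviation, the prior for the mean effect is normal and centered at zero, and the trial stops as soon as the posterior probability of a positive effect reaches a threshold. The names `posterior_prob_positive` and `run_trial`, the prior standard deviation of 0.5, and the 0.95 threshold are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist


def posterior_prob_positive(total, n, sigma=1.0, prior_sd=0.5):
    """P(mu > 0 | data) for y_i ~ N(mu, sigma^2) with prior mu ~ N(0, prior_sd^2).

    `total` is the sum of the n observed responses. The conjugate normal
    update with known sigma is an illustrative simplification.
    """
    precision = 1.0 / prior_sd**2 + n / sigma**2   # posterior precision
    post_mean = (total / sigma**2) / precision     # prior mean is 0
    post_sd = precision ** -0.5
    return 1.0 - NormalDist(post_mean, post_sd).cdf(0.0)


def run_trial(true_effect, n_max=500, threshold=0.95, seed=1):
    """Look at the data after every subject; stop once P(mu > 0 | data) >= threshold.

    Returns the number of subjects used and the posterior probability at stopping.
    """
    rng = random.Random(seed)
    total = 0.0
    for n in range(1, n_max + 1):
        total += rng.gauss(true_effect, 1.0)       # simulate one more subject
        p = posterior_prob_positive(total, n)
        if p >= threshold:
            return n, p
    return n_max, p


if __name__ == "__main__":
    n_used, prob = run_trial(true_effect=0.3)
    print(f"stopped after {n_used} subjects, P(effect > 0 | data) = {prob:.3f}")
```

The posterior probability computed at any look has the same interpretation regardless of how many looks preceded it, which is why the loop needs no adjustment for the number of analyses.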

Bayesian methods should be used for simple problems as well as for complex situations such as adaptive designs and use of prior data where a frequentist solution is not available. When sponsors and reviewers are comfortable applying and interpreting Bayesian methods in simple cases they will be more able to interpret results in complex situations. When relevant prior data are not available for incorporation into the prior distribution, a little skepticism goes a long way.
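The effect of a skeptical prior can be illustrated with the simplest conjugate case. The sketch below assumes a normal response with known standard deviation and a normal prior centered at zero; the function name `normal_posterior` and all numeric values are hypothetical and chosen only to show the direction of the effect.

```python
from statistics import NormalDist


def normal_posterior(ybar, n, sigma, prior_sd, prior_mean=0.0):
    """Posterior for mu given a sample mean ybar of n observations.

    Assumes y_i ~ N(mu, sigma^2) with sigma known and prior
    mu ~ N(prior_mean, prior_sd^2); both are illustrative simplifications.
    """
    prior_prec = 1.0 / prior_sd**2
    data_prec = n / sigma**2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * ybar)
    return NormalDist(post_mean, post_var**0.5)


if __name__ == "__main__":
    # Same small data set, two priors: nearly flat vs. skeptical about large effects.
    for label, prior_sd in (("nearly flat", 10.0), ("skeptical", 0.25)):
        post = normal_posterior(ybar=0.5, n=10, sigma=1.0, prior_sd=prior_sd)
        print(f"{label:12s} posterior mean = {post.mean:.3f}, "
              f"P(mu > 0 | data) = {1.0 - post.cdf(0.0):.3f}")
```

With only 10 observations the skeptical prior pulls the estimate toward zero and lowers the posterior probability of benefit. As the sample size grows, the data dominate and the two posteriors converge.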