References
Adcock, C. J. (1997). Sample size determination: A review. The
Statistician, 46, 261–283.
Albert, J. (2007). Bayesian Computation with
R. Springer.
Altman, D. G., & Bland, J. M. (1995). Absence of evidence is not
evidence of absence. BMJ, 311, 485.
Bendtsen, M. (2018). A Gentle Introduction to the
Comparison Between Null Hypothesis Testing and
Bayesian Analysis: Reanalysis of Two
Randomized Controlled Trials. Journal of Medical Internet
Research, 20(10), e10873. https://doi.org/10.2196/10873
Using priors forces us to be more specific and
explicit about what we mean when we say that something is unknown... the
Bayesian approach does not attempt to identify a fixed value for the
parameters and dichotomize the world into significant and
nonsignificant, but rather relies on the researcher to do the scientific
inference and not to delegate this obligation to the statistical
model... the NHST approach is rooted in the idea of being able to redo
the experiment many times (so as to get a sampling distribution). Even
if we can rely on theoretical results to get this sampling distribution
without actually going back in time and redoing the experiment, the
underlying idea can be somewhat problematic. What do we mean by redoing
an experiment? Can we redo a randomized controlled trial while keeping
all things equal and recruiting a new sample from the study
population?... Once we remove ourselves from the dichotomization of
evidence, other things start to take precedence: critically assessing
the models chosen, evaluating the quality of the data, interpreting the
real-world impact of the results, etc.
Berry, D. A. (1987). Interim analysis in clinical trials:
The role of the likelihood principle. Am
Statistician, 41, 117–122. https://doi.org/10.1080/00031305.1987.10475458
Berry, D. A. (2006). Bayesian clinical trials. Nat Rev Drug Discov,
5, 27–36.
excellent review of Bayesian
approaches in clinical trials; "The greatest virtue of the traditional
approach may be its extreme rigour and narrowness of focus to the
experiment at hand, but a side effect of this virtue is inflexibility,
which in turn limits innovation in the design and analysis of clinical
trials. ... The set of “other possible results” depends on
the experimental design. ... Everything that is known is taken as given
and all probabilities are calculated conditionally on known values. ...
in contrast to the frequentist approach, only the probabilities of the
observed results matter. ... The continuous learning that is possible in
the Bayesian approach enables investigators to modify trials in
midcourse. ... it is possible to learn from small samples, depending on
the results, ... it is possible to adapt to what is learned to enable
better treatment of patients. ... subjectivity in prior distributions is
explicit and open to examination (and critique) by all. ... The Bayesian
approach has several advantages in drug development. One is the process
of updating knowledge gradually rather than restricting revisions in
study design to large, discrete steps measured in trials or
phases."
Blume, J. D. (2002). Likelihood methods for measuring statistical
evidence. Stat Med, 21(17), 2563–2599.
Blume, J. D. (2008). How often likelihood ratios are
misleading in sequential trials. Comm Stat Th Meth,
37(8), 1193–1206.
Braun, T. M. (n.d.). Motivating sample sizes in adaptive Phase
I trials via Bayesian posterior credible intervals.
Biometrics. https://doi.org/10.1111/biom.12872
Briggs, W. M. (2017). The Substitute for
p-Values. JASA, 112(519), 897–898. https://doi.org/10.1080/01621459.2017.1311264
Cohen, J. (1994). The earth is round (p < .05). Am Psychologist,
49(12), 997–1003. https://doi.org/10.1037/0003-066x.49.12.997
Cook, R. J., & Farewell, V. T. (1996). Multiplicity considerations
in the design and analysis of clinical trials. J Roy Stat Soc
A, 159, 93–110.
argues that if
results are intended to be interpreted marginally, there may be no need
for controlling experimentwise error rate. FH phrasing: Cook and
Farewell point out that when a strong priority order is pre-specified
for separate clinical questions, and that same order is also the
reporting order (no cherry picking), there is no need for multiplicity
adjustment. This is in contrast with a study whose aim is to find an
endpoint or a patient subgroup that benefits from treatment, a
situation requiring conservative multiplicity adjustment.
Dallow, N., Best, N., & Montague, T. H. (2018). Better decision
making in drug development through adoption of formal prior elicitation.
Pharm Stat, 0(0). https://doi.org/10.1002/pst.1854
Dawid, A. P. (2000). Comment on “The philosophy of
statistics” by D. V.
Lindley. The Statistician, 49, 325–326.
Deming, W. E. (1975). On Probability as a
Basis for Action. Am Statistician,
29(4), 146–152. https://doi.org/10.1080/00031305.1975.10477402
Edwards, W., Lindman, H., & Savage, L. J. (1963). Bayesian
statistical inference for psychological research. Psych Rev,
70(3), 193–242. http://psycnet.apa.org/doi/10.1037/h0044139
Emerson, S. S. (1995). Stopping a clinical trial very early based on
unplanned interim analysis: A group sequential approach.
Biometrics, 51, 1152–1162.
Feinstein, A. R. (1977). Clinical Biostatistics.
C. V. Mosby.
Gelman, A. (2013). P Values and Statistical
Practice. Epi, 24(1), 69–72. https://doi.org/10.1097/ede.0b013e31827886f7
Gelman, A. (2015). Bayesian and Frequentist Regression
Methods. Stat Med, 34(7), 1259–1260. https://doi.org/10.1002/sim.6427
Gelman, A., & Hennig, C. (2017). Beyond subjective and objective
in statistics. http://www.stat.columbia.edu/~gelman/research/published/objectivityr5.pdf
Goodman, S. N. (1999). Toward Evidence-Based Medical
Statistics. 1: The P Value Fallacy. Ann Int
Med, 130(12), 995+. https://doi.org/10.7326/0003-4819-130-12-199906150-00008
Nice language for what happens when scientists use
NHST to justify strong statements in their conclusions and
interpretation; p-value fallacy
Greenland, S., Senn, S. J., Rothman, K. J., Carlin, J. B., Poole, C.,
Goodman, S. N., & Altman, D. G. (2016). Statistical tests,
P values, confidence intervals, and power: A guide to
misinterpretations. Eur J Epi, 31(4), 337–350. https://doi.org/10.1007/s10654-016-0149-3
Best article on misinterpretation of p-values. Pithy
summaries.
Greenwald, A. G., Gonzalez, R., Harris, R., & Guthrie, D. (1996).
Effect sizes and p values: What should be reported and what
should be replicated? Psychophysiology, 33(2),
175–183. https://doi.org/10.1111/j.1469-8986.1996.tb02121.x
Grouin, J.-M., Coste, M., Bunouf, P., & Lecoutre, B. (2007).
Bayesian sample size determination in non-sequential clinical trials:
Statistical aspects and some regulatory considerations.
Stat Med, 26, 4914–4924.
Ionan, A. C., Clark, J., Travis, J., Amatya, A., Scott, J., Smith, J.
P., Chattopadhyay, S., Salerno, M. J., & Rothmann, M. (2022).
Bayesian Methods in Human Drug and
Biological Products Development in CDER and
CBER. Ther Innov Regul Sci. https://doi.org/10.1007/s43441-022-00483-0
Examples of use of Bayes at FDA CDER and CBER
Joseph, L., & Bélisle, P. (1997). Bayesian sample size determination
for normal means and differences between normal means. The
Statistician, 46, 209–226.
Kopp‐Schneider, A., Calderazzo, S., & Wiesenfarth, M. (2019). Power
gains by using external information in clinical trials are typically not
possible when requiring strict type I error control.
Biometrical Journal, 0(0). https://doi.org/10.1002/bimj.201800395
Kruschke, J. K. (2013). Bayesian estimation supersedes the t test. J
Exp Psych, 142(2), 573–603. https://doi.org/10.1037/a0029146
Kruschke, J. K. (2015). Doing Bayesian Data Analysis:
A Tutorial with R, JAGS, and
Stan (Second Edition). Academic Press. http://www.sciencedirect.com/science/book/9780124058880
Kruschke, J. K., & Liddell, T. M. (2017). Bayesian data analysis
for newcomers. 1–23. https://doi.org/10.3758/s13423-017-1272-1
Excellent for teaching Bayesian methods and explaining
the advantages
Kunzmann, K., Grayling, M. J., Lee, K. M., Robertson, D. S., Rufibach,
K., & Wason, J. M. S. (2021). A Review of
Bayesian Perspectives on Sample Size
Derivation for Confirmatory Trials. The American
Statistician, 0(0), 1–9. https://doi.org/10.1080/00031305.2021.1901782
Laptook, A. R., Shankaran, S., Tyson, J. E., Munoz, B., Bell, E. F.,
Goldberg, R. N., Parikh, N. A., Ambalavanan, N., Pedroza, C., Pappas,
A., Das, A., Chaudhary, A. S., Ehrenkranz, R. A., Hensman, A. M., Van
Meurs, K. P., Chalak, L. F., Hamrick, S. E. G., Sokol, G. M., Walsh, M.
C., … Higgins, R. D. (2017). Effect of Therapeutic Hypothermia
Initiated After 6 Hours of Age on
Death or Disability Among Newborns With
Hypoxic-Ischemic Encephalopathy. JAMA, 318(16),
1550+. https://doi.org/10.1001/jama.2017.14972
Lindley, D. V. (1993). The Analysis of Experimental
Data: The Appreciation of Tea and
Wine. Teaching Statistics, 15(1), 22–25.
https://doi.org/10.1111/j.1467-9639.1993.tb00252.x
Mark, D. B., Lee, K. L., & Harrell, F. E. (2016). Understanding the
Role of P Values and Hypothesis
Tests in Clinical Research. JAMA Card,
1(9), 1048–1054. https://doi.org/10.1001/jamacardio.2016.3312
Maxwell, N. (2004). Data Matters: Conceptual
Statistics for a Random World. Key
College Pub. https://books.google.com/books?id=KH5GAAAAYAAJ
McElreath, R. (2016). Statistical rethinking: A
Bayesian course with examples in R and
Stan. http://www.worldcat.org/isbn/9781482253443
Natanegara, F., Neuenschwander, B., Seaman, J. W., Kinnersley, N.,
Heilmann, C. R., Ohlssen, D., & Rochester, G. (2014). The current
state of Bayesian methods in medical product development:
Survey results and recommendations from the DIA Bayesian
Scientific Working Group. Pharm Stat, 13(1),
3–12. https://doi.org/10.1002/pst.1595
Nuzzo, R. (2014). Scientific method: Statistical errors.
Nature News, 506(7487), 150. https://doi.org/10.1038/506150a
Oakes, M. (1986). Statistical Inference: A
Commentary for the Social and Behavioral
Sciences. Wiley.
"It is
incomparably more useful to have a plausible range for the value of a
parameter than to know, with whatever degree of certitude, what single
value is untenable."
Pezeshk, H., & Gittins, J. (2002). A fully Bayesian
approach to calculating sample sizes for clinical trials with binary
responses. Drug Info J, 36, 143–150.
Rigat, F. (2023). A conservative approach to leveraging external
evidence for effective clinical trial design. Pharmaceutical
Statistics, pst.2339. https://doi.org/10.1002/pst.2339
Includes some sample size considerations to ensure
that the prior is not too impactful
Rozeboom, W. (1960). The Fallacy of the
Null-Hypothesis Significance Test. Psychological
Bulletin, 57, 416.
Ruberg, S. J., Beckers, F., Hemmings, R., Honig, P., Irony, T., LaVange,
L., Lieberman, G., Mayne, J., & Moscicki, R. (2023). Application of
Bayesian approaches in drug development: Starting a
virtuous cycle. Nat Rev Drug Discov, 1–16. https://doi.org/10.1038/s41573-023-00638-0
Senn, S. (2013). Being Efficient About Efficacy Estimation.
Statistics in Biopharmaceutical Research, 5(3),
204–210. https://doi.org/10.1080/19466315.2012.754726
"Every time the statistician working in the
pharmaceutical industry does a sample size determination for a trial
using a responder analysis, he or she should do the same calculation
using the original measure. If the dichotomy is preferred, an
explanation as to why the extra millions are going to be spent should be
provided."
Simon, R., & Freedman, L. S. (1997). Bayesian design and analysis of
two × two factorial clinical trials. Biometrics, 53,
456–464.
Spiegelhalter, D. J. (1986). Probabilistic prediction in patient
management and clinical trials. Stat Med, 5, 421–433.
https://doi.org/10.1002/sim.4780050506
z-test for calibration inaccuracy (implemented in Stata and in the R
Hmisc package’s val.prob function)
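A minimal sketch of applying that calibration test via Hmisc::val.prob on simulated data; the element names extracted at the end are what the function labels Spiegelhalter's statistic in recent Hmisc versions and are an assumption here.

```r
library(Hmisc)
set.seed(1)
n <- 500
p <- runif(n)                    # hypothetical predicted probabilities
y <- rbinom(n, 1, p)             # outcomes simulated so the predictions are well calibrated
s <- val.prob(p, y, pl = FALSE)  # pl = FALSE suppresses the calibration plot
s[c("S:z", "S:p")]               # Spiegelhalter z statistic and its p-value (names as assumed)
```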
Spiegelhalter, D. J., Abrams, K. R., & Myles, J. P. (2004).
Bayesian Approaches to Clinical Trials and
Health-Care Evaluation. Wiley.
Spiegelhalter, David J., & Freedman, L. S. (1986). A predictive
approach to selecting the size of a clinical trial, based on subjective
clinical opinion. Stat Med, 5, 1–13. https://doi.org/10.1002/sim.4780050103
Spiegelhalter, D. J., Freedman, L. S., & Parmar, M. K. B. (1993).
Applying Bayesian ideas in drug development and clinical
trials. Stat Med, 12, 1501–1511. https://doi.org/10.1002/sim.4780121516
Vickers, A. J. (2008). Decision analysis for the evaluation of
diagnostic tests, prediction models, and molecular markers. Am
Statistician, 62(4), 314–320.
limitations of accuracy metrics; incorporating clinical
consequences; nice example of calculation of expected outcome; drawbacks
of conventional decision analysis, especially because of the difficulty
of eliciting the expected harm of a missed diagnosis; use of a threshold
on the probability of disease for taking some action; decision curve; has
other good references to decision analysis
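The threshold/decision-curve idea can be illustrated in a few lines of R; this is a generic net-benefit calculation on simulated data using the standard formula TP/N − FP/N × pt/(1 − pt), not code from Vickers (2008).

```r
set.seed(1)
n <- 1000
p <- plogis(rnorm(n))                 # hypothetical predicted probabilities of disease
y <- rbinom(n, 1, p)                  # simulated true disease status
net_benefit <- function(p, y, pt) {   # net benefit of acting when predicted risk >= pt
  act <- p >= pt
  sum(act & y == 1) / length(y) - sum(act & y == 0) / length(y) * pt / (1 - pt)
}
sapply(c(0.1, 0.2, 0.3), function(pt)
  c(threshold = pt,
    model     = net_benefit(p, y, pt),           # act according to model predictions
    treat_all = net_benefit(rep(1, n), y, pt)))  # act on everyone regardless of risk
```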
Wagenmakers, E.-J., Marsman, M., Jamil, T., Ly, A., Verhagen, J., Love,
J., Selker, R., Gronau, Q. F., Šmíra, M., Epskamp, S., Matzke, D.,
Rouder, J. N., & Morey, R. D. (2017). Bayesian inference for
psychology. Part I: Theoretical advantages and
practical ramifications. 1–23. https://doi.org/10.3758/s13423-017-1343-3
Wang, H., Chow, S.-C., & Chen, M. (2005). A Bayesian
Approach on Sample Size Calculation for
Comparing Means. J Biopharm Stat, 15(5),
799–807. https://doi.org/10.1081/bip-200067789
analytic form for the posterior in the normal t-test case
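As a reminder of what such a closed form looks like, here is the standard known-variance conjugate update for a normal mean, written in R; this is a simplification, and the unknown-variance form derived in the paper differs.

```r
## Known-variance conjugate update for a normal mean: prior N(mu0, tau0^2), observed
## mean xbar from n observations with sd sigma; the posterior is again normal.
posterior_normal <- function(xbar, n, sigma, mu0, tau0) {
  prec      <- n / sigma^2 + 1 / tau0^2                   # posterior precision
  post_mean <- (n / sigma^2 * xbar + mu0 / tau0^2) / prec
  c(mean = post_mean, sd = sqrt(1 / prec))
}
posterior_normal(xbar = 1.2, n = 40, sigma = 2, mu0 = 0, tau0 = 1)
```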
Weber, K., Hemmings, R., & Koch, A. (2018). How to use prior
knowledge and still give new data a chance? Pharmaceutical
Statistics, 17(4), 329–341. https://doi.org/10.1002/pst.1862
Whitehead, J., Cleary, F., & Turner, A. (2015). Bayesian sample
sizes for exploratory clinical trials comparing multiple experimental
treatments with a control. Stat Med, 34(12),
2048–2061. https://doi.org/10.1002/sim.6469
Wiesenfarth, M., & Calderazzo, S. (2019). Quantification of
Prior Impact in Terms of Effective
Current Sample Size. Biometrics, 0. https://doi.org/10.1111/biom.13124