  • The Choice Between Pearson's χ² Test and Fisher's Exact Test for 2 × 2 Tables

    Item Type Journal Article
    Author Markus Neuhäuser
    Author Graeme D. Ruxton
    Abstract Pearson's asymptotic χ² test is often used to compare binary data between two groups. However, when the sample sizes or expected frequencies are small, the test is usually replaced by Fisher's exact test. Several alternative rules of thumb exist for defining “small” in this context. Replacing one test with another based on the obtained data is unusual in statistical practice. Moreover, this commonly‐used switch is unnecessary because Pearson's χ² test can easily be carried out as an exact test for any sample sizes. Therefore, we recommend routinely using an exact test regardless of the obtained data. This change of approach allows prespecifying a particular test and a much less ambiguous and more reliable analysis.
    Date 05/2025
    Language en
    Library Catalog DOI.org (Crossref)
    URL https://onlinelibrary.wiley.com/doi/10.1002/pst.70012
    Accessed 3/30/2025, 8:06:57 AM
    Volume 24
    Pages e70012
    Publication Pharmaceutical Statistics
    DOI 10.1002/pst.70012
    Issue 3
    Journal Abbr Pharmaceutical Statistics
    ISSN 1539-1604, 1539-1612
    Date Added 3/30/2025, 8:06:57 AM
    Modified 3/30/2025, 8:07:24 AM

    Tags:

    • fishers-exact-test
    • pearson-chi-squared-test

    Notes:

    • Getting an exact test based on Pearson chi-square
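
The idea in the abstract and note above, running Pearson's χ² as an exact test, can be sketched as follows. This is a minimal illustration that assumes the conditional (fixed-margins, permutation-style) version: every 2 × 2 table with the observed margins is enumerated, weighted by its hypergeometric probability, and counted if its χ² statistic is at least as large as the observed one. The function names `pearson_stat` and `exact_pearson_p` are hypothetical, and the paper may intend a different exact variant.

```python
from math import comb


def pearson_stat(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # a margin is zero, so the table carries no information
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def exact_pearson_p(a, b, c, d, rel_tol=1e-9):
    """Exact conditional p-value using the chi-square statistic to order tables.

    Assumption: both margins are treated as fixed, so the top-left cell
    follows a hypergeometric distribution under the null hypothesis.
    """
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    observed = pearson_stat(a, b, c, d)
    total = comb(n, c1)
    p_value = 0.0
    # x is the top-left cell; its range is dictated by the fixed margins.
    for x in range(max(0, c1 - r2), min(r1, c1) + 1):
        stat = pearson_stat(x, r1 - x, c1 - x, r2 - (c1 - x))
        # Small relative tolerance guards against floating-point ties.
        if stat >= observed * (1 - rel_tol):
            p_value += comb(r1, x) * comb(r2, c1 - x) / total
    return p_value


p_small = exact_pearson_p(3, 1, 1, 3)
p_extreme = exact_pearson_p(5, 0, 0, 5)
```

Because the p-value comes from exact enumeration and not from the asymptotic χ² distribution, the same function can be prespecified for any sample size, which is the point the abstract makes.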

  • Probability‐scale residuals for continuous, discrete, and censored data

    Item Type Journal Article
    Author Bryan E. Shepherd
    Author Chun Li
    Author Qi Liu
    Abstract We describe a new residual for general regression models defined as pr(Y* < y) − pr(Y* > y), where y is the observed outcome and Y* is a random variable from the fitted distribution. This probability‐scale residual (PSR) can be written as E{sign(y, Y*)}, whereas the popular observed‐minus‐expected residual can be thought of as E(y − Y*). Therefore the PSR is useful in settings where differences are not meaningful or where the expectation of the fitted distribution cannot be calculated. We present several desirable properties of the PSR that make it useful for diagnostics and measuring residual correlation, especially across different outcome types. We demonstrate its utility for continuous, ordered discrete, and censored outcomes, including current status data, and with various models including Cox regression, quantile regression, and ordinal cumulative probability models, for which fully specified distributions are not desirable or needed, and in some cases suitable residuals are not available. The residual is illustrated with simulated data and real data sets from HIV‐infected patients on therapy in the southeastern United States and Latin America. The Canadian Journal of Statistics 44: 463–479; 2016 © 2016 Statistical Society of Canada
    Date 12/2016
    Language en
    Library Catalog DOI.org (Crossref)
    URL https://onlinelibrary.wiley.com/doi/10.1002/cjs.11302
    Accessed 3/13/2025, 7:17:59 AM
    Rights http://onlinelibrary.wiley.com/termsAndConditions#vor
    Volume 44
    Pages 463-479
    Publication Canadian Journal of Statistics
    DOI 10.1002/cjs.11302
    Issue 4
    Journal Abbr Can J Statistics
    ISSN 0319-5724, 1708-945X
    Date Added 3/13/2025, 7:17:59 AM
    Modified 3/13/2025, 7:18:31 AM

    Tags:

    • ordinal
    • residuals
    • gof
    • residual-plot
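
A small sketch of the probability-scale residual described in the abstract of the entry above, assuming the usual definition PSR = pr(Y* < y) − pr(Y* > y), where Y* follows the fitted distribution. The helper names `psr_discrete` and `psr_normal` and the example fitted distribution are hypothetical and serve only as illustration.

```python
from math import erf, sqrt


def psr_discrete(y, pmf):
    """PSR for an ordered discrete outcome.

    `pmf` maps each possible value to its fitted probability.
    """
    below = sum(p for v, p in pmf.items() if v < y)
    above = sum(p for v, p in pmf.items() if v > y)
    return below - above


def psr_normal(y, mean, sd):
    """PSR when the fitted distribution is normal: 2 * F(y) - 1.

    For a normal CDF, 2 * Phi(z) - 1 equals erf(z / sqrt(2)).
    """
    return erf((y - mean) / (sd * sqrt(2)))


# Hypothetical fitted distribution for a three-level ordinal outcome.
fitted = {1: 0.2, 2: 0.5, 3: 0.3}
residuals = {y: psr_discrete(y, fitted) for y in fitted}

# Averaged over the fitted distribution, the residual has mean zero.
mean_residual = sum(fitted[y] * residuals[y] for y in fitted)
```

The residual always lies in [−1, 1] and needs only the ordering of outcomes, not their differences, which is why it remains usable for ordinal outcomes.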
  • Multiple Imputation for Longitudinal Data: A Tutorial

    Item Type Journal Article
    Author Rushani Wijesuriya
    Author Margarita Moreno‐Betancur
    Author John B. Carlin
    Author Ian R. White
    Author Matteo Quartagno
    Author Katherine J. Lee
    Abstract Longitudinal studies are frequently used in medical research and involve collecting repeated measures on individuals over time. Observations from the same individual are invariably correlated and thus an analytic approach that accounts for this clustering by individual is required. While almost all research suffers from missing data, this can be particularly problematic in longitudinal studies as participation often becomes harder to maintain over time. Multiple imputation (MI) is widely used to handle missing data in such studies. When using MI, it is important that the imputation model is compatible with the proposed analysis model. In a longitudinal analysis, this implies that the clustering considered in the analysis model should be reflected in the imputation process. Several MI approaches have been proposed to impute incomplete longitudinal data, such as treating repeated measurements of the same variable as distinct variables or using generalized linear mixed imputation models. However, the uptake of these methods has been limited, as they require additional data manipulation and use of advanced imputation procedures. In this tutorial, we review the available MI approaches that can be used for handling incomplete longitudinal data, including where individuals are clustered within higher‐level clusters. We illustrate implementation with replicable R and Stata code using a case study from the Childhood to Adolescence Transition Study.
    Date 2025-02-10
    Language en
    Short Title Multiple Imputation for Longitudinal Data
    Library Catalog DOI.org (Crossref)
    URL https://onlinelibrary.wiley.com/doi/10.1002/sim.10274
    Accessed 1/30/2025, 8:43:51 AM
    Volume 44
    Pages e10274
    Publication Statistics in Medicine
    DOI 10.1002/sim.10274
    Issue 3-4
    Journal Abbr Statistics in Medicine
    ISSN 0277-6715, 1097-0258
    Date Added 1/30/2025, 8:43:51 AM
    Modified 1/30/2025, 8:44:26 AM

    Tags:

    • longitudinal
    • imputation
    • missing
    • serial
  • Joint modeling of longitudinal endpoints and its applications to trial planning, monitoring and analysis

    Item Type Journal Article
    Author Liangcai Zhang
    Author George Capuano
    Author Vladimir Dragalin
    Author John Jezorwski
    Author Kim Hung Lo
    Author Fei Chen
    Date 2025-04-20
    Language en
    Library Catalog DOI.org (Crossref)
    URL https://www.tandfonline.com/doi/full/10.1080/10543406.2025.2489280
    Accessed 4/24/2025, 5:57:16 PM
    Pages 1-15
    Publication Journal of Biopharmaceutical Statistics
    DOI 10.1080/10543406.2025.2489280
    Journal Abbr Journal of Biopharmaceutical Statistics
    ISSN 1054-3406, 1520-5711
    Date Added 4/24/2025, 5:57:16 PM
    Modified 4/24/2025, 5:57:44 PM

    Tags:

    • joint-model
    • multiple-endpoints
  • Inconsistent multiple testing corrections: The fallacy of using family-based error rates to make inferences about individual hypotheses

    Item Type Journal Article
    Author Mark Rubin
    Date 11/2024
    Language en
    Short Title Inconsistent multiple testing corrections
    Library Catalog DOI.org (Crossref)
    URL https://linkinghub.elsevier.com/retrieve/pii/S2590260124000067
    Accessed 1/13/2025, 8:55:45 AM
    Volume 10
    Pages 100140
    Publication Methods in Psychology
    DOI 10.1016/j.metip.2024.100140
    Journal Abbr Methods in Psychology
    ISSN 2590-2601
    Date Added 1/13/2025, 8:55:45 AM
    Modified 1/13/2025, 8:57:47 AM

    Tags:

    • multiple-comparison-procedures
    • multiplicity
    • p-value-adjustment
    • p-value
  • Clinical Prediction Models

    Item Type Book
    Author Ewout W. Steyerberg
    Date 2019
    Place New York
    Publisher Springer
    ISBN 3-030-16398-9
    Edition 2nd
    Date Added 7/7/2018, 1:38:33 PM
    Modified 5/3/2025, 4:30:51 PM
  • A Joint Model for (Un)Bounded Longitudinal Markers, Competing Risks, and Recurrent Events Using Patient Registry Data

    Item Type Journal Article
    Author Pedro Miranda Afonso
    Author Dimitris Rizopoulos
    Author Anushka K. Palipana
    Author Emrah Gecili
    Author Cole Brokamp
    Author John P. Clancy
    Author Rhonda D. Szczesniak
    Author Eleni‐Rosalina Andrinopoulou
    Abstract Joint models for longitudinal and survival data have become a popular framework for studying the association between repeatedly measured biomarkers and clinical events. Nevertheless, addressing complex survival data structures, especially handling both recurrent and competing event times within a single model, remains a challenge. This causes important information to be disregarded. Moreover, existing frameworks rely on a Gaussian distribution for continuous markers, which may be unsuitable for bounded biomarkers, resulting in biased estimates of associations. To address these limitations, we propose a Bayesian shared‐parameter joint model that simultaneously accommodates multiple (possibly bounded) longitudinal markers, a recurrent event process, and competing risks. We use the beta distribution to model responses bounded within any interval without sacrificing the interpretability of the association. The model offers various forms of association, discontinuous risk intervals, and both gap and calendar timescales. A simulation study shows that it outperforms simpler joint models. We utilize the US Cystic Fibrosis Foundation Patient Registry to study the associations between changes in lung function and body mass index, and the risk of recurrent pulmonary exacerbations, while accounting for the competing risks of death and lung transplantation. Our efficient implementation allows fast fitting of the model despite its complexity and the large sample size from this patient registry. Our comprehensive approach provides new insights into cystic fibrosis disease progression by quantifying the relationship between the most important clinical markers and events more precisely than has been possible before. The model implementation is available in the R package JMbayes2.
    Date 04/2025
    Language en
    Library Catalog DOI.org (Crossref)
    URL https://onlinelibrary.wiley.com/doi/10.1002/sim.70057
    Accessed 4/29/2025, 7:49:11 AM
    Volume 44
    Pages e70057
    Publication Statistics in Medicine
    DOI 10.1002/sim.70057
    Issue 8-9
    Journal Abbr Statistics in Medicine
    ISSN 0277-6715, 1097-0258
    Date Added 4/29/2025, 7:49:11 AM
    Modified 4/29/2025, 7:50:05 AM

    Tags:

    • bayes
    • joint-model
    • multiple-endpoints
    • shared-parameter
    • shared-parameter-models
  • A Comparison of Statistical Methods for Time‐To‐Event Analyses in Randomized Controlled Trials Under Non‐Proportional Hazards

    Item Type Journal Article
    Author Florian Klinglmüller
    Author Tobias Fellinger
    Author Franz König
    Author Tim Friede
    Author Andrew C. Hooker
    Author Harald Heinzl
    Author Martina Mittlböck
    Author Jonas Brugger
    Author Maximilian Bardo
    Author Cynthia Huber
    Author Norbert Benda
    Author Martin Posch
    Author Robin Ristl
    Abstract While well‐established methods for time‐to‐event data are available when the proportional hazards assumption holds, there is no consensus on the best inferential approach under non‐proportional hazards (NPH). However, a wide range of parametric and non‐parametric methods for testing and estimation in this scenario have been proposed. To provide recommendations on the statistical analysis of clinical trials where non‐proportional hazards are expected, we conducted a simulation study under different scenarios of non‐proportional hazards, including delayed onset of treatment effect, crossing hazard curves, subgroups with different treatment effects, and changing hazards after disease progression. We assessed type I error rate control, power, and confidence interval coverage, where applicable, for a wide range of methods, including weighted log‐rank tests, the MaxCombo test, summary measures such as the restricted mean survival time (RMST), average hazard ratios, and milestone survival probabilities, as well as accelerated failure time regression models. We found a trade‐off between interpretability and power when choosing an analysis strategy under NPH scenarios. While analysis methods based on weighted logrank tests typically were favorable in terms of power, they do not provide an easily interpretable treatment effect estimate. Also, depending on the weight function, they test a narrow null hypothesis of equal hazard functions, and rejection of this null hypothesis may not allow for a direct conclusion of treatment benefit in terms of the survival function. In contrast, non‐parametric procedures based on well‐interpretable measures like the RMST difference had lower power in most scenarios. Model‐based methods based on specific survival distributions had larger power; however, often gave biased estimates and lower than nominal confidence interval coverage. The application of the studied methods is illustrated in a case study with reconstructed data from a phase III oncologic trial.
    Date 2025-02-28
    Language en
    Library Catalog DOI.org (Crossref)
    URL https://onlinelibrary.wiley.com/doi/10.1002/sim.70019
    Accessed 3/6/2025, 2:09:37 PM
    Volume 44
    Pages e70019
    Publication Statistics in Medicine
    DOI 10.1002/sim.70019
    Issue 5
    Journal Abbr Statistics in Medicine
    ISSN 0277-6715, 1097-0258
    Date Added 3/6/2025, 2:09:37 PM
    Modified 3/6/2025, 2:10:59 PM

    Tags:

    • non-ph
    • RMST

    Notes:

    • Differences in nonparametric RMST had low power; parametric estimates had greater power but need their assumptions to hold
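
For orientation on the nonparametric RMST mentioned in the abstract and note above, here is a minimal sketch: the restricted mean survival time is the area under the Kaplan-Meier curve from 0 to a chosen horizon τ, and the treatment effect is the difference in that area between arms. The function name `km_rmst`, the horizon, and the toy data are hypothetical; this is not the simulation code from the paper and it omits variance estimation.

```python
def km_rmst(times, events, tau):
    """Area under the Kaplan-Meier curve on [0, tau].

    `events` holds 1 for an observed event and 0 for a censored time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, last_t, area = 1.0, 0.0, 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = leaving = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        # The curve is flat between event times, so add a rectangle.
        area += surv * (t - last_t)
        if deaths:
            surv *= 1 - deaths / n_at_risk
        n_at_risk -= leaving
        last_t = t
    # Final rectangle from the last observed time up to tau.
    area += surv * (tau - last_t)
    return area


# Hypothetical toy data for two arms, horizon tau = 4.
control = km_rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=4)
treated = km_rmst([2, 3, 4, 4], [1, 0, 1, 0], tau=4)
rmst_difference = treated - control
```

The difference reads directly as extra event-free time gained up to τ, which is the interpretability advantage the abstract weighs against the lower power.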