12  Statistical Inference Review

12.1 Types of Analyses

  • Except for one-sample tests, all tests can be thought of as testing for an association between at least one variable and at least one other variable
  • Testing for group differences is the same as testing for association between group and response
  • Testing for association between two continuous variables can be done using correlation (especially for unadjusted analysis) or regression methods; in simple cases the two are equivalent
  • Testing for association between group and outcome when there are more than 2 groups that are not in some solid order¹ means comparing a summary of the response between \(k\) groups, sometimes in a pairwise fashion

    ¹ The dose of a drug and the severity of pain are examples of ordered variables.
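The equivalence between correlation and simple regression noted above can be checked numerically. The following is a minimal sketch, not a reference implementation: the data are made up, and the function names (`pearson_r`, `t_from_correlation`, `t_from_slope`) are hypothetical helpers written for this illustration. It computes the \(t\) statistic once from Pearson's \(r\) and once from the least-squares slope of \(Y\) on \(X\).

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_correlation(x, y):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    n = len(x)
    r = pearson_r(x, y)
    return r * math.sqrt((n - 2) / (1 - r * r))

def t_from_slope(x, y):
    """t statistic for H0: slope = 0 in the simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope / se_slope

# Made-up illustrative data (an assumption, not from the text)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9]

t_r = t_from_correlation(x, y)
t_b = t_from_slope(x, y)
print(t_r, t_b)  # the two values agree up to floating-point rounding
```

If `x` is replaced by a 0/1 group indicator, `t_from_slope` returns the equal-variance two-sample \(t\) statistic, which illustrates that testing for a group difference is the same as testing for association between group and response.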

    12.2 Covariable-Unadjusted Analyses

    Appropriate when

    • Only interested in assessing the relationship between a single \(X\) and the response, or
    • Treatments are randomized and there are no strong, measurable prognostic factors, or
    • Study is observational and variables capturing confounding are unavailable (place strong caveats in the paper)

    See the analysis of covariance discussion (sec-ancova).

    12.2.1 Analyzing Paired Responses

    Type of Response   Recommended Test       Most Frequent Test
    binary             McNemar                McNemar
    continuous         Wilcoxon signed-rank   paired \(t\)-test
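As a small worked example for the binary row of the table, McNemar's test uses only the discordant pairs. The sketch below assumes the uncorrected large-sample \(\chi^2\) form with 1 degree of freedom (no continuity correction, no exact binomial version); the function name `mcnemar_test` and the counts are hypothetical and chosen only for illustration.

```python
import math

def mcnemar_test(b, c):
    """Uncorrected McNemar chi-square test (1 d.f.).

    b = number of pairs with response (yes, no)
    c = number of pairs with response (no, yes)
    Concordant pairs do not enter the statistic.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    chi2 = (b - c) ** 2 / (b + c)
    # For 1 d.f., P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts of discordant pairs
chi2, p_value = mcnemar_test(b=15, c=5)
print(chi2, p_value)  # chi2 is 5.0; the P-value is about 0.025
```

With few discordant pairs the large-sample approximation may be poor, and an exact binomial version would be the more cautious choice.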

    12.2.2 Comparing Two Groups

    Type of Response   Recommended Test         Most Frequent Test
    binary             \(2\times 2~\chi^{2}\)   \(\chi^{2}\), Fisher’s exact test
    ordinal            Wilcoxon 2-sample        Wilcoxon 2-sample
    continuous         Wilcoxon 2-sample        2-sample \(t\)-test
    time to event²     Cox model³               log-rank⁴

    ² The response variable may be right-censored, which happens if the subject ceased being followed before having the event. For example, the value of the response variable for a subject followed 2 years without having the event is 2+.

    ³ If the treatment is expected to have a larger effect early, with the effect lessening over time, an accelerated failure time model such as the lognormal model is recommended.

    ⁴ The log-rank test is a special case of the Cox model. The Cox model provides slightly more accurate \(P\)-values than the \(\chi^2\) statistic from the log-rank test.
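To make the Wilcoxon 2-sample (rank-sum) test concrete, here is a minimal sketch using the large-sample normal approximation. It assumes no tie correction to the variance and no continuity correction; `midranks` and `wilcoxon_two_sample` are hypothetical helper names, and the data are invented for illustration.

```python
import math

def midranks(values):
    """Ranks 1..n, with tied values receiving the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wilcoxon_two_sample(x, y):
    """Rank-sum statistic for group x, with a two-sided normal-approximation P-value."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # sum of ranks in the first group
    mean_w = n1 * (n1 + n2 + 1) / 2          # expectation under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction (assumption)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p_value

# Invented data for two groups
group_a = [1.1, 2.3, 2.9, 3.4, 4.0]
group_b = [3.8, 4.6, 5.2, 6.1, 7.3]

w, z, p_value = wilcoxon_two_sample(group_a, group_b)
print(w, z, p_value)  # w is 16.0 for these data
```

For samples this small an exact P-value would normally be preferred over the normal approximation; the approximation is used here only to keep the sketch short.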

    12.2.3 Comparing \(>2\) Groups

    Type of Response   Recommended Test         Most Frequent Test
    binary             \(r\times 2~\chi^{2}\)   \(\chi^{2}\), Fisher’s exact test
    ordinal            Kruskal-Wallis           Kruskal-Wallis
    continuous         Kruskal-Wallis           ANOVA
    time to event      Cox model                log-rank
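The Kruskal-Wallis test generalizes the Wilcoxon 2-sample test to \(k\) groups by comparing rank sums. The sketch below computes the \(H\) statistic without a tie correction (an assumption that is harmless here because the invented data contain no ties). The helper names are hypothetical. For exactly three groups the \(\chi^2\) reference distribution has 2 degrees of freedom, for which the upper tail has the closed form \(e^{-H/2}\); other group counts would need a general \(\chi^2\) tail function.

```python
import math

def midranks(values):
    """Ranks 1..n, with tied values receiving the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis(groups):
    """Return (H, degrees of freedom); no tie correction is applied."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = midranks(pooled)
    h = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)
    return h, len(groups) - 1

# Invented data for three groups
groups = [[2.1, 3.4, 1.9], [4.2, 3.9, 5.1], [6.3, 4.8, 7.0]]
h, df = kruskal_wallis(groups)

# Closed-form chi-square upper tail, valid only for df == 2
p_value = math.exp(-h / 2)
print(h, df, p_value)
```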

    12.2.4 Correlating Two Continuous Variables

    Recommended: Spearman \(\rho\)
    Most frequently seen: Pearson \(r\)
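One way to see the difference between the two coefficients: Spearman's \(\rho\) is Pearson's \(r\) computed on the ranks, so it reaches 1 for any strictly increasing relationship, linear or not. The sketch below uses invented data and hypothetical helper names to illustrate this.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def midranks(values):
    """Ranks 1..n, with tied values receiving the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r applied to the midranks."""
    return pearson_r(midranks(x), midranks(y))

# Invented data: y is a strictly increasing but strongly nonlinear function of x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(v) for v in x]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(r, rho)  # rho equals 1 up to rounding; r is noticeably below 1
```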

    12.3 Covariable-Adjusted Analyses

    • To adjust for imbalances in prognostic factors in an observational study or for strong patient heterogeneity in a randomized study
    • Analysis of covariance is preferred over stratification, especially if continuous adjustment variables are present or there are many adjustment variables
      • Continuous response: multiple linear regression with appropriate transformation of \(Y\)
      • Binary response: binary logistic regression model
      • Ordinal response: proportional odds ordinal logistic regression model
      • Time to event response, possibly right-censored:
        • chronic disease: Cox proportional hazards model
        • acute disease: accelerated failure time model
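To illustrate why adjustment matters when a prognostic factor is imbalanced, the following sketch fits a linear model with a treatment indicator and one continuous covariate (analysis of covariance for a continuous response). Everything here is an assumption made for illustration: the data are invented and noise-free by construction (true treatment effect of -5, age slope of 0.5, treated subjects older), and `ols` is a hypothetical helper that solves the normal equations directly. In practice a statistical modeling library would be used.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Invented data: treated subjects are 10 years older on average
treat = [0, 0, 0, 0, 1, 1, 1, 1]
age = [40, 50, 60, 70, 50, 60, 70, 80]
y = [10 - 5 * t + 0.5 * a for t, a in zip(treat, age)]

# Unadjusted comparison: difference in group means
treated = [v for v, t in zip(y, treat) if t == 1]
control = [v for v, t in zip(y, treat) if t == 0]
unadjusted = sum(treated) / len(treated) - sum(control) / len(control)

# Adjusted comparison: coefficient of treatment, holding age constant
X = [[1.0, t, a] for t, a in zip(treat, age)]
intercept, adjusted, age_slope = ols(X, y)

print(unadjusted, adjusted)  # 0.0 unadjusted versus about -5 after adjustment
```

In this constructed example the age imbalance exactly masks the treatment effect in the unadjusted comparison, while the adjusted model recovers it.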