List of Figures

Figure Short Caption
Figure 4.1 Density and cumulative distribution functions
Figure 4.2 Symmetric continuous distribution
Figure 4.3 Bimodal distribution
Figure 4.4 Count variable with clumping at zero
Figure 4.5 Ordinal variable with strange distribution
Figure 4.6 Continuous distribution with clumping at the end
Figure 4.7 Spaghetti plot
Figure 4.8 Frequency dot chart
Figure 4.9 Dot chart for categorical demographic variables, stratified by treatment and region
Figure 4.10 Dot chart showing proportion of subjects having adverse events by treatment, sorted by risk difference, produced by the R greport package. See test.Rnw.
?fig-descript-ph Scatterplot of one measurement mode against another
Figure 4.13 Hexagonal binning replacing scatterplot for large \(n\)
Figure 4.14 Binned points (2500 total bins) with frequency counts shown as color and transparency level
Figure 4.15 Empirical cumulative distribution functions
Figure 4.16 Box plots for glycohemoglobin
Figure 4.17 Schematic for extended box plot
Figure 4.18 Extended box plots
Figure 4.19 Interactive extended box plot
Figure 4.20 One-half violin plots for longitudinal data, stratified by treatment. Density estimates for groups with insufficient sample sizes are faded. Density plots are back-to-back for treatment A and B. Points are treatment medians. When the black vertical line does not touch the two medians, the medians are significantly different at the \(\alpha=0.05\) level. Graphic was produced by the R greport package.
Figure 4.21 Moving upper quartile and Gini mean difference for HbA\(_{1c}\)
Figure 4.22 Moving \(Q_3\) vs. age with extended box plot and histogram
Figure 4.23 Showing group means and differences
Figure 4.24 Bar plot with error bars—“dynamite plot”
Figure 4.25 Dot plot with superimposed box plots
Figure 4.26 Jittered raw data and violin plots with median indicated by blue +
Figure 4.27 Single-axis nomogram
Figure 4.28 Partial effects in NHANES HbA\(_{1c}\) model
Figure 4.29 Partial effects chart for transformed glycohemoglobin
Figure 4.30 Nomogram for predicting median HbA\(_{1c}\)
Figure 4.31 Estimated median survival time for critically ill adults
Figure 4.32 Probability of hemorrhagic stroke vs. blood pressures
Figure 5.1 \(t\) distribution for varying d.f.
Figure 5.2 Posterior distributions for \(\mu\) and \(\sigma\) using a normal model.
Figure 5.3 Posterior distributions for \(\mu, \sigma, \nu\) for a \(t_{\nu}\) data model
Figure 5.4 Effect of discounting by a skeptical prior
Figure 5.5 Prior and posterior distributions for unknown probability of heads
Figure 5.6 Half-widths of 0.95 credible intervals for \(p\) using a flat prior
Figure 5.7 Multiplicative margin of error in estimating odds when \(n=384\) and the margin of error in estimating the absolute probability is \(\leq 0.05\).
Figure 5.8 Two-sample parallel group RCT
Figure 5.9 Stratified ECDFs for checking \(t\)-test assumptions
Figure 5.10 Margin of error in estimating \(\sigma\)
Figure 5.11 Data and box plots for paired data
Figure 6.1 Multiplicative margin of error for odds ratios
Figure 6.2 The logistic function
Figure 6.3 Posterior distribution of odds ratio
Figure 7.1 Fecal calprotectin by severity
Figure 7.2 Ranks of calprotectin
Figure 7.3 Wilcoxon \(P\)-value vs. hypothesized difference.
Figure 7.4 Bootstrapped differences in medians
Figure 7.5 Checking Wilcoxon assumption
Figure 7.6 Checking \(t\)-test assumption
Figure 7.7 logit ECDF plots for checking the PO assumption (parallelism) in the 2-factor problem
Figure 7.8 Relationship between odds ratio and assumed mean in the experimental arm
Figure 7.9 Departure from normality induced by assuming proportional odds
Figure 7.10 Assessing normality of experimental arm
Figure 7.11 Discrete distribution for the experimental arm
Figure 7.12 Margin of error for an ECDF based on Kolmogorov-Smirnov critical values
Figure 8.1 Example correlation coefficients
Figure 8.2 Multiple datasets having same Pearson \(r\)
Figure 8.3 Bland-Altman plot for 2 pH measurements
Figure 8.4 Difference in pH by time of day
Figure 8.5 Margin of error for estimating correlation coefficient
Figure 8.6 Sample size required to ensure a high probability of the sample correlation coefficient being in the right direction
Figure 8.7 Probability of having the correct sign on \(r\) as a function of \(n\) and true correlation \(\rho\)
Figure 8.8 loess nonparametric smoother for glucose ratio
Figure 8.9 Example of super smoother
Figure 8.10 Moving statistics for glucose ratio
Figure 8.11 Moving proportions
Figure 8.12 Moving 6m and 12m mortality estimates
Figure 10.1 Sample of \(n=100\) points with a linear regression line
Figure 10.2 Two types of confidence bands
Figure 10.3 Four examples of residual plots
Figure 10.4 Harm of percentiling BMI in a regression model
Figure 10.5 What are quintile numbers modeling?
Figure 10.6 Assumptions for two predictors
?fig-reg-olslead Mean maxfwt by information-losing lead exposure groups
Figure 11.1 Age of first walking
Figure 13.1 Distribution of baseline risk in GUSTO-I
Figure 13.2 A display of an interaction between treatment, extent of disease, and calendar year of start of treatment (Califf et al. (1989))
Figure 13.3 Estimates from model with age \(\times\) treatment interaction
Figure 13.4 Effects of predictors in Cox model
Figure 13.5 Estimates from a Cox model allowing treatment to interact with both age and sex
Figure 13.6 Absolute risk increase as a function of risk
Figure 13.7 Absolute risk reduction by background risk and interacting factor
Figure 13.8 Absolute benefit vs. baseline risk
Figure 13.9 GUSTO-I nomogram
Figure 13.10 Baseline risk, hazard ratio, and absolute effect
Figure 13.11 Distribution of cost per life saved in GUSTO-I
Figure 14.1 \(\beta\)-TG levels by diabetic status
Figure 14.2 8w vs. baseline Hamilton-D depression scores along with loess nonparametric smoothers by treatment
Figure 14.3 Change from baseline at 8w vs. baseline Hamilton-D depression scores and loess nonparametric smoothers by treatment
Figure 14.4 Estimated mean Hamilton-D score at 8w using the proportional odds model, allowing for nonlinear and interaction effects. The fitted spline function does more smoothing than the earlier loess estimates.
Figure 14.5 Estimated treatment difference in mean Ham-D allowing for interaction, thus allowing for non-constant differences. Keep in mind the slight evidence for non-constancy (interaction).
?fig-change-suppcr Hospital death as a function of creatinine
Figure 14.8 Baseball batting averages and regression to the mean
Figure 15.1 Spaghetti plots for isoproterenol data
Figure 15.2 AUC by curve fitting and by trapezoidal rule
Figure 15.3 Mean blood flow by race
Figure 15.4 Residual plot for generalized least squares fit on untransformed fbf
Figure 15.5 Checking assumptions of GLS model
?fig-serial-blrm MCMC diagnostics
Figure 15.10 Posterior mean logit
Figure 15.11 Posterior mean for mean blood flow
Figure 18.1 Risk of pneumonia with two predictors
Figure 18.2 Two kinds of thresholds
Figure 18.3 Thresholds in cardiac biomarkers
Figure 18.4 Power loss from dichotomizing the response variable
Figure 18.5 Continuous PSA vs. risk
Figure 18.6 Prognostic spectrum from various models
Figure 19.1 Nomogram for estimating P(significant coronary disease)
Figure 19.2 Nomogram for estimating probability of meningitis
Figure 19.3 Proportional odds ordinal logistic model for ordinal diagnostic classes from Brazer et al. (1991)
Figure 19.4 Pre vs. post-test probability
Figure 19.5 Relative effect of total cholesterol for ages 40 and 70; data from Duke Cardiovascular Disease Databank, \(n=2258\)
Figure 19.6 Diagnostic utility of cholesterol for diagnosing significant CAD. Curves are 0.1 and 0.9 quantiles from quantile regression using restricted cubic splines
Figure 20.1 Average of maximum absolute correlation coefficients