Ordinal Regression
Ordinal Variables
- Continuous interval-scaled variables are ordinal
- Ordinal variables need not be interval-scaled
- Ordinal variables
- Ordinal outcomes in clinical trials and here
Big Picture
- For a homogeneous sample (no covariates \(X\)) of a continuous variable \(Y\) the ECDF is a complete summary of the sample
- \(F_n(y) = \frac{1}{n}\sum_{i=1}^{n}[Y_{i} \leq y]\)
- Estimates data generating distribution \(F(y)\)
- Handles extreme ties, floor/ceiling effects, bimodality, …
- Semiparametric ordinal regression models extend \(F_n(y)\) to incorporate \(X\)
- ECDF is encoded in the model’s intercepts
- Cox PH model: intercepts \(\equiv \log(-\log)\) of the underlying survival curve
- In general: intercepts = link function (e.g. logit) of ECDF when \(X=0\)
- Standard model form (parallelism): shift in \(X \rightarrow\) shift in \(\text{link}(F(y))\)
- Link function dictates how ECDF of individuals with different \(X\) are related
- Parametric models: parallelism + specific CDF shape
- Ordinal models can estimate effect ratios, \(\Pr(Y \geq y | X)\), quantiles of \(Y|X\), and (if \(Y\) is interval-scaled) \(E(Y | X)\)
- See this for sample size for estimating entire distribution
- Parallelism assumption (e.g. proportional odds/hazards) can be relaxed by having \(Y\)-dependent covariates
- Parameter estimates and inference are invariant to monotonic transformations of \(Y\)
- For Cox PH model transformations must be increasing
- The models work equally well for continuous as well as discrete \(Y\)
- The R `rms` package `lrm` and `orm` functions can handle an unlimited number of intercepts as of `rms` version 7.0
- Example: \(n=300,000\), continuous \(Y\) with no ties, \(299,999\) intercepts, 20 covariates, fits in 2.5 seconds; needed pieces of the \(300,019\times 300,019\) covariance matrix can be calculated in 0.3 seconds
- Intercepts are order-restricted; can estimate more parameters than \(N\)
- See Liu et al for theoretical and simulation justification for continuous \(Y\)
- Ordinal or continuous \(Y\) values can be overridden by events
- Log-rank test is a special case of the Cox PH model
- Wilcoxon/Kruskal-Wallis tests are special cases of the proportional odds (PO) model
- These rank tests assume more than their model counterparts
- Models extend to longitudinal data
- Markov process, mixed effects, or GEE
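As a small illustration of the ECDF bullets above, the following Python sketch computes \(F_n(y)\) for a toy sample with heavy ties at a floor value, then takes the logit of the empirical \(\Pr(Y \geq y)\) at each distinct value above the minimum. Under the assumption of a covariate-free PO model written as \(\Pr(Y \geq y) = \text{expit}(\alpha_y)\), these logits are the values the intercepts would take. The sample and the function names `ecdf`, `logit`, and `exceedance_logits` are hypothetical and are not part of `rms`.

```python
import math

def ecdf(sample, y):
    """Empirical CDF F_n(y): proportion of observations <= y."""
    return sum(1 for v in sample if v <= y) / len(sample)

def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))

def exceedance_logits(sample):
    """Logit of the empirical Pr(Y >= y) for each distinct y above the minimum."""
    n = len(sample)
    levels = sorted(set(sample))[1:]  # the lowest level has Pr(Y >= y) = 1
    return {y: logit(sum(1 for v in sample if v >= y) / n) for y in levels}

# Hypothetical toy sample with a floor effect (many ties at 0)
sample = [0, 0, 0, 0, 1, 2, 2, 3, 5, 5]

print(ecdf(sample, 0))  # proportion of the sample at the floor
print(ecdf(sample, 2))
for y, a in exceedance_logits(sample).items():
    print(y, round(a, 3))
```

The logits decrease as \(y\) increases, which mirrors the order restriction on the intercepts noted above. There is one logit per distinct value of \(Y\) above the minimum, so a sample with no ties would yield \(n-1\) of them.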
Example Proportional Odds Model: Discrete \(Y\)
- \(Y=0,1,2,3\) for pain levels of none, mild, moderate, severe
- \(X\) contains indicator variable for sex (0=female, 1=male) and treatment (0=control, 1=active)
\[\Pr(Y \geq y | X) = \text{expit}(\alpha_{y} + \beta_{1}[\text{male}] + \beta_{2}[\text{active}])\]
\[\text{expit}(z) = \frac{1}{1 + \exp(-z)}\]
\[\alpha_{1}=1, \quad \alpha_{2}=0, \quad \alpha_{3}=-1, \quad \beta_{1}=-0.5, \quad \beta_{2}=-0.4\]
- male:female OR for \(Y\geq y\) for any \(y\) is \(\exp(-0.5) = 0.61\)
- active:control OR \(= \exp(-0.4) = 0.67\).
- Probabilities of outcomes for a male on active treatment
- \(\beta\) part of the model is \(\beta_{1} + \beta_{2} = -0.5 - 0.4 = -0.9\)
- probabilities of outcomes of level \(y\) or worse:
\(y\) | Meaning | log odds(\(Y\geq y\)) | \(\Pr(Y\geq y)\) |
---|---|---|---|
1 | any pain | 0.1 | 0.52 |
2 | moderate or severe | -0.9 | 0.29 |
3 | severe | -1.9 | 0.13 |
- Pr(moderate pain) = 0.29 - 0.13 = 0.16
- Pr(pain free) = 1 - 0.52 = 0.48
- Model for continuous \(Y\) would look the same, just have many more \(\alpha\)s.
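The arithmetic in this example can be checked with a short Python sketch. It uses the parameter values stated above; the function names `exceedance_probs` and `cell_probs` are hypothetical helpers written for this illustration, not part of any package.

```python
import math

def expit(z):
    """Inverse logit: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Parameter values from the example above
alpha = {1: 1.0, 2: 0.0, 3: -1.0}    # intercepts for y = 1, 2, 3
beta_male, beta_active = -0.5, -0.4  # covariate effects (log odds ratios)

def exceedance_probs(male, active):
    """Pr(Y >= y | X) for y = 1, 2, 3 under the PO model."""
    lp = beta_male * male + beta_active * active
    return {y: expit(a + lp) for y, a in alpha.items()}

def cell_probs(male, active):
    """Pr(Y = y | X) for y = 0, 1, 2, 3, by differencing Pr(Y >= y)."""
    ge = exceedance_probs(male, active)
    bounds = [1.0, ge[1], ge[2], ge[3], 0.0]
    return [bounds[y] - bounds[y + 1] for y in range(4)]

ge = exceedance_probs(male=1, active=1)
cells = cell_probs(male=1, active=1)

for y in (1, 2, 3):
    print(y, round(ge[y], 2))             # Pr(Y >= y) column of the table
print([round(p, 2) for p in cells])       # none, mild, moderate, severe
```

Differencing adjacent exceedance probabilities is the same step used above for Pr(moderate pain). With more levels of \(Y\), only the `alpha` dictionary and the range in `cell_probs` would need to grow.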
Binary vs. Ordinal Outcomes
Ordinal Regression
- A gentle introduction to the proportional odds model
- Checking assumptions of the Wilcoxon test
- Two-way ANOVA example
- Bayesian Wilcoxon test (Bayesian PO model)
- Digging into the PO model
- Equivalence of the Wilcoxon test and the PO model
- Detailed case study: ordinal regression for continuous Y
- Power and sample size calculations and here
- Ordinal models for paired data
Resources
R Packages
PPO: partial proportional odds model
CPPO: constrained partial PO model
D: discrete Y
C: continuous Y
k: maximum number of computationally feasible Y levels
Q: derived estimands such as mean and quantiles built-in
With or Without Random Effects
Without Random Effects
Compilation
- Home page and its pdf version
Statistical Thinking Blog Articles
These articles are in HTML format and can be viewed on any size device, but are not suitable for printing. Some of the HTML documents have interactive components. Two of the articles also have PDF versions for printing.
- Assessing the proportional odds assumption and its impact (pdf; R script)
- Equivalence of Wilcoxon test and proportional odds model (pdf; R script)
- Violation of proportional odds is not fatal
- If you like the Wilcoxon test you must like the proportional odds model
- Information gain from using ordinal instead of binary outcomes
- Using the partial PO model to borrow information across outcomes
- Ordinal models for paired data
- What does a statistical method assume?