R rms Package
Regression Modeling Strategies
News 2025-01-14
rms 7.0 is a milestone release of the package, now in its 34\(^{th}\) year, with greatly improved fitting functions and key new statistical analysis capabilities. Many of the computational improvements related to logistic and ordinal regression are detailed here. A list of all the major user-visible improvements follows.
- Semiparametric ordinal regression modeling using the lrm and orm functions now has no limitations on the number of intercepts in the model, so ordinal models are now even more appropriate for continuous Y. For example, a model fitted to N=300,000 with continuous Y (no ties), so that there are 299,999 intercepts in the model, fits in 2.5 seconds with 20 covariates. This is achievable when a Newton-type fitting algorithm is used (Newton-Raphson or Levenberg-Marquardt). The standard error of one of the parameter estimates may be computed in 0.3s. This speed is due to the following point.
- The lrm and orm functions now both optimally use sparse matrices, taking full advantage of the incredibly fast and comprehensive Matrix package, not only to compute final variances and covariances of parameter estimates, but also to do Newton-type parameter estimate updating while solving for maximum likelihood estimates. The information matrix is stored in a list of 3 submatrices that are as small as possible. The covariance matrix is not computed for the model fits but is computed on demand by the vcov function. Needed portions of the covariance matrix for the parameters (the inverse of the information matrix) can be computed in a fraction of a second even for a 300000 \(\times\) 300000 matrix. rms has a new function infoMxop (information matrix operations) that facilitates inverting regular and sparse information matrices, obtaining parts of the inverse, and computing matrix products of the inverse and a user-specified matrix, for Newton updating, getting standard errors of predicted values, etc. infoMxop is used by vcov.lrm and vcov.orm.
- Likelihood calculations for orm, as has always been the case for lrm, are all done in Fortran for speed. Link functions are now hard-coded in Fortran, so users may not specify customized link functions. The author is glad to add new link functions on demand. orm has an option to use Levenberg-Marquardt optimization in addition to Newton-Raphson with step-halving. lrm implements many different optimizers, including Hessian-free ones. orm now supports weights and penalties.
- Mean, Quantile, and ExProb function generators: now use fast sparse matrix operations to more quickly apply the \(\delta\) method to get variances of estimated means, quantiles, and exceedance probabilities from ordinal models.
- bootcov: changed the way non-sampled ordinal Y levels are handled, to use linear interpolation/extrapolation of intercepts. But it may be better to use the new Hmisc package ordGroupBoot function to mildly bin Y so that no bootstrap sample will omit any distinct Y data values in the first place.
- contrast.rms: implemented conf.type='profile' to compute likelihood profile confidence intervals for general contrasts, and corresponding likelihood ratio \(\chi^2\) tests. A short sketch of these new capabilities follows this list.
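Here is a minimal sketch, on simulated data, of fitting an orm model to a continuous response, getting standard errors from the on-demand covariance matrix, and requesting a profile likelihood interval for a simple contrast. The use of x=TRUE, y=TRUE to retain the design matrix and response, and the assumption that vcov returns only a subset of the intercepts by default, should be checked against the current help files.

```r
## Minimal sketch assuming rms 7.0; simulated data for illustration only
require(rms)
set.seed(1)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- x1 + 0.5 * x2 + rnorm(n)   # continuous Y with no ties: ~n-1 intercepts

## Semiparametric ordinal fit; x=TRUE, y=TRUE keeps the design matrix and
## response in the fit object for likelihood-based post-fitting computations
f <- orm(y ~ x1 + x2, x=TRUE, y=TRUE)

## The covariance matrix is computed on demand from the sparse information
## matrix; by default only a subset of the intercepts may be included
v <- vcov(f)
sqrt(diag(v))                    # standard errors

## Profile likelihood confidence interval and LR test for the x1 effect
contrast(f, list(x1=1, x2=0), list(x1=0, x2=0), conf.type='profile')
```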
Earlier Updates
rms 6.8-0 has a non-backward-compatible change to the orm function that improves how unique numeric values are determined for dependent variables. Previous versions could give different results on different hardware because of the behavior of the R unique function for floating-point vectors. Now unique values are determined by the y.precision argument, which defaults to multiplying values by \(10^5\) before rounding. Details are in this report by Shawn Garbett of the Vanderbilt Department of Biostatistics.
Version 6.8-0 also has an important new function for relative explained variation, rexVar.
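A rough usage sketch follows; the data argument, the need for a fitted rms model object, and the existence of a plot method are assumptions on my part, so check ?rexVar for the actual interface.

```r
## Hypothetical usage of rexVar; argument names are assumptions, see ?rexVar
require(rms)
set.seed(1)
d   <- data.frame(x1 = rnorm(300), x2 = rnorm(300), x3 = rnorm(300))
d$y <- d$x1 + 0.5 * d$x2 + rnorm(300)
f   <- ols(y ~ x1 + x2 + x3, data=d, x=TRUE, y=TRUE)
r   <- rexVar(f, data=d)   # relative explained variation for each predictor
r
plot(r)                    # dot chart of relative explained variation
```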
rms 6.7-0 appeared on CRAN on 2023-05-08 and represents a major update. The most significant new feature is the automatic computation of all likelihood ratio (LR) \(\chi^2\) chunk test statistics that can be inferred from the model design when the model is fitted using lrm, orm, psm, cph, or Glm. I’ve been meaning to do this for more than 10 years because LR tests are more accurate than the default anova.rms Wald tests. LR tests do not suffer from the Hauck-Donner effect, in which a predictor with an infinite regression coefficient drives the Wald \(\chi^2\) to zero because the standard error blows up.
An example of a full LR anova is here.
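As a rough illustration of requesting LR chunk tests, here is a small sketch; the test='LR' argument value and the need to fit with x=TRUE, y=TRUE are assumptions to verify against the linked example and the anova.rms help.

```r
## Sketch only; test='LR' and the x=TRUE, y=TRUE requirement are assumptions
require(rms)
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- factor(sample(c('a', 'b', 'c'), n, replace=TRUE))
y  <- ifelse(plogis(x1) > runif(n), 1, 0)
f  <- lrm(y ~ rcs(x1, 4) + x2, x=TRUE, y=TRUE)
anova(f, test='LR')   # LR chunk tests, including the nonlinear terms for x1
```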
Also new is the implementation of LR tests when doing multiple imputation, using the method of Chan and Meng. This uses a new feature in Hmisc::fit.mult.impute where, besides testing on individual completed datasets, the log likelihood is computed from a stacked dataset of all completed datasets. Specifying lrt=TRUE to fit.mult.impute will take the necessary actions to get LR tests with processMI, including setting the method argument to 'stack', which makes the final regression coefficient estimates come from a single fit of the stacked dataset. A sketch of this workflow appears after the list below.
There are new rms functions or options relating to this:
- LRupdate: update LR test-related statistics (including pseudo \(R^2\) measures) after processMI is run
- processMI.fit.mult.impute: added processing of the anova result from fit.mult.impute(..., lrt=TRUE)
- prmiInfo: print (or render as html) imputation parameters from the result of processMI(..., 'anova')
This new rms requires installing the latest Hmisc from CRAN.
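Here is a minimal sketch of the multiple imputation LR workflow just described, on simulated data. The 'anova' argument to processMI and the lrt=TRUE option follow the descriptions above; the argument order for LRupdate and the use of aregImpute for the imputations are assumptions to check against the Hmisc and rms help files.

```r
## Sketch of the LR-test workflow with multiple imputation; simulated data,
## and some argument details are assumptions to verify against the help files
require(rms)
set.seed(1)
n <- 500
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- ifelse(plogis(d$x1 + 0.5 * d$x2) > runif(n), 1, 0)
d$x2[sample(n, 75)] <- NA                 # impose missingness in x2

a  <- aregImpute(~ y + x1 + x2, data=d, n.impute=5)
f  <- fit.mult.impute(y ~ x1 + x2, lrm, a, data=d,
                      lrt=TRUE)           # also stacks completed datasets
an <- processMI(f, 'anova')               # combine LR chunk tests
prmiInfo(an)                              # imputation-adjustment parameters
f  <- LRupdate(f, an)                     # update LR stats and pseudo R^2
an
```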
Documentation | CRAN | GitHub | Online
- Examples in an R markdown/knitr html report
- Vignette for general multiparameter transformations using the gTrans function
- Vignettes for Bayesian modeling with rmsb
- An Introduction to the Harrellverse by Nicholas Ollberding
- Linear Regression Case Study by Thomas Love
- Markov models for longitudinal data, here, here, and here
- Many test scripts
- Video demonstrating survplotp interactive survival curves
- Online help with examples
- Changelog and News
- Package overview
- Manual
- Latest source package
- To install: Download and sudo R CMD INSTALL rms_current.tar.gz
- Latest binary packages for Linux, Windows, and Mac arm64 and here
- Notes about \(R^2\) measures
Evolution
rms is an R package that is a replacement for the Design package. The package accompanies FE Harrell’s book Regression Modeling Strategies. It began in 1991 as the S-Plus Design package.
Bug Reports
Please use Issues on GitHub.