```r
require(rms)
require(ggplot2)
# knitrSet(lang='markdown', fig.path='png/')
options(prType='html')
```
General Multi-parameter Transformations in rms
As of version 6.2-0 of the `rms` package, which appeared on CRAN on 2021-03-18, there is a new transformation function `gTrans`, joining `lsp`, `rcs`, `pol`, etc. `gTrans` takes a numeric, character, or factor vector as its first argument and a function as its second argument. The latter can be any function that creates a numeric matrix. Give the columns of the matrix names if you don’t want to use default names. If you want to take advantage of automatic tests of nonlinearity when running `rms::anova.rms` (e.g., by issuing `anova(fit)`), give the returned matrix an attribute `nonlinear`, which is a vector of integers specifying which columns correspond to nonlinear terms. Columns default to being considered linear terms. If you want the matrix you generate to have a linear term, it is up to you to add a column containing the original (numeric) vector.
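As a minimal sketch of this contract (the function and column names here are hypothetical, not part of `rms`), a `gTrans`-compatible basis generator pairing a linear term with a log term could look like this:

```r
# Hypothetical gTrans basis generator: column 1 is the original (linear)
# vector; column 2 is a named nonlinear term.  The 'nonlinear' attribute
# tells anova() which columns to include in tests of nonlinearity.
logBasis <- function(x) {
  z <- cbind(x, logx = log1p(x))   # log1p(x) = log(1 + x)
  attr(z, 'nonlinear') <- 2        # column 2 is a nonlinear term
  z
}
```

A fit would then be specified as, e.g., `ols(y ~ gTrans(x, logBasis))`.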
`gTrans` is quite general. With it you can do things such as the following:

- use a linear spline basis different from the one `lsp` uses
- use a cubic spline basis different from the one `rcs` uses
- use a quadratic spline
- specify a linear, quadratic, or spline function with discontinuities
- implement non-spline bases such as fractional polynomials
- add cyclic trends to the regression
- pool certain categories of a categorical predictor
Examples where you might want to fit a discontinuous function are interrupted time series and allowing for special effects at zero. As an example of the latter, suppose that a predictor is the number of days that a patient has been in the hospital in the past year. Zero has a special meaning, so one may wish to allow for a spike at zero by using `gTrans(x, function(x) cbind(x == 0, sqrt(x), x))`. This fits the following model:

\[Y = \beta_{0} + \beta_{1}[x = 0] + \beta_{2}\sqrt{x} + \beta_{3}x\]

The reason to use `gTrans` instead of creating a separate model term for the discontinuity is that all the predictor’s basis functions remain connected. This is important for several reasons, including the following.
- When studying the effect of varying \(x\), the terms will be forced to vary together. This will force, for example, \(\beta_1\) to be ignored when \(x > 0\) where one is studying the effect of \(x\) on the continuous part of the model. This applies to partial effect plots, odds ratio charts, nomograms, etc.
- When using `anova(fit)` to do a “chunk” test for whether \(x\) is associated with \(Y\), the 1 d.f. for the discontinuous term will be added to the 2 d.f. for \(\beta_2\) and \(\beta_3\) to get the correct 3 d.f. test for flatness (no association). `anova` can also provide chunk tests of nonlinearity when you keep the terms together.
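To make the spike-at-zero idea concrete, the anonymous function from `gTrans(x, function(x) cbind(x == 0, sqrt(x), x))` can be evaluated by hand to see exactly which columns the fit receives (a sketch; the helper name `spike0` is just for illustration):

```r
# The spike-at-zero basis from the model above:
# column 1 = indicator [x = 0], column 2 = sqrt(x), column 3 = linear x
spike0 <- function(x) cbind(x == 0, sqrt(x), x)
spike0(c(0, 1, 4))
# row for x = 0 is (1, 0, 0); row for x = 4 is (0, 2, 4)
```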
Here are some examples with graphical output. Start by simulating some random data.
```r
n <- 40
set.seed(1)
y  <- runif(n)
x1 <- runif(n)
x2 <- rnorm(n)
g  <- sample(letters[1:4], n, TRUE)
# We usually plot predicted values from the 10th smallest to the 10th largest observations
# Instead for demonstrating fits we plot over the whole range
dd <- datadist(x1, x2, g, q.display=c(0,1)); options(datadist='dd')
```
Quadratic Fit Duplicating `pol(x, 2)`
```r
f <- ols(y ~ pol(x1, 2) * g + x2)

pol2 <- function(x) {
  # Give a custom name xsq for one column
  z <- cbind(x, xsq=x^2)
  attr(z, 'nonlinear') <- 2
  z
}
h <- ols(y ~ gTrans(x1, pol2) * g + x2)
specs(h, long=TRUE)
```
```
ols(formula = y ~ gTrans(x1, pol2) * g + x2)

       Assumption   Parameters                      d.f.
x1     gTrans                                       2
g      category     a b c d                         3
x2     asis                                         1
x1 * g interaction  nonlinear x linear - f(A)B      6

                       x1    g          x2
Low:effect      0.3377484 <NA> -0.59514731
Adjust to       0.4985898    c  0.05117174
High:effect     0.7787237 <NA>  0.57997131
Low:prediction  0.0233312    a -1.80495863
High:prediction 0.9606180    d  2.40161776
Low             0.0233312    a -1.80495863
High            0.9606180    d  2.40161776
```
```r
rbind(coef(f), coef(h))
```

```
     Intercept        x1     x1^2      g=b       g=c        g=d         x2
[1,] 0.7901302 -1.227246 1.002612 0.250034 -1.225043 -0.0724312 0.02494889
[2,] 0.7901302 -1.227246 1.002612 0.250034 -1.225043 -0.0724312 0.02494889
       x1 * g=b x1^2 * g=b x1 * g=c x1^2 * g=c   x1 * g=d x1^2 * g=d
[1,] -0.1810583 0.06080463  4.38774  -3.491963 -0.1317308  0.5119778
[2,] -0.1810583 0.06080463  4.38774  -3.491963 -0.1317308  0.5119778
```
```r
anova(f)
```

Analysis of Variance for y

| | d.f. | Partial SS | MS | F | P |
|---|---|---|---|---|---|
| x1 (Factor+Higher Order Factors) | 8 | 0.4691615 | 0.05864518 | 0.69 | 0.6936 |
| All Interactions | 6 | 0.2900861 | 0.04834768 | 0.57 | 0.7489 |
| Nonlinear (Factor+Higher Order Factors) | 4 | 0.2503327 | 0.06258318 | 0.74 | 0.5727 |
| g (Factor+Higher Order Factors) | 9 | 0.5117214 | 0.05685793 | 0.67 | 0.7260 |
| All Interactions | 6 | 0.2900861 | 0.04834768 | 0.57 | 0.7489 |
| x2 | 1 | 0.0125370 | 0.01253700 | 0.15 | 0.7031 |
| x1 × g (Factor+Higher Order Factors) | 6 | 0.2900861 | 0.04834768 | 0.57 | 0.7489 |
| Nonlinear | 3 | 0.1705474 | 0.05684914 | 0.67 | 0.5763 |
| Nonlinear Interaction : f(A,B) vs. AB | 3 | 0.1705474 | 0.05684914 | 0.67 | 0.5763 |
| TOTAL NONLINEAR | 4 | 0.2503327 | 0.06258318 | 0.74 | 0.5727 |
| TOTAL NONLINEAR + INTERACTION | 7 | 0.4666141 | 0.06665916 | 0.79 | 0.6029 |
| REGRESSION | 12 | 0.7834533 | 0.06528777 | 0.77 | 0.6724 |
| ERROR | 27 | 2.2818581 | 0.08451326 | | |
```r
anova(h)
```

Analysis of Variance for y

| | d.f. | Partial SS | MS | F | P |
|---|---|---|---|---|---|
| x1 (Factor+Higher Order Factors) | 8 | 0.4691615 | 0.05864518 | 0.69 | 0.6936 |
| All Interactions | 6 | 0.2900861 | 0.04834768 | 0.57 | 0.7489 |
| Nonlinear (Factor+Higher Order Factors) | 4 | 0.2503327 | 0.06258318 | 0.74 | 0.5727 |
| g (Factor+Higher Order Factors) | 9 | 0.5117214 | 0.05685793 | 0.67 | 0.7260 |
| All Interactions | 6 | 0.2900861 | 0.04834768 | 0.57 | 0.7489 |
| x2 | 1 | 0.0125370 | 0.01253700 | 0.15 | 0.7031 |
| x1 × g (Factor+Higher Order Factors) | 6 | 0.2900861 | 0.04834768 | 0.57 | 0.7489 |
| Nonlinear | 3 | 0.1705474 | 0.05684914 | 0.67 | 0.5763 |
| Nonlinear Interaction : f(A,B) vs. AB | 3 | 0.1705474 | 0.05684914 | 0.67 | 0.5763 |
| TOTAL NONLINEAR | 4 | 0.2503327 | 0.06258318 | 0.74 | 0.5727 |
| TOTAL NONLINEAR + INTERACTION | 7 | 0.4666141 | 0.06665916 | 0.79 | 0.6029 |
| REGRESSION | 12 | 0.7834533 | 0.06528777 | 0.77 | 0.6724 |
| ERROR | 27 | 2.2818581 | 0.08451326 | | |
```r
summary(f)
```

Effects — Response: y

| | Low | High | Δ | Effect | S.E. | Lower 0.95 | Upper 0.95 |
|---|---|---|---|---|---|---|---|
| x1 | 0.3377 | 0.7787 | 0.441 | 0.16810 | 0.16690 | -0.1743 | 0.5105 |
| x2 | -0.5951 | 0.5800 | 1.175 | 0.02932 | 0.07612 | -0.1269 | 0.1855 |
| g --- a:c | 3.0000 | 1.0000 | | -0.09457 | 0.19360 | -0.4919 | 0.3028 |
| g --- b:c | 3.0000 | 2.0000 | | 0.08031 | 0.20190 | -0.3339 | 0.4945 |
| g --- d:c | 3.0000 | 4.0000 | | -0.10540 | 0.20090 | -0.5177 | 0.3069 |
```r
summary(h)
```

Effects — Response: y

| | Low | High | Δ | Effect | S.E. | Lower 0.95 | Upper 0.95 |
|---|---|---|---|---|---|---|---|
| x1 | 0.3377 | 0.7787 | 0.441 | 0.16810 | 0.16690 | -0.1743 | 0.5105 |
| x2 | -0.5951 | 0.5800 | 1.175 | 0.02932 | 0.07612 | -0.1269 | 0.1855 |
| g --- a:c | 3.0000 | 1.0000 | | -0.09457 | 0.19360 | -0.4919 | 0.3028 |
| g --- b:c | 3.0000 | 2.0000 | | 0.08031 | 0.20190 | -0.3339 | 0.4945 |
| g --- d:c | 3.0000 | 4.0000 | | -0.10540 | 0.20090 | -0.5177 | 0.3069 |
```r
ggplot(Predict(f))
ggplot(Predict(h))
```
```r
k1 <- list(x1=c(.2, .4), g='b')
k2 <- list(x1=c(.2, .4), g='d')
contrast(f, k1)
```

```
   x1 g         x2    Contrast      S.E.      Lower     Upper     t Pr(>|t|)
1 0.2 b 0.05117174  0.01218642 0.3892210 -0.7864292 0.8108020  0.03   0.9753
2 0.4 b 0.05117174 -0.14186453 0.3835685 -0.9288821 0.6451531 -0.37   0.7144

Error d.f.= 27
Confidence intervals are 0.95 individual intervals
```

```r
contrast(h, k1)
```

```
   x1 g         x2    Contrast      S.E.      Lower     Upper     t Pr(>|t|)
1 0.2 b 0.05117174  0.01218642 0.3892210 -0.7864292 0.8108020  0.03   0.9753
2 0.4 b 0.05117174 -0.14186453 0.3835685 -0.9288821 0.6451531 -0.37   0.7144

Error d.f.= 27
Confidence intervals are 0.95 individual intervals
```

```r
contrast(f, k1, k2)
```

```
   x1         x2  Contrast      S.E.      Lower     Upper    t Pr(>|t|)
1 0.2 0.05117174 0.2945528 0.2229484 -0.1628996 0.7520051 1.32   0.1975
2 0.4 0.05117174 0.2305465 0.2028467 -0.1856605 0.6467535 1.14   0.2657

Error d.f.= 27
Confidence intervals are 0.95 individual intervals
```

```r
contrast(h, k1, k2)
```

```
   x1         x2  Contrast      S.E.      Lower     Upper    t Pr(>|t|)
1 0.2 0.05117174 0.2945528 0.2229484 -0.1628996 0.7520051 1.32   0.1975
2 0.4 0.05117174 0.2305465 0.2028467 -0.1856605 0.6467535 1.14   0.2657

Error d.f.= 27
Confidence intervals are 0.95 individual intervals
```
Linear Spline Fit Duplicating `lsp`
```r
f <- ols(y ~ lsp(x1, c(.2, .4)))

# Give custom names for columns
lspline <- function(x) {
  z <- cbind(x, x.2=pmax(x - .2, 0), x.4=pmax(x - .4, 0))
  attr(z, 'nonlinear') <- 2:3
  z
}
h <- ols(y ~ gTrans(x1, lspline))
rbind(coef(f), coef(h))
```

```
     Intercept        x1     x1'      x1''
[1,]  1.051559 -3.999801 5.19094 -1.093075
[2,]  1.051559 -3.999801 5.19094 -1.093075
```
```r
ggplot(Predict(f))
ggplot(Predict(h))
```
```r
anova(f)
```

Analysis of Variance for y

| | d.f. | Partial SS | MS | F | P |
|---|---|---|---|---|---|
| x1 | 3 | 0.4889819 | 0.16299396 | 2.28 | 0.0962 |
| Nonlinear | 2 | 0.4808427 | 0.24042137 | 3.36 | 0.0459 |
| REGRESSION | 3 | 0.4889819 | 0.16299396 | 2.28 | 0.0962 |
| ERROR | 36 | 2.5763295 | 0.07156471 | | |

```r
anova(h)
```

Analysis of Variance for y

| | d.f. | Partial SS | MS | F | P |
|---|---|---|---|---|---|
| x1 | 3 | 0.4889819 | 0.16299396 | 2.28 | 0.0962 |
| Nonlinear | 2 | 0.4808427 | 0.24042137 | 3.36 | 0.0459 |
| REGRESSION | 3 | 0.4889819 | 0.16299396 | 2.28 | 0.0962 |
| ERROR | 36 | 2.5763295 | 0.07156471 | | |
Let’s use a different linear spline basis where the parameters refer to slopes and not to slope increments.
```r
curtail <- function(x, a, b=Inf) pmax(pmin(x, b) - a, 0)
lspline <- function(x) cbind(curtail(x, 0, 0.2), curtail(x, 0.2, 0.4), curtail(x, 0.4))
h <- ols(y ~ gTrans(x1, lspline))
rbind(coef(f), coef(h))
```

```
     Intercept        x1     x1'        x1''
[1,]  1.051559 -3.999801 5.19094 -1.09307531
[2,]  1.051559 -3.999801 1.19114  0.09806455
```

```r
ggplot(Predict(h))
```
More Exotic Linear Splines
Fit a straight line from `x1` = 0.1 onward, but force a flat relationship for `x1` in [0, 0.1]. First do it forcing continuity at `x1` = 0.1.
```r
h    <- ols(y ~ pmax(x1, 0.1))
xseq <- c(0, 0.099, 0.1, 0.101, seq(0.2, .8, by=0.1))
ggplot(Predict(h, x1=xseq))
```
Now allow discontinuity without a slope change.
```r
flin <- function(x) cbind(x < 0.1, x)
h  <- ols(y ~ gTrans(x1, flin))
yl <- c(-0.25, 1.25)
ggplot(Predict(h, x1=xseq), ylim=yl) +
  geom_point(aes(x=x1, y=y), data=data.frame(x1, y))
```
Now have a discontinuity with a slope change.
```r
flin <- function(x) cbind(x < 0.1, pmax(x - 0.1, 0))
h <- ols(y ~ gTrans(x1, flin))
ggplot(Predict(h, x1=xseq), ylim=yl) +
  geom_point(aes(x=x1, y=y), data=data.frame(x1, y))
```
Discontinuity with a slope change and initial slope not restricted to be zero.
```r
flin <- function(x) cbind(x < 0.1, x, pmax(x - 0.1, 0))
h <- ols(y ~ gTrans(x1, flin))
ggplot(Predict(h, x1=xseq), ylim=yl) +
  geom_point(aes(x=x1, y=y), data=data.frame(x1, y))
```
Discontinuous linear spline with two knots:
```r
dlsp <- function(x) {
  z <- cbind(x, x >= 0.2, pmax(x - .2, 0), pmax(x - .4, 0))
  attr(z, 'nonlinear') <- 2:4
  z
}
h <- ols(y ~ gTrans(x1, dlsp))
ggplot(Predict(h), ylim=yl)
ggplot(Predict(h, x1=c(.1, .199, .2, .201, .3, .4, 1)), ylim=yl)
```
Discontinuous linear spline with different basis functions:
```r
dlsp <- function(x) {
  z <- cbind(x, x >= 0.2, curtail(x, 0.2, 0.4), curtail(x, 0.4))
  attr(z, 'nonlinear') <- 2:4
  z
}
h <- ols(y ~ gTrans(x1, dlsp))
ggplot(Predict(h), ylim=yl)
```
Categorical Predictor
```r
# Try on a categorical predictor
gr <- function(x) cbind(bc=x %in% c('b','c'), d=x == 'd')
h  <- ols(y ~ gTrans(g, gr))
ggplot(Predict(h, g))
```
Different Basis Functions for Cubic Splines
Next let’s use a B-spline basis for a restricted cubic spline function (natural spline).
```r
require(splines)
f  <- ols(y ~ rcs(x1, 5))
yl <- c(0, 1.25)
ggplot(Predict(f), ylim=yl)
h <- ols(y ~ gTrans(x1, function(x) ns(x, df=4)))
ggplot(Predict(h), ylim=yl)
mean(abs(Predict(f)$yhat - Predict(h)$yhat))
```

```
[1] 0.0354661
```
```r
# Now try to specify the same knots for ns as used for rcs
w  <- rcs(x1, 5)
kn <- attr(w, 'parms')
nspl <- function(x) ns(x, knots=kn[2:4], Boundary.knots=c(kn[1], kn[5]))
# Show the ns basis functions
i <- order(x1)
matplot(x1[i], nspl(x1)[i, ], type='l', xlab='x1', ylab='Natural Spline Bases')
h <- ols(y ~ gTrans(x1, nspl))
ggplot(Predict(h), ylim=yl)
mean(abs(Predict(f)$yhat - Predict(h)$yhat))
```

```
[1] 1.301181e-15
```
Now use the simple truncated power basis for the natural spline, but orthogonalize it by using principal components on centered, SD-scaled data.
```r
# Use knots already computed and stored in kn above
# Compute PCs on rcs and store scaling, centering, and rotation objects
# Note that this is automatic if you specify options(rcspc=TRUE) although not certain that
# post-fitting predicted values will work as well as below
X <- rcspline.eval(x1, kn, inclx=TRUE)   # use low-level function that rcs calls
# Show usual basis functions
matplot(x1[i], X[i, ], type='l', xlab='x1', ylab='rcs Bases')
p <- prcomp(X, scale=TRUE)
matplot(x1[i], p$x[i, ], type='l', xlab='x1', ylab='PC Rotated rcs Bases')
prcs <- function(x, rotation, center, scale, knots) {
  # deparse inside gTrans does not preserve matrix arguments
  rotation <- matrix(rotation, nrow=length(center))
  X <- rcspline.eval(x, knots, inclx=TRUE)
  scale(X, center, scale) %*% rotation
}
# Define parameters inside new function so the function will be self contained
formals(prcs) <- list(x=NULL, rotation=p$rotation, center=p$center, scale=p$scale, knots=kn)
# Show fit with usual basis
f <- ols(y ~ rcs(x1, kn))
ggplot(Predict(f))
# Show fit with rotated basis having no collinearities
h <- ols(y ~ gTrans(x1, prcs))
ggplot(Predict(h))
mean(abs(Predict(f)$yhat - Predict(h)$yhat))
```

```
[1] 2.974648e-14
```
QR-rotations of truncated power basis functions:
```r
# Use the Hmisc qrxcenter function to use the QR decomposition on X
w <- qrxcenter(X)    # also use inside gTrans as in other examples
round(w$R, 2)        # Transformation matrix; 1st column = linear = 3.83*x1
```

```
     [,1]  [,2]   [,3]    [,4]
[1,] 3.83 -7.11   9.96   -9.12
[2,] 0.00  8.19 -93.56  228.61
[3,] 0.00  0.00 142.17 -524.49
[4,] 0.00  0.00   0.00  451.46
```

```r
Xt <- w$x
round(cor(Xt), 3)
```

```
     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    1    0    0
[3,]    0    0    1    0
[4,]    0    0    0    1
```

```r
# Compare with high collinearities in the original bases
round(cor(X), 3)
```

```
      x
x 1.000 0.881 0.851 0.794
  0.881 1.000 0.997 0.977
  0.851 0.997 1.000 0.990
  0.794 0.977 0.990 1.000
```

```r
matplot(x1[i], Xt[i, ], type='l', xlab='x1', ylab='QR Rotated rcs Bases')
```
Non-spline Basis
The `gTrans` framework easily allows for non-spline basis functions for modeling continuous predictors. For example, fractional polynomials can be used directly. Periodic functions may also be used if the cycle length is known. Using the data simulated above, here is an example of adding a short-cycle harmonic series to a linear spline. Here the cycle length is taken to be 0.2. By adding cosine terms one does not need to know the phase angle. A nice resource on adding cyclic trends to regression by Steve Simon is here.
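As a quick illustration of the fractional-polynomial possibility before turning to the harmonic example (a sketch only; the powers (1, ½) and the function name `fp2` are chosen arbitrarily for illustration, and `gTrans` itself provides no fractional-polynomial helper):

```r
# Hypothetical degree-2 fractional polynomial basis with powers (1, 0.5).
# Column 1 is the linear term; column 2 is flagged nonlinear for anova().
fp2 <- function(x) {
  z <- cbind(x, sqrtx = sqrt(x))
  attr(z, 'nonlinear') <- 2
  z
}
# A fit would then look like: h <- ols(y ~ gTrans(x1, fp2))
```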
To render the fitted equation in math notation using the `latex` method in `rms` requires one to specify a function that creates LaTeX notation for the columns of the matrix produced by `gTrans`. The `sprintf` function is most handy for this, as shown below.
```r
# Define a function that will be used by the latex method to customize typesetting of hrm model components
# Argument x will contain the variable's base name (here x1, but in LaTeX notation x_{1} for the subscript)
# tex works only in rms version 6.2-1 and higher
texhrm <- function(x) sprintf(c('%s', '(%s - 0.5)_{+}',
                                '\\sin(2\\pi \\frac{%s}{0.2})', '\\cos(2\\pi \\frac{%s}{0.2})'), x)
hrm <- function(x) {
  z <- cbind(x, slopeChange=pmax(x - 0.5, 0), sin=sin(2*pi*x/0.2), cos=cos(2*pi*x/0.2))
  attr(z, 'nonlinear') <- 2:4
  attr(z, 'tex') <- texhrm
  z
}
h <- ols(y ~ gTrans(x1, hrm))
h
```
Linear Regression Model

```
ols(formula = y ~ gTrans(x1, hrm))
```

| | Model Likelihood Ratio Test | Discrimination Indexes |
|---|---|---|
| Obs 40 | LR χ2 2.45 | R2 0.059 |
| σ 0.2870 | d.f. 4 | R2adj -0.048 |
| d.f. 35 | Pr(>χ2) 0.6532 | g 0.077 |

Residuals

```
     Min       1Q   Median       3Q      Max
-0.43501 -0.19915 -0.04817  0.22900  0.56842
```

| | β | S.E. | t | Pr(>\|t\|) |
|---|---|---|---|---|
| Intercept | 0.7037 | 0.1536 | 4.58 | <0.0001 |
| x | -0.5302 | 0.4000 | -1.33 | 0.1936 |
| slopeChange | 0.8325 | 0.6596 | 1.26 | 0.2153 |
| sin | -0.0238 | 0.0623 | -0.38 | 0.7048 |
| cos | 0.0407 | 0.0728 | 0.56 | 0.5800 |
```r
latex(h)
ggplot(Predict(h)) + geom_point(aes(x=x1, y=y), data=data.frame(x1, y))
```
Computing Environment
```
R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.7

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] ggplot2_3.5.1 rms_6.9-0     Hmisc_5.2-1
```

To cite R in publications use:
R Core Team (2024). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
To cite the Hmisc package in publications use: Harrell Jr F (2024). Hmisc: Harrell Miscellaneous. R package version 5.2-1, https://hbiostat.org/R/Hmisc/.

To cite the rms package in publications use: Harrell Jr FE (2024). rms: Regression Modeling Strategies. R package version 6.9-0, https://github.com/harrelfe/rms, https://hbiostat.org/R/rms/.

To cite the ggplot2 package in publications use: Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.