16 Caching
The workhorse behind R Markdown and Quarto (besides Pandoc) is knitr, which processes the code chunks and properly mingles code with tabular and graphical output. knitr has a built-in caching mechanism so that code is not needlessly re-executed when its inputs have not changed. This easy-to-use process has two disadvantages: the dependencies are not transparent, and the stored cache files may be quite large. I like to take control of caching and to be able to read the stored results with other scripts. To that end, the runifChanged function in the Hmisc package was written. Here is an example of its use. First, a function with no arguments must be composed. This is the (usually slow) function that will be run only if any of a group of listed objects has changed since the last time it was run. When it does need to run, its result is stored in binary form in a user-specified file (the default file name is the name of the current R code chunk with .rds appended).
require(rms)
require(data.table)

g <- function() {
  # Fit a logistic regression model and bootstrap it 500 times, saving
  # the matrix of bootstrapped coefficients
  f <- lrm(y ~ x1 + x2, x=TRUE, y=TRUE, data=dat)
  bootcov(f, B=500)
}
set.seed(3)
n   <- 2000
dat <- data.table(x1=runif(n), x2=runif(n),
                  y=sample(0:1, n, replace=TRUE))
# runifChanged will write runifch.rds if needed (chunk name.rds)
# Will run if dat or source code for lrm or bootcov change
b <- runifChanged(g, dat, lrm, bootcov)
Re-run because of changes in the following objects: bootcov
dim(b$boot.Coef)
[1] 500 3
head(b$boot.Coef)
       Intercept          x1          x2
[1,]  0.02506366 -0.26912787  0.22930212
[2,] -0.02513734 -0.06308701  0.23415528
[3,]  0.15264191 -0.51540301  0.27155256
[4,]  0.18871210 -0.16127618 -0.17868324
[5,]  0.06781028  0.03666227  0.04380128
[6,] -0.01370652 -0.40025695  0.34345943
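
The run-if-changed idea used above can be illustrated with a minimal base R sketch. This is not how Hmisc implements runifChanged; the helper names fingerprint and run_if_changed, the toy function slow, and the choice of an MD5 checksum of the serialized object are all assumptions made for illustration. The sketch stores a fingerprint of the tracked objects next to the result, and re-runs the slow function only when that fingerprint no longer matches.

```r
# Hypothetical helper: fingerprint an object as the MD5 checksum of its
# serialized bytes. Functions are deparsed first so only their code matters.
fingerprint <- function(x) {
  if (is.function(x)) x <- deparse(x)
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(x, NULL), tmp)
  unname(tools::md5sum(tmp))
}

# Hypothetical helper: call fun() only if fun or any tracked object in ...
# has changed since the result was last saved to `file`.
run_if_changed <- function(fun, ..., file) {
  key <- paste(vapply(list(fun, ...), fingerprint, character(1)),
               collapse = "|")
  if (file.exists(file)) {
    cached <- readRDS(file)
    if (identical(cached$key, key)) return(cached$result)
  }
  result <- fun()
  saveRDS(list(key = key, result = result), file)
  result
}

cache_file <- tempfile(fileext = ".rds")
calls <- 0                                   # counts how often slow() really runs
slow  <- function() { calls <<- calls + 1; mean(x) }

x  <- c(1, 2, 3)
r1 <- run_if_changed(slow, x, file = cache_file)  # no cache yet: slow() runs
r2 <- run_if_changed(slow, x, file = cache_file)  # nothing changed: read from cache
x  <- c(1, 2, 30)
r3 <- run_if_changed(slow, x, file = cache_file)  # x changed: slow() runs again

# Any other script can read the stored result directly
readRDS(cache_file)$result
```

Because the cache is an ordinary .rds file, the dependencies are exactly the objects listed in the call, and the stored result can be loaded with readRDS from any other script, which is the kind of control over caching described at the start of this chapter.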