16 Caching
The workhorse behind R Markdown and Quarto (besides Pandoc) is knitr, which processes the code chunks and properly mingles code with tabular and graphical output. knitr has a built-in caching mechanism so that code is not needlessly re-executed when its inputs have not changed. This easy-to-use process does have two disadvantages: the dependencies are not transparent, and the stored cache files may be quite large.

I like to take control of caching. To that end, the runifChanged function was written. Here is an example of its use. First, compose a function with no arguments. This is the (usually slow) function that will be run only if any of a group of listed objects has changed since the last time it was run. When it does need to run, the function produces an object that is stored in binary form in a user-specified file (the default file name is the name of the current R code chunk with .rds appended).
require(rms)
require(data.table)
# Read the source code for the hashCheck and runifChanged functions from
# https://github.com/harrelfe/rscripts/blob/master/hashCheck.r
getRs('hashCheck.r')
g <- function() {
  # Fit a logistic regression model and bootstrap it 500 times, saving
  # the matrix of bootstrapped coefficients
  f <- lrm(y ~ x1 + x2, x=TRUE, y=TRUE, data=dat)
  bootcov(f, B=500)
}
set.seed(3)
n   <- 2000
dat <- data.table(x1=runif(n), x2=runif(n),
                  y=sample(0:1, n, replace=TRUE))
# runifChanged will write runifch.rds if needed (chunk name.rds)
# Will run if dat or source code for lrm or bootcov change
b <- runifChanged(g, dat, lrm, bootcov)
dim(b$boot.Coef)
[1] 500 3
head(b$boot.Coef)
       Intercept          x1          x2
[1,]  0.02007292 -0.30079958  0.32416398
[2,]  0.06150624 -0.35741054  0.25522669
[3,]  0.25225861 -0.40094541  0.09290729
[4,]  0.13766665 -0.48661991  0.19684403
[5,] -0.22018456  0.02132711  0.33973578
[6,]  0.18217417 -0.36140896 -0.04873320
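To make the idea behind this kind of conditional execution concrete, here is a minimal sketch in base R. It is not the code in hashCheck.r. The names fingerprint and run_if_changed are hypothetical, and the sketch simply assumes that an MD5 checksum of each serialized dependency (or of the deparsed source, for a function) is an adequate test for change.

```r
# Hypothetical helper: MD5 fingerprint of an R object using only base R and tools.
# Functions are fingerprinted through their deparsed source, so editing the
# code changes the fingerprint.
fingerprint <- function(obj) {
  if (is.function(obj)) obj <- deparse(obj)
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(obj, NULL), tmp)
  unname(tools::md5sum(tmp))
}

# Hypothetical stand-in for the idea behind runifChanged (an assumption, not its
# actual implementation): run fun() only if fun or a listed object has changed.
run_if_changed <- function(fun, ..., file) {
  # Fingerprint the slow function itself plus every listed dependency
  current <- vapply(c(list(fun), list(...)), fingerprint, character(1))
  if (file.exists(file)) {
    cached <- readRDS(file)
    if (identical(cached$hashes, current)) return(cached$result)
  }
  result <- fun()
  # Store the fingerprints next to the result so the next call can compare them
  saveRDS(list(hashes = current, result = result), file)
  result
}

calls <- 0                                   # counts how often the slow code really runs
slow  <- function() { calls <<- calls + 1; mean(x) }
x     <- c(1, 2, 3)
cache <- tempfile(fileext = ".rds")

r1 <- run_if_changed(slow, x, file = cache)  # no cache file yet: slow() runs
r2 <- run_if_changed(slow, x, file = cache)  # nothing changed: result is read from the file
x  <- c(1, 2, 30)
r3 <- run_if_changed(slow, x, file = cache)  # x changed: slow() runs again
```

Storing the fingerprints in the same file as the result keeps the dependencies explicit: the cached object is reused only while every listed input still matches. Only the result is written to disk, so the file stays as small as the object the function returns.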