Note that the S datasets are named ABM internally to avoid confusion with the variable abm.
These are data on 581 patients having either acute viral (abm=0) or acute bacterial (abm=1) meningitis, from a study done at Duke University Medical Center that was published in Spanos A, Harrell FE, Durack DT (1989): Differential diagnosis of acute meningitis: An analysis of the predictive value of initial observations. JAMA 262: 2700-2707. Note that this is the complete dataset, not the subset of observations having complete data on key variables that was used to fit the multivariable model in the article. Expressions for computing key derived variables are stored as an attribute named derived on the data frame, and a vector of names of the variables used in the final model in the article is contained in an attribute named main.analysis.variables. To create the derived variables, do something like:
attach(ABM)
eval(attr(ABM,'derived'))
To list the formulas for the derived variables without evaluating them, type
attr(ABM, 'derived')
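The attribute mechanism described above can be tried out on a toy data frame. The following is only a sketch in base R: the data values, the column bloodgl, and the derived variable glratio are hypothetical stand-ins, not the actual contents of ABM or of its derived attribute. A private environment is used instead of attach() so that the search path is left untouched.

```r
# Toy stand-in for ABM; bloodgl and glratio are hypothetical names.
toy <- data.frame(gl = c(40, 60), bloodgl = c(80, 100))

# Store a derivation formula and a vector of model variable names as attributes.
attr(toy, 'derived') <- expression(glratio <- gl / bloodgl)
attr(toy, 'main.analysis.variables') <- c('gl', 'glratio')

# List the stored formulas without evaluating them.
print(attr(toy, 'derived'))

# Evaluate the formulas against the data in a private environment.
env <- list2env(as.list(toy))
eval(attr(toy, 'derived'), env)
glratio <- get('glratio', envir = env)
print(glratio)
```

With attach(), as in the text, the derived variables would instead be created in the current workspace.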
If you want to use the CSF/blood glucose ratio as a variable, you will need to create this derived variable before fitting models, because it is derived from more than one input variable (this will also allow multiple imputation on the derived variable). For derived variables involving only a single input variable, it is best to derive them during the model fit. Here is an example:
library(rms)   # lrm() and rcs() come from the rms package in R
# Function to compute no. months from peak of summer
dsummer <- function(x) pmin(abs(x-8), abs(x+12-8))
# Function to compute cube root
cr <- function(x) x^(1/3)
f <- lrm(abm ~ dsummer(month) + rcs(cr(wbc), 4) + rcs(log(gl+1), 5))
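To see what the two helper functions return, they can be evaluated on their own in base R. This sketch redefines them exactly as in the example and applies dsummer to the twelve calendar months; August (month 8) is treated as the peak of summer.

```r
# Months from the peak of summer (August), wrapping around the year end
dsummer <- function(x) pmin(abs(x - 8), abs(x + 12 - 8))
# Cube root
cr <- function(x) x^(1/3)

months <- 1:12
print(dsummer(months))   # 5 6 5 4 3 2 1 0 1 2 3 4
print(cr(c(1, 8, 27)))
```

The second term inside pmin handles January through March, which are closer to the previous August than to the coming one.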
Sometimes it is a good idea to create multiple imputations using only basic variables. Derived variables then need to be recomputed for each imputation. This can be done using the derived parameter of the Hmisc fit.mult.impute function. If you are using only the pre-defined derived variables, you can specify derived=attr(ABM, 'derived') to fit.mult.impute. Note that if a variable is derived from a single variable through a function such as the dsummer function above, that derived variable does not need to be supplied through derived= in the call to fit.mult.impute. But if you compute a variable named dsummer, as is done in attr(ABM,'derived'), you do need to specify the expression in fit.mult.impute(..., derived=...).
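The idea of recomputing derived variables for each imputation can be illustrated without Hmisc. The sketch below is not the fit.mult.impute implementation; it is a base R illustration in which the two "completed" data frames, the column bloodgl, the derived variable glratio, and the helper recompute are all hypothetical. Each completed dataset gets the same derivation expression evaluated against its own imputed values.

```r
# Hypothetical derivation expression, in the style of attr(ABM, 'derived')
derived <- expression(glratio <- gl / bloodgl)

# Two hypothetical completed datasets, differing in one imputed value
completed <- list(
  data.frame(gl = c(40, 55), bloodgl = c(80, 110)),
  data.frame(gl = c(40, 60), bloodgl = c(80, 100))
)

# Evaluate the expression against one dataset and append any new variables
recompute <- function(d, derived) {
  env <- list2env(as.list(d))
  eval(derived, env)
  added <- setdiff(ls(env), names(d))
  for (v in added) d[[v]] <- get(v, envir = env)
  d
}

completed <- lapply(completed, recompute, derived = derived)
print(completed[[2]])
```

A model would then be fitted to each element of completed and the fits combined, which is the step fit.mult.impute is designed to automate.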
In the modeling that was done in the article, the gram smear result was not used in deriving the model, but if the gram smear was known and positive (gram > 0), the predicted Prob(abm) was overridden to 1.0.
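The override rule for the gram smear can be written as a small post-processing step on the model's predicted probabilities. This is a sketch under the assumption that an unknown gram smear is coded as NA; the function name override_gram and the example values are hypothetical.

```r
# Set predicted Prob(abm) to 1.0 when the gram smear is known and positive;
# otherwise keep the model-based probability.
override_gram <- function(p, gram) {
  ifelse(!is.na(gram) & gram > 0, 1.0, p)
}

p    <- c(0.10, 0.80, 0.30)   # hypothetical model predictions
gram <- c(0, 2, NA)           # negative, positive, unknown
print(override_gram(p, gram))
```

Only the second patient, whose smear is known and positive, has the prediction replaced.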