R `Hmisc` Package

Published

October 20, 2024

News

Hmisc version 5.2-0 will appear on CRAN around 2024-10-24. The most significant change is the addition of a highly efficient function for computing the pseudomedian, also known as the Hodges-Lehmann one-sample estimator. It is robust and efficient and is defined as the median of the means of all possible pairs of values (including pairing an observation with itself). The pseudomedian was also added to the output from the describe function, where it appears under the label pMedian.
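To make the definition concrete, here is a minimal brute-force sketch in base R. It takes the pseudomedian to be the median of all pairwise means (Walsh averages), with each observation also paired with itself. The name `pseudomedian_naive` is hypothetical, and this O(n²) version is not the efficient algorithm that Hmisc uses.

```r
# Hypothetical helper, for illustration only (not part of Hmisc).
# Pseudomedian = median of all pairwise means (Walsh averages),
# where each observation is also paired with itself.
pseudomedian_naive <- function(x) {
  x <- x[!is.na(x)]
  w <- outer(x, x, "+") / 2               # matrix of all pairwise means
  median(w[upper.tri(w, diag = TRUE)])    # count each unordered pair once
}

x <- c(1, 2, 3, 10)
mean(x)                 # 4
median(x)               # 2.5
pseudomedian_naive(x)   # 2.75
```

With the outlying value 10 in the sample, the pseudomedian stays close to the median while still averaging information across pairs, which is the robustness property described above.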

Hmisc version 5.1-1 appeared on CRAN on 2023-05-08 and represents a milestone in Hmisc history. Here are the most significant additions and enhancements. Of these, the describe and fit.mult.impute functions have the most extensive enhancements. describe’s print method can be used to make a new table format when continuous and categorical variables are printed separately. Interactive sparklines show category details, e.g., for spike histograms, hovering over a spike will show the bin interval, frequency count, and particular values in the bin if they are few in number. For examples see this and this.

Function	Purpose
`fit.mult.impute`	Added robust cluster sandwich covariance estimation, a `method=` argument, and a stacking method to facilitate likelihood ratio tests with `rms::processMI`
`testCharDateTime`	New function to test whether the majority of values in a character vector are legal dates, times, or date-times
`describe`, `mChoice`	Improved output for multiple choice variables
`vlab`, `hlab`, `hlabs`	Fixed bug, improved logic, and also look in global environment for labels
`spikecomp`	New options that facilitate sparklines
`describe`	`print` method can use `gt` package and include interactive sparklines
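The idea behind the `testCharDateTime` entry, checking whether most values in a character vector are legal dates, can be sketched in base R as follows. This only illustrates the majority rule for a single date format. The function `looks_like_date`, its `format` and `threshold` arguments, and the 0.5 cutoff are assumptions made for the example and do not describe the Hmisc function's interface or behavior.

```r
# Hypothetical helper, for illustration only (not the Hmisc implementation).
# Returns TRUE when more than `threshold` of the non-blank values parse
# as dates in the given format.
looks_like_date <- function(x, format = "%Y-%m-%d", threshold = 0.5) {
  x <- x[!is.na(x) & nzchar(trimws(x))]   # drop missing and blank strings
  if (length(x) == 0) return(FALSE)
  parsed <- as.Date(x, format = format)   # NA where parsing fails
  mean(!is.na(parsed)) > threshold
}

looks_like_date(c("2023-05-08", "2023-04-10", "unknown"))  # TRUE  (2 of 3 parse)
looks_like_date(c("a", "b", "2023-03-05"))                 # FALSE (1 of 3 parses)
```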

Hmisc version 5.1-0, completed on 2023-04-10, adds some major features beyond those in 5.0+. The most significant additions are:

fit.mult.impute: implemented robust sandwich covariance estimates during multiple imputation
better handling of multiple choice mChoice variables in describe and summary
vlab, hlab, hlabs: fixed bug and improved logic, adding a search in the global environment
describe: added completely new print methods for separately printing categorical vs. continuous variables, and for showing frequency distributions (spike histograms for continuous variables) as interactive sparklines, making use of the gt and sparkline packages. For examples see this, where you will also see another new feature: for character variables with too many levels to tabulate, the lowest and highest alphabetic levels are listed, and the min/max/mean character width and mode category are reported.

Hmisc version 5.0-1 was completed 2023-03-05. The source version for a minor update to version 5.0-2 is available below, and binary versions are available for platforms other than the older Mac x86 hardware. Because of the number of new functions, version 5 represents the biggest update in the history of the package, which began in 1991. Another significant change is that Hmisc no longer loads other packages at startup.

The new functions are summarized below.

Function	Purpose
`rendHTML`	Render html text whether running interactively or when rendering a report
`princmp`	Help in interpreting principal components and sparse principal components
`getabd`	Fetch datasets from The Analysis of Biological Data
`runParallel`	Make the `parallel` package easy to use
`hashCheck`	Run `digest::digest` on a series of arguments to create a hash, fetch an existing result file which contains the hash of the input objects the last time an analysis was run, and to return the results stored in the file (an `.rds` file) if the hashes match, or NULL otherwise
`runifChanged`	Re-run code if an input changed, as judged by `hashCheck`
`hlab`	Retrieve plotting-formatted variable label from a current dataset or from the object created by `extractlabs`, which takes priority
`hlabs`	Call `ggplot2` `labs()` after running variable names through `hlab()`
`vlab`	Like `hlab` but returns text string form of label/units
`extractlabs`	For one or more data frames/tables, saves a data table of all variables that had a non-blank label or units attribute
`nCoincident`	Count the number of coincident x,y pairs that are likely to be hidden from view in a scatterplot
`meltData`	Take a formula and `melt` a data frame/table so that all right-hand-side formula variables are played against the left-hand side variable
`ebpcomp`	Compute coordinates of components of an extended box plot. Useful for adding layers to `ggplot2` graphs.
`spikecomp`	Compute coordinates of components of a spike histogram
`movStats`	General function for estimating the relationship between a continuous variable and a response, possibly stratified by another variable, using overlapping moving windows
`combine.levels`	Added `plevels` argument and implemented new capabilities for ordered factors, for which only consecutive levels are allowed to be combined; also added `m` argument for all situations
`completer`	Function by Yong-Hao Pua, Singapore General Hospital, that facilitates drawing of multiple imputations to get one or more completed datasets
`ecdfSteps`	Compute coordinates of empirical CDF with possible domain extension
`fImport`	Front-end for `rio` package for general file import
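The caching pattern described for `hashCheck` and `runifChanged` (hash the inputs, compare with the hash stored in an `.rds` file, and reuse the stored result when they match) can be sketched in base R as below. This is only an illustration of the idea under stated assumptions: the helpers `hash_inputs` and `run_if_changed` are hypothetical, and `tools::md5sum` on a serialized temporary file stands in for `digest::digest`. It is not the Hmisc implementation or its interface.

```r
# Hypothetical helpers, for illustration only (not part of Hmisc).

# Hash a set of R objects by serializing them to a temporary file.
hash_inputs <- function(...) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  writeBin(serialize(list(...), NULL), tmp)
  unname(tools::md5sum(tmp))
}

# Run fun(...) only if the inputs differ from those used last time;
# otherwise return the result stored in `file` (an .rds file).
run_if_changed <- function(fun, ..., file) {
  h <- hash_inputs(...)
  if (file.exists(file)) {
    cached <- readRDS(file)
    if (identical(cached$hash, h)) return(cached$result)  # inputs unchanged
  }
  result <- fun(...)
  saveRDS(list(hash = h, result = result), file)
  result
}

# Usage: count how often the "expensive" function really runs.
calls <- 0
slow_mean <- function(x) { calls <<- calls + 1; mean(x) }
f <- tempfile(fileext = ".rds")

r1 <- run_if_changed(slow_mean, c(1, 2, 3), file = f)  # computed
r2 <- run_if_changed(slow_mean, c(1, 2, 3), file = f)  # read from the cache
r3 <- run_if_changed(slow_mean, c(1, 2, 4), file = f)  # input changed, recomputed
calls  # 2
```

Storing the hash alongside the result in one file keeps the check and the cached value together, so a stale result cannot be paired with a newer hash.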

Package Usage and Examples

R Workflow with many examples of Hmisc function usage
Examples in an Rmarkdown/knitr html document (produced using the readthedown style in the rmdformats package)
Script for examples
Same examples using the distill package, with script here
combplotp example
gbayesSeqSim example and code
Example simulations of Markov longitudinal ordinal data using the simMarkovOrd and related functions
Talk about newer graphics functions
summary* functions

Package Repositories and Updates

GitHub repository
CRAN
Reference manual
Online help with executable examples
Change log
Latest Linux source package
- To install: download the file and run `sudo R CMD INSTALL Hmisc_current.tar.gz`
Latest binary packages for Linux, Windows, and Mac arm64

Bug Reports

Please go to GitHub issues

Mac Issues

If you get

ld: warning: directory not found for option '-L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin18/8.2.0'
ld: warning: directory not found for option '-L/usr/local/gfortran/lib'
ld: library not found for -lgfortran

edit the Makeconf file in /Library/Frameworks/R.framework/Resources/etc, replacing the default FLIBS line (shown commented out below) with one that points to the gcc directory location on your system.

 # FLIBS =  -L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin18/8.2.0 -L/usr/local/gfortran/lib -lgfortran -lquadmath -lm

FLIBS =  -L/usr/local/lib/gcc/11/gcc/x86_64-apple-darwin20/11.1.0 -L/usr/local/lib/gcc/11 -lgfortran -lquadmath -lm

Thanks to John Graves, Vanderbilt University.
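To see which FLIBS setting your R installation currently uses, a small sketch like the following can be run from within R. It assumes the setting lives in the `Makeconf` file under `R.home("etc")`, which is the usual location on macOS and Linux; on other setups the file may sit elsewhere, so the code checks that it exists first.

```r
# Locate R's Makeconf and print any active (uncommented) FLIBS lines.
makeconf <- file.path(R.home("etc"), "Makeconf")
flibs <- character(0)

if (file.exists(makeconf)) {
  flibs <- grep("^[[:space:]]*FLIBS[[:space:]]*=",
                readLines(makeconf), value = TRUE)
  print(flibs)
} else {
  message("No Makeconf found at ", makeconf)
}
```

If the directories named after `-L` in the printed line do not exist on your machine, that is consistent with the linker warnings shown above.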

Page created 2004-02-15
