4  Report Formatting

flowchart LR
Fig[Figures] --> Lay[Layout<br>Size]
Tab[HTML Tables]
Place[Placement] --> places[Margin<br>Tabs<br>Expand/Hide]
rept[reptools] --> rhf[Report Writing<br>Helper Functions]
OF[Overall Format] --> ofs[HTML]
OF --> ltx[LaTeX] --> pdf[pdf]
OF --> mfd[Multi-Format<br>Reports]
MD[Metadata<br>Report<br>Annotations] --> mds[Variable Labels<br>and Units]

A state-of-the-art way to make reproducible reports is to use a statistical computing language such as R and its knitr package in conjunction with either RMarkdown or Quarto, with the latter likely to replace the former. Both of the report-making systems allow one to produce reports in a variety of formats including html, pdf, and Word. Html is recommended because pages can be automatically resized to allow optimum viewing on devices of most sizes, and because html allows for interactive graphics and other interactive components. Pdf is produced by converting RMarkdown or Quarto-produced markdown elements to \(\LaTeX\).

Report formatting is very much enhanced by using variable attributes such as labels and units of measurement that are not considered in base R. Methods for better annotating output using labels and units are given below.

This document can serve as a template for using R with Quarto; one can see the raw script by clicking on Code at the top right of the report. When one has only one output format target, things are fairly straightforward except in some situations where mixed formats are rendered in the same code chunk. Details follow.

To make use of specialized functions that produce html or \(\LaTeX\) markup, one often has to put results='asis' in the code chunk header to keep the system from disturbing the generated html or \(\LaTeX\) markup so that it will be typeset correctly in the final document. This process works smoothly but creates one complication: if you print an object that produces plain text in the same code chunk, the system will try to typeset it in html or \(\LaTeX\). To prevent this you either need to split the chunk into multiple chunks (some with results='asis' and some not) or you need to make it clear that parts of the output are to be typeset verbatim. For the latter, a simple function pr can sense whether results='asis' is in effect for the current chunk. If so, the object is surrounded by the markdown verbatim indicator (three consecutive back ticks). If not, the object is left alone. pr is defined in the markupSpecs$markdown$pr object, so you can bring it into your session by copying it into a local function pr as shown below, in a chunk that has the option results='asis' to show that verbatim output appears anyway. If the argument obj to pr is a data frame or data table, variables will be rounded to the number of decimal places given in the argument dec (default dec=3) before printing. If you specify inline=x the object x is printed with cat() instead of print(); inline is intended mainly for printing character strings.

An example of something that may not render correctly due to results='asis' being in the chunk header (needed for html(...)):

f <- ols(y ~ rcs(x1, 5))
f    # prints model summary in html format
m <- matrix((1:10)/3, ncol=2)
# use pr(obj=m) to fix

Here are examples of pr usage.

pr <- markupSpecs$markdown$pr
x <- (1:5)/7
pr('x:', x)


[1] 0.1428571 0.2857143 0.4285714 0.5714286 0.7142857
[1] 0.1428571 0.2857143 0.4285714 0.5714286 0.7142857
pr(inline=paste(round(x,3), collapse=', '))

0.143, 0.286, 0.429, 0.571, 0.714
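The behavior described above can be sketched in a few lines of R. The function below is a hypothetical stand-in named pr_sketch, not the actual markupSpecs$markdown$pr source. It assumes that knitr exposes the current chunk's results option through knitr::opts_current$get("results"), and it adds an asis argument (also an assumption) so the detection can be overridden when running outside a report.

```r
# Hypothetical sketch of a pr-like helper; not the Hmisc implementation
pr_sketch <- function(obj, dec = 3, asis = NULL) {
  # Detect results='asis' for the current chunk unless the caller overrides it
  if (is.null(asis)) {
    asis <- requireNamespace("knitr", quietly = TRUE) &&
      identical(knitr::opts_current$get("results"), "asis")
  }
  # Round numeric columns of data frames before printing
  if (is.data.frame(obj)) {
    num <- vapply(obj, is.numeric, logical(1))
    obj[num] <- lapply(obj[num], round, digits = dec)
  }
  out   <- utils::capture.output(print(obj))
  fence <- strrep("`", 3)              # markdown verbatim indicator
  if (asis) out <- c(fence, out, fence)
  cat(out, sep = "\n")
  invisible(out)
}

x <- (1:5) / 7
pr_sketch(x, asis = TRUE)    # printed output is wrapped in three back ticks
pr_sketch(x, asis = FALSE)   # printed output is left alone
```

Capturing the printed text first and wrapping it afterwards keeps the sketch independent of how the object's print method works.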

Instead of working to keep certain outputs verbatim you can use knitr::kable() to convert verbatim output to markdown. Also see the yaml df-print html option, for which you may want to set df-print: kable.

knitr/Quarto will by default print data frames and other simple tables using html. To make knitr use plain-text printing instead, put this code at the top of the report to redefine the default knitr printing function.

knit_print <- knitr::normal_print

4.1 Quarto Syntax for Figures

One can specify sizes, layouts, captions, and more using Quarto markup. Captions are ignored unless a figure is given a label. Figure labels must begin with fig-. The figure can be cross-referenced elsewhere in the document using for example See \@fig-scatterplot. Figure will be placed in front of the figure number automatically. Here is example syntax.

#| label: fig-myplot
#| fig-cap: "An example caption (use one long line for caption)"
#| fig-height: 3
#| fig-width: 4
plot(1:7, abs(-3 : 3))

If the code produces multiple plots you can combine them into one with a single overall caption and include subcaptions for the individual panels:

#| label: fig-myplot
#| fig-cap: "Overall caption …"
#| fig-height: 3
#| fig-width: 4
#| layout-ncol: 2
#| fig-subcap:
#|   - "Subcaption for panel (a)"
#|   - "Subcaption for panel (b)"
plot(1:7, abs(-3 : 3))

To include an existing image while making use of Quarto for sizing and captioning etc. use this example.

```{r out.width="600px"}
#| label: fig-mylabel
#| fig-cap: "…"
```

If you don’t need to caption or cross-reference the figure use e.g.

Other examples are in the next section.

The reptools repository has helper functions for building a table of figures. To use those, put addCap() or addCap(scap="short caption for figure") as the first line of code in the chunk. The full caption is taken from the fig-cap: markup. If you don’t specify scap to addCap, the short caption will be taken from the fig-scap: markup, or if that is missing, from the full caption. At the end of the report you can print the table of figures using the following syntax (but surround the last line with back ticks).

# Figures

r printCap()

For chunks having #| label: fig- you can automatically have knitr call addCap at the start of a chunk, extracting the needed information, if you run the reptools function hookaddcap() in a chunk before the first chunk that produces a graph. This procedure is used throughout this book. addCap makes use of fig-scap: for short captions.
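The short-caption fallback rule described above (explicit scap, then fig-scap:, then the full caption) can be illustrated with a small caption registry. This is only a sketch under stated assumptions: add_cap_sketch, print_cap_sketch, and the fig_scap argument are hypothetical names, and the real addCap and printCap functions in reptools obtain the label and captions from the chunk options rather than from arguments.

```r
# Hypothetical caption registry; not the reptools implementation
cap_store <- new.env()
cap_store$caps <- data.frame(label = character(), short = character(),
                             full  = character())

add_cap_sketch <- function(label, cap, scap = NULL, fig_scap = NULL) {
  # Short caption: explicit scap, else the fig-scap value, else the full caption
  short <- if (!is.null(scap)) scap else if (!is.null(fig_scap)) fig_scap else cap
  cap_store$caps <- rbind(cap_store$caps,
                          data.frame(label = label, short = short, full = cap))
  invisible(short)
}

# Return the table of figures (labels with short captions)
print_cap_sketch <- function() cap_store$caps[, c("label", "short")]

add_cap_sketch("fig-a", "A long caption for the first figure", scap = "First")
add_cap_sketch("fig-b", "A long caption for the second figure")
print_cap_sketch()
```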

4.2 Quarto Built-in Syntax for Enhancing R Output

Helper functions described below allow one to enhance graphical and tabular R output by taking advantage of Quarto formatting features. These functions allow one to produce different formats within one code chunk, e.g., a plot in the margin and a table in a collapsible note appearing after the code chunk. But if you need only one output format within a chunk you can make use of built-in syntax as described here. The yaml-like syntax also allows you to specify heights and widths for figures, plus multi-figure layouts.

Here is some example code with all the markup shown.

#| column: margin
#| fig-height: 1
#| fig-width: 3
par(mar=c(2, 2, 0, 0), mgp=c(2, .5, 0))
x <- rnorm(1000)
hist(x, nclass=40, main='')
x[1:3] # ordinary output stays put
knitr::kable(x[1:3]) # html output put in margin
hist(x, main='')

The results follow.

par(mar=c(2, 2, 0, 0), mgp=c(2, .5, 0))
x <- rnorm(1000)
hist(x, nclass=40, main='')

x[1:3]               # ordinary output stays put
[1] -0.6264538  0.1836433 -0.8356286
knitr::kable(x[1:3]) # html output put in margin
hist(x, main='')

Here are a few markups for figure layout inside R chunks.

Wide page (takes over the margins) and put multiple plots in 1 row:

#| column: screen-inset
#| layout-nrow: 1

When plotting 3 figures put the first 2 in one row and the third in the second row and make it wide.

#| layout: [[1,1], [1]]

Make the top left panel be wider than the top right one.

#| layout: [[70,30], [100]]

Top left and top right panels have equal widths but devote 0.1 of the total width to an empty region between the two top panels.

#| layout: [[45, -10, 45], [100]]
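The arithmetic behind these layout vectors is simple: within a row, each number is a relative width, and a negative number reserves that much blank space. A minimal sketch, using a hypothetical helper named layout_fractions that is not part of Quarto or reptools:

```r
# Convert one row of a Quarto layout vector into width fractions
# (negative entries are treated as blank space between panels)
layout_fractions <- function(row) {
  data.frame(kind     = ifelse(row < 0, "gap", "panel"),
             fraction = abs(row) / sum(abs(row)))
}

layout_fractions(c(45, -10, 45))  # two equal panels separated by a gap
layout_fractions(c(70, 30))       # wider left panel, narrower right panel
```

For the row 45, -10, 45 the gap fraction is 10/100 = 0.1, matching the description above.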

See here for details about figure specifications inside code chunks.

You can put some .aside information to the right of R output.

4.3 Quarto Report Writing Helper Functions

Helper functions are defined when you run the Hmisc function getRs to retrieve them from GitHub, i.e., getRs('reptools.r'). You can get help on these functions by running rsHelp(functionname). Several of the functions construct Quarto callouts, which are fenced-off sections of markup that trigger special formatting, especially when producing html. The special formatting includes collapsible sections and marginal notes. Here is a summary of some of the reptools helper functions.

| Function | Purpose |
|:---|:---|
| dataChk | run a series of logical expressions for checking data consistency, put results in separate tabs using maketabs, and optionally create two summary tabs |
| dataOverview | runs a data overview report |
| missChk | creates a series of analyses of the extent and patterns of missing values in a data table or data frame, and puts graphical summaries in tabs |
| hookaddcap | makes knitr automatically extract figure labels, captions, short captions for use in list of figures |
| htmlList | print a named list using the names as headers |
| kabl | front-end to knitr::kable and kables. If you run kabl on more than one object it will automatically call kables. |
| makecallout | generic Quarto callout maker used by makecnote, makecolmarg |
| makecnote | print objects or run code and place output in an initially collapsed callout note |
| makecolmarg | print objects or run code and place output in a marginal note |
| maketabs | print objects or run code placing output in separate tabs |
| makemermaid | makes a mermaid diagram with R variable values included in the diagram |
| makegraphviz | similar to makemermaid but using graphviz |
| varType | classify variables in a data table/frame or a vector as continuous, discrete, or non-numeric non-discrete |
| conVars | use varType to extract list of continuous variables |
| disVars | use varType to extract list of discrete variables |
| vClus | run Hmisc::varclus on a dataset after reducing it |

The input to maketabs, as will be demonstrated later, may be a named list, or more commonly, a series of formulas whose right-hand sides are executed and the result of each formula is placed in a separate tab. The left side of the formula becomes the tab label. For makecolmarg there should be no left side of the formula as marginal notes are not labeled. For the named list option the list names become the tab names. Examples of both approaches appear later in this report. In formulas, a left side label must be enclosed in back ticks and not quotes if it is a multi-word string. A wide argument is used to expand the width of the output outside the usual margins. An initblank argument creates a first tab that is empty. This allows one to show nothing until one of the other tabs is clicked. Alternately you can specify as the first formula ` ` ~ ` `.

See this for another way to generate tabs.

The two approaches to using maketabs also apply to makecnote and makecolmarg. Examples of the “print an object and place it inside a callout” approach are given later in the report for makecnote and makecolmarg. Here is an example of the more general formula method that can render any object, including html widgets as produced by plotly graphics. An interactive plotly graphic appears at the bottom of the plots in the right margin. You can single click on elements in the legend to turn them off and on, and double click within the legend to restore default values.

options(plotlyauto=TRUE)  # makes Hmisc use plotly's auto size option
                          # rather than computing height, width
x <- round(rnorm(100, 100, 15))
makecolmarg(~ table(x) + raw + hist(x) + plot(ecdf(x)) + histboxp(x=x))
 67  70  73  77  78  79  81  82  83  84  86  87  88  89  90  91  92  93  94  95 
  1   1   1   1   1   1   2   1   1   1   1   1   1   3   1   6   1   3   3   2 
 96  98  99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 116 117 
  1   6   5   3   2   1   2   2   4   5   2   2   6   2   3   3   1   2   1   3 
118 120 121 122 123 124 130 133 136 
  2   1   1   1   1   2   1   1   1 

# or try makecnote(`makecnote example` ~ kabl(table(x)) + hist(x) + ...
# Avoid raw by using kabl(table(x)) instead of table(x)

Adding + raw to a formula in makecnote, makecolmarg, or maketabs forces printed results to be treated as raw verbatim R output.

makecallout is a general Quarto callout maker that implements different combinations of the following: list or formula, print or run code, defer executing and only produce the code to execute vs. running the code now, and close the callout or leave it open for more calls.

reptools also has helper functions for interactively accessing information to help in report and analysis building:

| Function | Purpose |
|:---|:---|
| htmlView | view html-converted objects in RStudio View pane |
| htmlViewx | view html-converted objects in external browser |

4.4 Multi-Output Format Reports

To allow one report to be used to render multiple output formats, especially html and pdf, it is helpful to be able to sense which output format is currently in play, and to use different functions or options to render output explicitly for the current format. Here is how to create variables that can be referenced simply in code throughout the report, and to invoke the plotly graphics package if output is in html to allow interactivity. A small function ggp is defined so that if you run any ggplot2 output through it, the result will be automatically converted to plotly using the ggplotly function, otherwise it is left at standard static ggplot2 output if html is not the output target.

See this for examples of articles rendered in both html and PDF from the same script.
outfmt <- if(knitr::is_html_output ()) 'html'  else 'pdf'
markup <- if(knitr::is_latex_output()) 'latex' else 'html'
ishtml <- outfmt == 'html'
if(ishtml) require(plotly)
ggp <- if(ishtml) ggplotlyr else function(ggobject, ...) ggobject
# See below for more about ggplotlyr (a front end for ggplotly that can
# correct a formatting issue with hover text)

Quarto has an excellent facility for conditionally including document sections depending on the currently chosen output format.

The Hmisc, rms, and rmsb packages have a good deal of support for creating \(\LaTeX\) output in addition to html. They require some special \(\LaTeX\) packages to be accessed. In addition, if using any of Quarto’s nice features for making marginal notes, there is another \(\LaTeX\) package to attach. Below you’ll find what needs to be added to the yaml prologue at the top of your script if using Quarto. You have to modify pdf-engine to suit your needs. I use luatex because it handles special unicode characters. In the future (approximately July 2022) a bug in Pandoc will be fixed and you can put links-as-notes: true in the yaml header instead of redefining href and linking in hyperref.

    self-contained: true
    . . .
    pdf-engine: lualatex
    toc: false
    number-sections: true
    number-depth: 2
    top-level-division: section
    reference-location: document
    listings: false
      \usepackage{marginnote, here, relsize, needspace, setspace, hyperref}

The href redefinition above turns URLs into footnotes if running \(\LaTeX\).

There is one output element provided by Quarto that will not render correctly to \(\LaTeX\): a marginal note using the markup .column-margin. To automatically use an alternate in-body format, define a function that can be used for both typesetting formats.

mNote <- if(ishtml) '.column-margin' else
                    '.callout-note appearance="minimal"'

Then use r mNote enclosed in back ticks in place of the .column-margin callout for generality.

Even when producing only html, one may wish to save individual graphics for manuscript writing. For non-interactive graphics you can right click on the image and download the .png file. For interactive plots, plotly shows a “take a snapshot” icon when you hover over the image. Clicking this icon will produce a static .png snapshot of the graph. Some graphs are not appropriate for static documents, and the variables created in the code above can be checked so that, for example, an alternative graph can be produced when making a .pdf file. But in other cases one just produces an additional static plot that is not shown in the html report. See the margin note near @fig-survplotp for an example.

As done with various Hmisc and rms package functions, one can capitalize on Hmisc’s special formatting of variable labels and units when constructing tables in \(\LaTeX\) or html. The basic constructs are shown in the code below.

# Retrieve a set of markup functions depending on typesetting format
# See above for definition of ishtml
specs    <- markupSpecs[[if(ishtml) 'html' else 'latex']]
# Hmisc markupSpecs functions create plain text, html, latex,
# markdown, or plotmath code
varlabel <- specs$varlabel  # retrieve an individual function
# Format text describing variable named x
# hfill=TRUE typesets units to be right-justified in label
# Use the following character string as a row label
# default specifies the string to use if there is no label
# (usually taken as the variable name)
varlabel(label(x, default='x'), units(x), hfill=TRUE)
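To make the idea concrete without depending on Hmisc, here is a base-R sketch that stores a label and units as plain attributes and combines them into a row label for either typesetting format. The helper name row_label and the exact markup it emits are assumptions for illustration only; Hmisc's varlabel functions produce their own, more refined markup.

```r
# Hypothetical helper: NOT Hmisc's varlabel, just the same idea in base R
row_label <- function(x, name, html = TRUE) {
  lab <- attr(x, "label")
  if (is.null(lab)) lab <- name        # fall back to the variable name
  un <- attr(x, "units")
  if (is.null(un)) return(lab)         # no units: the label alone
  if (html) {
    paste0(lab, " <small>", un, "</small>")
  } else {
    paste0(lab, " {\\small ", un, "}")
  }
}

age <- c(21, 35, 48)
attr(age, "label") <- "Age"
attr(age, "units") <- "years"

row_label(age, "age")                 # html markup
row_label(age, "age", html = FALSE)   # LaTeX markup
row_label(1:3, "z")                   # no label or units: variable name only
```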

Note: As of 2022-12-11 quarto has withdrawn support for tooltips. I hope that is added back someday.

As exemplified in @sec-doverview, Mermaid provides an easy way to make many types of diagrams. Diagrams are more valuable when they are dynamic. Mermaid provides an easy way to include pop-up tooltips in diagram nodes, to provide deeper information about the node. When the tooltips contain tables whose columns need to line up, you need to put the following in your document so that tooltips will use a fixed-width font and preserve white space. The best way to include this is to put it in a .css file that is referenced in the report’s yaml, or to surround the four lines with <style></style>.

.mermaidTooltip {
      font-family: courier;
      white-space: pre;
}

4.5 HTML Tables

Nicely formatted tables can be created in multiple ways:

  • using customized code that directly writes html markup
  • using customized code that directly writes \(\LaTeX\) markup
  • using customized code that writes markdown markup (e.g., “pipe” tables)
  • hand coding markdown (usually pipe tables)

The latter two provide less flexibility but have the advantage of being automatically converted to html or \(\LaTeX\) depending on your destination format.

Here is an example of a hand coded markdown pipe table. Note (1) the second line of the markup indicates that the first column is to be left-justified and the second column right-justified, and (2) you can include computed values from R expressions.

| This Column | That Column |
|:------------|------------:|
| cat | dog |
| `r pi` | `r 2+3` |
: Table caption

The result is

Table caption

This Column    That Column
cat                    dog
3.1415927                5
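Pipe tables can also be generated by code rather than typed by hand. The sketch below uses a hypothetical base-R helper, pipe_table, to show how the header, the alignment row, and the data rows fit together. In practice knitr::kable with format = "pipe" does this job; the helper here exists only to make the structure visible.

```r
# Hypothetical helper that writes a markdown pipe table from a data frame
pipe_table <- function(d, align = rep("l", ncol(d)), caption = NULL) {
  line <- function(cells) paste0("| ", paste(cells, collapse = " | "), " |")
  rule <- ifelse(align == "r", "---:", ":---")   # alignment row entries
  rows <- vapply(seq_len(nrow(d)), function(i)
    line(vapply(d[i, , drop = FALSE], format, character(1))),
    character(1))
  out <- c(line(names(d)), line(rule), rows)
  if (!is.null(caption)) out <- c(out, paste(":", caption))
  out
}

d <- data.frame(`This Column` = c("cat", "3.14"),
                `That Column` = c("dog", "5"),
                check.names = FALSE)
cat(pipe_table(d, align = c("l", "r"), caption = "Table caption"), sep = "\n")
```

Emit the result inside a chunk with results='asis' so that it is typeset as a table rather than shown verbatim.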

There is an automatic feature of html that makes it especially attractive as a destination format: If a cell contains a long string of characters, those strings will be line-wrapped appropriately, with the line length depending on the width of the display device.

The knitr package kable function provides an easy way to produce html tables from data tables/frames and matrices, and knitr::kables allows one to put several tables together. The reptools repository kabl function combines the features of kable and kables. The kableExtra package allows you to greatly extend what kable can do.

There are many R packages and functions for making advanced html tables. See for example the Table 1 tab in Chapter 9. This table was produced by the Hmisc package summaryM function, which used the htmlTable function in the htmlTable package. Other packages to consider are tangram and packages discussed here.

4.6 CSS

When producing reports in html, you can create custom html styles that quarto will use. These styles are defined using HTML5’s CSS (cascading style sheets). An example .css file is at hbiostat.org/rflow/h.css, and your report may gain access to such a .css file by including a line like css: h.css in the top-level quarto yaml header under the html: section.

Two of the styles defined by h.css are smaller and smaller2. smaller will shrink the font size of a block of text (even one containing code and R output, but it does not apply to tables) to 80% of its original size. smaller2 will make it 64% of the original size. To invoke these styles we use Quarto “divs” as follows:

::: {.smaller2}
This is text that will appear smaller ...
:::


Here is an example using smaller2.

This is text that will appear smaller. More of the same. More of the same. More of the same. More of the same. More of the same. More of the same. More of the same. More of the same. More of the same. More of the same. More of the same. More of the same. More of the same.

2.3 4.5
2.2 3.3
x <- pi
[1] 3.141593

Another style in h.css is quoteit which is useful for including quotations. The text is italicized, dark blue, 80% of regular size, and has 10% left and right margins. Here is an example.

::: {.quoteit}
Some eloquent quote appears here.  The author of the quote is assumed to know what they are talking about, and seem to be able to express themselves.
:::

Some eloquent quote appears here. The author of the quote is assumed to know what they are talking about, and seem to be able to express themselves.

4.7 Diagrams

Quarto builds in two diagramming languages: mermaid and graphviz. Section 8.1 has detailed examples using mermaid, which uses a simpler format than graphviz. graphviz allows for more complex diagrams exemplified here and also provides more control. graphviz nodes can include HTML tables, and you can even have arrows drawn between table cells or between a table cell and other non-table nodes. Here is an example, taken from this excellent post. Connections between diagram elements are made possible by assigning port identifiers to elements.

The chunk header refers to dot which is a primary module of graphviz, for directed graphs.
digraph {
  graph [pad="0.5", nodesep="0.5", ranksep="2"]
  //  splines=ortho for square connections
  node  [shape=plain]

Foo [label=<
<table border="0" cellborder="0" cellspacing="0">
  <tr><td><b><i>InputFoo</i></b></td><td><font color="blue">two</font> </td>   </tr><HR/>
  <tr>  <td port="1">one</td><td> two </td></tr>
  <tr>  <td port="2">two</td><td> two </td></tr>
  <tr>  <td port="3">three</td><td> two </td></tr>
  <tr>  <td port="4">four</td><td> two </td></tr>
  <tr>  <td port="5">five</td><td port="a"> two </td></tr>
  <tr>  <td port="6">six</td><td port="b"> two </td></tr>
</table>>];

Bar [label=<This and that<br/><font face="courier" color="darkblue">and that and <b>that</b></font>>];

Foo:3:w -> Foo:2:w; 
// node name:port:direction (n,ne,e,se,s,sw,w,nw,c,_)
// c=center within node, _=use appropriate node side
// See graphviz.org/docs/attr-types/portPos
Foo:3:w -> Foo:6:w;
Foo:6:w -> Foo:1:w;
Foo:1:w -> Foo:a:e;
Foo:b:e -> Bar;
}

The reptools makegraphviz function allows variable insertions into graphviz diagrams, and if a variable to be inserted is a data frame it will be converted to a simple HTML table that graphviz can handle. Here is an example. {{u}} is the syntax for inserting the value of variable u.

x <- data.frame(x1=round(runif(3), 3), x2=.q(a,b,c))
     x1 x2
1 0.268  a
2 0.219  b
3 0.517  c
z <- 'digraph {node [shape=plain];
  Foo [shape=oval label=<Information about <font color="blue">{{g}}</font>>];
  Bar [label=<{{u}}>];  // add shape=box to box the table
  Foo -> Bar}'
makegraphviz(z, g='states', u=x, file='gvtest.dot')

The diagram is then rendered with a dot chunk containing a special file: gvtest.dot markup.
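The {{u}} insertion mechanism can be sketched in base R as plain text substitution, with data frames first converted to a bare HTML table. The function names fill_template and df_to_html below are hypothetical and this is not the makegraphviz source; it only illustrates the kind of substitution the text describes.

```r
# Convert a data frame to a minimal HTML table (header row plus data rows)
df_to_html <- function(d) {
  row  <- function(cells)
    paste0("<tr>", paste0("<td>", cells, "</td>", collapse = ""), "</tr>")
  body <- vapply(seq_len(nrow(d)), function(i)
    row(vapply(d[i, , drop = FALSE], format, character(1))),
    character(1))
  paste0("<table>", row(names(d)), paste(body, collapse = ""), "</table>")
}

# Replace every {{name}} in the template with the value of the argument `name`
fill_template <- function(template, ...) {
  vals <- list(...)
  for (nm in names(vals)) {
    v <- vals[[nm]]
    if (is.data.frame(v)) v <- df_to_html(v)
    template <- gsub(paste0("{{", nm, "}}"), as.character(v), template,
                     fixed = TRUE)
  }
  template
}

z <- "digraph {Foo [label=<About {{g}}>]; Bar [label=<{{u}}>]; Foo -> Bar}"
cat(fill_template(z, g = "states", u = data.frame(x1 = 0.5, x2 = "a")))
```

Using fixed = TRUE avoids having to escape the braces as regular-expression metacharacters.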


See Section 8.1 for a more advanced graphviz example that is along these lines. See this for some excellent graphviz flowchart examples.