Screen Display of S Objects

Richard M. Heiberger [1]   Frank E. Harrell, Jr. [2]






Abstract:

We describe a set of S functions that will print an S object in its own window on the user's workstation. We discuss design issues in constructing the functions and relating the individual members of the set to each other and to the object-oriented paradigm in the underlying S program. We give several applications.

Key Words: S, Display software, Object-oriented programming




1   Introduction

A data object in S (a data.frame is our typical example) often has structure that was imposed on the raw data file. When we analyze the data, it is helpful to have an image of the data visible in a window separate from the one in which the S session itself is running. Graphical images are already displayed in their own windows. We provide a set of functions to display text images similarly.

For example, assume we are working with a data.frame constructed from an input text file. We have assigned row.names, labeled factors, and otherwise placed logical structure on the input flat file. As we go through levels of analysis, we wish to view the data while we develop the appropriate analysis. For the illustration, we assume we are working with an X-window workstation. We tell S which display device we are using with the options(window="X") statement at the beginning of the S session:

> #X
> options(window="X")
> # X defaults to the xedit program for its display.
We then use the command
> print.display(my.data.frame)
to send a text image of my.data.frame to the display.

There are circumstances when we do not wish to see the entire set of columns within the data.frame, or in which we wish to display a rearranged subset of the columns. In these cases, we say
> print.display(my.data.frame[,c(2,3,10:12,6)])
and a new window will appear with just the desired information.

2   Function Design

The function print.display has been designed as a method for objects of class "display". Any object for which inherits(object,"display") == T is automatically printed with the print.display function. Any other object can be forced to print on the display by explicitly calling print.display. The function takes additional arguments of two types. First, it takes general arguments (width= and length=) that prevent folding of long lines and otherwise take advantage of the scroll bars in the displayed window. Second, it takes device-specific arguments that allow user control of fonts and/or pagination on the display device (X.flags=, lpr.flags=, lp.flags=, pr.flags=). We have provided specific functions in the page.* family of function names for 15 different display devices (window systems, screen editors, typesetters, image display programs, printers). It is easy for a user with a different software preference or hardware availability to add another similar function.
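
The dispatch rule can be illustrated with a minimal sketch, written in R (the open-source dialect of S), so details may differ in S-Plus. It is not the authors' code: print.display here is a stand-in that only writes a banner to the terminal before printing, and as.display and drop.display are hypothetical helpers introduced for the illustration.

```r
# Hypothetical helper: prepend "display" to the class vector of any object.
as.display <- function(x) {
  class(x) <- c("display", class(x))
  x
}

# Hypothetical helper: remove "display" again so the usual print method runs.
drop.display <- function(x) {
  if (inherits(x, "display")) class(x) <- setdiff(class(x), "display")
  x
}

# Stand-in for print.display (an assumption, not the authors' implementation):
# it announces itself and then prints the object at the terminal.
print.display <- function(x, ...) {
  cat("[sent to display]\n")
  print(drop.display(x), ...)
  invisible(x)
}

d <- as.display(data.frame(x = 1:2, y = c("a", "b")))
inherits(d, "display")   # TRUE, so printing d dispatches to print.display
d                        # automatic: the print method for class "display" is used
print.display(1:3)       # explicit: any other object can be sent the same way
```

Only the class vector decides which route is taken; the data inside the object are unchanged.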

The initial impetus was the Vars function (Harrell and Heiberger, 1993), which collected supplementary information about data.frames (class, factor levels, formats, variable labels) and displayed it in a window. The motivation for the present paper was the recognition that Vars() was a combination of two separable functions. First, it queried a data.frame and constructed a summary of the supplementary information. Second, it displayed its results in a window on the display screen. Here, we focus on the design of the display function and use the initial application as an example.

The primary user-level function is print.display. It takes any S object as an argument, prints it to a temporary file (using the S function sink), and then sends the file to an output device using our generic function page. The new generic function page [3] determines the value of options()$window (say it finds "X") and then forwards the temporary file and any additional arguments to the function page.X. The function page.X uses the arguments and any additional options to construct and execute a Unix command for the display. Users will need to set options(window="X"), but will otherwise not generally work directly with the page.* functions.
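
The sequence just described (sink to a temporary file, look up options()$window, forward to the matching pager function) can be sketched as follows. This is a simplified illustration in R, not the distributed code: the names display.sketch, page.sketch, page.sketch.tty, and page.sketch.X are hypothetical, as is the dry.run argument, which returns the Unix command instead of executing it so that the sketch runs without an X server.

```r
# Generic pager: pick page.sketch.<window> according to options()$window.
page.sketch <- function(file, ...) {
  window <- getOption("window")
  if (is.null(window)) window <- "tty"     # assumed default: the terminal
  pager <- get(paste("page.sketch", window, sep = "."), mode = "function")
  pager(file, ...)
}

# Terminal "device": copy the file to standard output, then clean up.
page.sketch.tty <- function(file, remove.file = TRUE, ...) {
  writeLines(readLines(file))
  if (remove.file) unlink(file)
  invisible(file)
}

# X "device": build the Unix command for the pager.
# With dry.run = TRUE (hypothetical argument) the command is only returned.
page.sketch.X <- function(file, pager = "xedit", X.flags = "",
                          dry.run = TRUE, ...) {
  command <- paste(pager, X.flags, file, "&")
  if (dry.run) return(command)
  system(command)
}

# Print any object to a temporary file, then hand the file to the pager.
display.sketch <- function(x, ...) {
  file <- tempfile("display")
  sink(file)
  tryCatch(print(x), finally = sink())     # always restore normal output
  page.sketch(file, ...)
}

options(window = "tty")
display.sketch(data.frame(x = 1:2))        # shown at the terminal
options(window = "X")
display.sketch(data.frame(x = 1:2), X.flags = "-font Courier8")  # command string
```

Because the device is chosen by name at run time, supporting another device in this sketch only requires defining one more page.sketch.* function.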

By default, print.display behaves as if options(window="tty") had been set: the display is at the terminal.

With options(window="emacs"), the pager uses emacsclient and opens another buffer within the current emacs session.

With options(window="X"), there are several options for pagers. The default is "xedit", which comes with the X distribution. Individual workstation manufacturers often distribute their own editors. One publicly available X-window pager is xless, available for download from several ftp sites. Any of these can be used instead. Additional arguments or options allow control of fonts, geometry, and color.

The display construct generalizes to include printers. We provide definitions of window="lpr", window="lp", and window="prlp" with flags to allow setting of Unix pr(1) options.

3   Applications

3.1   Information About a Data Frame

The motivating application was the display of summary information about a data frame. We illustrate with a small example:
> my.data.frame <-  data.frame(x=1:2, y=factor(c("a","b")),
+       q=structure(3:4, label="Z"))
> my.data.frame
  x y q 
1 1 a 3
2 2 b 4
> Vars(my.data.frame)
  Label  Class Levels 
q     Z              
x                    
y       factor    a b
The function Vars returns an object of class "display"; therefore, because the result of the Vars function was not assigned to another object, it goes directly to the display (in this example, the terminal). An alternative usage
> V.my <- Vars(my.data.frame)
> V.my
saves the summary information in the S object V.my. Typing the name V.my calls the implicit print routine, which in turn recognizes that V.my is of class "display" and sends it to the display.
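
To make the query step concrete, here is a rough sketch in R of how such a summary could be assembled. It is an assumption-laden re-creation, not the Vars function itself: vars.sketch is a hypothetical name, it returns a plain character matrix rather than an object of class "display", and it reports only the label, class, and levels.

```r
# Hypothetical re-creation of the kind of summary shown above.
vars.sketch <- function(data) {
  describe <- function(v) {
    label <- attr(v, "label")
    c(Label  = if (is.null(label)) "" else as.character(label),
      Class  = paste(oldClass(v), collapse = " "),   # "" when no class is set
      Levels = paste(levels(v), collapse = " "))     # "" for non-factors
  }
  out <- t(sapply(data, describe))          # one row per variable
  out[order(rownames(out)), , drop = FALSE] # alphabetical, as in the example
}

my.data.frame <- data.frame(x = 1:2, y = factor(c("a", "b")), q = 3:4)
attr(my.data.frame$q, "label") <- "Z"       # attach the label after construction
print(vars.sketch(my.data.frame), quote = FALSE)
```

The label is attached after the data frame is built so that the sketch does not depend on whether data.frame() preserves attributes.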

3.2   Contents of a Data Frame

We often wish to view the contents of a data frame while we are constructing or interpreting an analysis. For example, say we have a data frame with 26 variables, and we are currently studying a model based on columns 11:15. The statements
> options(window="Xsgi")
> print.display(cars93[,c(11:15,1:10,16:26)],
+ X.flags="-font Courier8", width=280, title="cars93")
display the reordered columns. This example shows several other characteristics.

The width has been set so each row of the data.frame appears on one row of the output file. The font has been set very small, in an (unsuccessful) attempt to get most of the columns on the screen simultaneously. The scroll bars of the editor are needed to move around in the window.

The title argument was required in this example. The default title that appears in the title bar of the window is the name of the S object being displayed. In this example, the S object is cars93[,c(11:15,1:10,16:26)], an expression that includes punctuation characters that are interpreted by the Unix shell. The title argument replaces the default title with one that will not have such difficulties. The user can place shell characters in the title argument, but must remember to escape them with the backslash (\) character.
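
The title problem can be seen in a small sketch. It is written in R and uses R's shQuote function, which is an assumption on our part: nothing in the functions described here is stated to quote titles automatically, and safe.title is a hypothetical helper introduced only for illustration.

```r
# Hypothetical helper: take the default title from the expression that was
# passed in, or use the supplied title, and quote the result for a Unix shell.
safe.title <- function(x, title = deparse(substitute(x))) {
  shQuote(paste(title, collapse = " "), type = "sh")
}

cars <- data.frame(a = 1:2, b = 3:4, c = 5:6)
safe.title(cars[, c(3, 1)])           # default: the expression text, with [ ( , inside quotes
safe.title(cars[, c(3, 1)], "cars")   # explicit title with no shell characters
```

Quoting the whole title is an alternative to escaping each shell character by hand; a simple explicit title, as in the example above, avoids the issue altogether.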

3.3   Display of Related File

The next example takes advantage of the page.* functions, the ones normally hidden from the user. We have a data.frame constructed by entering data collected by means of a multi-page data collection form. The form is stored on the computer system in a set of files, one per page. There are occasions while studying the data when we wish to see the image of the data collection form.

We construct a variable form.page in our .Data directory that records the page number in the form from which each variable was taken. For example,
> form.page <- structure(
+     c(1,1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3),
+     names=names(my.data.frame),
+     form.dir="Study.R93.124")
We provide a function that retrieves the page for a given question number or variable name:
> form <- function(question=1, form.page.arg=form.page,
+         sep="/", ...)
+     invisible(page(file=
+     paste(attr(form.page.arg,"form.dir"),
+       form.page.arg[question], sep=sep),
+       remove.file=F, ... ))
Now when we wish to examine a particular page of the form, based on the current view of the data, we can just call it up. Here, we assume the form is stored as ASCII text:
> form(12)            # view the page with question 12
> form("cholesterol") # view the page with
>                     # the "cholesterol" question
This example also illustrates several new characteristics. Each display device and pager referenced by one of the page.* functions behaves differently. Some read the temporary file and then remove it on completion; others leave it alone by default. Some work in the background, others in the foreground by default. Some are so slow to start up that, were we to remove the temporary file ourselves, the file would be gone before the pager got to it. The remove.file flag is our answer to this difficulty. It is set to a reasonable value for each pager as part of the initialization in the page.* functions. When the pager is used to display an already existing file, we must ensure that the file is not destroyed. Therefore we explicitly override the setting with the remove.file=F argument.

In this example we assume that a given S directory will have only one study in it and, by convention, that its variable linking question numbers and pages is called form.page. These assumptions can be overridden. An alternate convention, that many studies are stored in the same directory, can be accommodated by defining the form.dir attribute to be "Study/R93.124" and using the argument sep="." in the form function. The variable linking the question numbers and pages can be specified in the call to the form function. When the form function is called without specifying a question number, we default to the first question.
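
The file names produced under the two conventions can be checked with a short sketch in R. The helper form.file is hypothetical: it repeats only the paste step of the form function and does not call a pager. The variable names and page numbers are invented for the illustration.

```r
# Page number of each variable (names and values invented for illustration).
form.page <- structure(c(1, 1, 2, 3),
                       names = c("age", "sex", "cholesterol", "weight"),
                       form.dir = "Study.R93.124")

# Hypothetical helper: build the file name that form would pass to page.
form.file <- function(question = 1, form.page.arg = form.page, sep = "/") {
  paste(attr(form.page.arg, "form.dir"), form.page.arg[question], sep = sep)
}

form.file(3)               # by question number: "Study.R93.124/2"
form.file("cholesterol")   # by variable name:   "Study.R93.124/2"
form.file()                # default, question 1: "Study.R93.124/1"

# Alternate convention: many studies share one directory.
many <- form.page
attr(many, "form.dir") <- "Study/R93.124"
form.file("weight", many, sep = ".")   # "Study/R93.124.3"
```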

This is the simplest version of the form function. Some elaborations are obvious. Typeset questionnaires could be displayed on the screen by taking advantage of the ... argument and changing the pager. The typeset questionnaires can be stored as TeX dvi files, as PostScript files, or as scanned files printed by some other technology. Examples using these are
> form("cholesterol",page="xdvi")      # TeX dvi file
> form("cholesterol",page="ghostview") # PostScript file
> form("cholesterol",page="xli")       # Scanned image
Another type of elaboration is to attach the form.page variable as an attribute to the data frame and adjust the definition of the form function accordingly.
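
One way this elaboration might look is sketched below in R. It is only an illustration under stated assumptions: form.file.df is a hypothetical name, the data values are invented, and the sketch stops at building the file name rather than calling a pager.

```r
# Invented example data; the page map is stored on the data frame itself.
study <- data.frame(age = c(50, 61), cholesterol = c(180, 220))
attr(study, "form.page") <- structure(c(1, 2),
                                      names = names(study),
                                      form.dir = "Study.R93.124")

# Hypothetical variant of the lookup: the page map is read from the data frame.
form.file.df <- function(data, question = 1, sep = "/") {
  form.page.arg <- attr(data, "form.page")
  if (is.null(form.page.arg))
    stop("data frame has no form.page attribute")
  paste(attr(form.page.arg, "form.dir"), form.page.arg[question], sep = sep)
}

form.file.df(study, "cholesterol")   # "Study.R93.124/2"
form.file.df(study)                  # "Study.R93.124/1"
```

The explicit check for a missing attribute is a precaution, since an attribute attached this way may not survive subsetting of the data frame.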

4   Availability

The programs (Heiberger and Harrell, 1993) are publicly available for download from the statlib archive at statlib@lib.stat.cmu.edu.

Acknowledgments

This cooperative work was made possible by the availability of the internet.

References

Harrell, Frank E., Jr., and Richard M. Heiberger (1993). "Display of supplementary information from data.frames," programs are available from statlib@lib.stat.cmu.edu. Send e-mail: "send vars from S".

Heiberger, Richard M., and Frank E. Harrell, Jr. (1993). "S Functions for Screen Display of S Objects," programs are available from statlib@lib.stat.cmu.edu. Send e-mail: "send print.display from S".

[1] Professor, Department of Statistics, Temple University, Philadelphia, PA 19122-2585. E-mail: rmh@astro.ocis.temple.edu

©199x American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America
Volume x, Number x, pp. xx--xx
[3] S-Plus (StatSci, Inc.) users have a built-in function page, which we suggest be renamed to page.StatSci.