Printing nested tables in R – bridging between the {reshape} and {tables} packages

This post shows how to print a prettier nested pivot table created with the {reshape} package (similar to what you would get with Microsoft Excel), either in the R terminal or as a LaTeX table. This is done by bridging between the cast_df object produced by the {reshape} package and the tabular function introduced by the new {tables} package.

Here is an example of the type of output we wish to produce in the R terminal:

       ozone       solar.r        wind         temp
 month mean  sd    mean    sd     mean   sd    mean  sd
 5     23.62 22.22 181.3   115.08 11.623 3.531 65.55 6.855
 6     29.44 18.21 190.2    92.88 10.267 3.769 79.10 6.599
 7     59.12 31.64 216.5    80.57  8.942 3.036 83.90 4.316
 8     59.96 39.68 171.9    76.83  8.794 3.226 83.97 6.585
 9     31.45 24.14 167.4    79.12 10.180 3.461 76.90 8.356

Or in a LaTeX document:

Motivation: creating pretty nested tables

In a recent post we learned how to use the {reshape} package (by Hadley Wickham) in order to aggregate and reshape data (in R) using the melt and cast functions.

The cast function is wonderful, but it has one problem – the format of the output. As opposed to a pivot table in (for example) MS Excel, the output of a nested table created by cast is very “flat”. That is, there is only one row for the header and only one column for the row names. So in both the R terminal and an Sweave document, when we deal with more complex reshaping/aggregating, the result is not something you would be proud to send to a journal.

The opportunity: the {tables} package

The good news is that Duncan Murdoch has recently released a new package to CRAN called {tables}. The {tables} package can compute and display complex tables of summary statistics and turn them into nice-looking tables in Sweave (LaTeX) documents. To use the full power of this package, you are invited to read through its detailed (and well-written) 23-page vignette. However, some of us might prefer to keep using the syntax of the {reshape} package while also benefiting from the great formatting offered by the new {tables} package. For this purpose, I devised a function that bridges between cast_df (from {reshape}) and the tabular function (from {tables}).

The bridge: between the {tables} and the {reshape} packages

The code for the function is available on my github (link: tabular.cast_df.r on github) and it seems to work fine as far as I can see (though I wouldn’t run it on larger data files, since it relies on melting a cast_df object).
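To make the “flat” versus “nested” distinction concrete, here is a minimal sketch in base R. It is not the code of tabular.cast_df. It only assumes that the flat column names follow a variable_statistic pattern (such as ozone_mean), and shows how such names could be split into the two header levels that a nested table needs. The objects stats, flat and header are hypothetical names used for illustration.

```r
# Build a flat summary table with base R only, so the sketch runs
# without {reshape}: one column per variable/statistic pair.
aq <- airquality
stats <- list(mean = function(x) mean(x, na.rm = TRUE),
              sd   = function(x) sd(x, na.rm = TRUE))
vars <- c("Ozone", "Wind")

flat <- data.frame(month = sort(unique(aq$Month)))
for (v in vars) {
  for (s in names(stats)) {
    # Column names are assumed to look like "ozone_mean", "ozone_sd", ...
    flat[[paste(tolower(v), s, sep = "_")]] <-
      as.vector(tapply(aq[[v]], aq$Month, stats[[s]]))
  }
}

# Split each "variable_statistic" name into the two header levels.
parts <- strsplit(names(flat)[-1], "_", fixed = TRUE)
header <- data.frame(variable  = vapply(parts, `[`, character(1), 1),
                     statistic = vapply(parts, `[`, character(1), 2),
                     stringsAsFactors = FALSE)
print(header)
```

The header data frame holds the information that a nested table prints as two header rows: the variable on top and the statistic underneath.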

Here is an example of how to load and use the function:

######################
# Loading the functions
######################
# Making sure we can source code from github
source("https://www.r-statistics.com/wp-content/uploads/2012/01/source_https.r.txt")
 
# Reading in the function for using tabular on a cast_df object:
source_https("https://raw.github.com/talgalili/R-code-snippets/master/tabular.cast_df.r")
 
 
 
######################
# example:
######################
 
############
# Loading and preparing some data
require(reshape)
names(airquality) <- tolower(names(airquality))
airquality2 <- airquality
airquality2$temp2 <- ifelse(airquality2$temp > median(airquality2$temp), "hot", "cold")
aqm <- melt(airquality2, id=c("month", "day","temp2"), na.rm=TRUE)
colnames(aqm)[4] <- "variable2"	# renamed because the function runs into problems otherwise, since it relies on melting the cast object
head(aqm,3)
#  month day temp2 variable2 value
#1     5   1  cold     ozone    41
#2     5   2  cold     ozone    36
#3     5   3  cold     ozone    12
 
############
# Running the example:
tabular.cast_df(cast(aqm, month ~ variable2, c(mean,sd)))
tabular(cast(aqm, month ~ variable2, c(mean,sd))) # notice how we turned tabular to be an S3 method that can deal with a cast_df object
Hmisc::latex(tabular(cast(aqm, month ~ variable2, c(mean,sd)))) # this is what we would have used for an Sweave document

And here are the results in the terminal:

>
> tabular.cast_df(cast(aqm, month ~ variable2, c(mean,sd)))
 
       ozone       solar.r        wind         temp
 month mean  sd    mean    sd     mean   sd    mean  sd
 5     23.62 22.22 181.3   115.08 11.623 3.531 65.55 6.855
 6     29.44 18.21 190.2    92.88 10.267 3.769 79.10 6.599
 7     59.12 31.64 216.5    80.57  8.942 3.036 83.90 4.316
 8     59.96 39.68 171.9    76.83  8.794 3.226 83.97 6.585
 9     31.45 24.14 167.4    79.12 10.180 3.461 76.90 8.356
> tabular(cast(aqm, month ~ variable2, c(mean,sd))) # notice how we turned tabular to be an S3 method that can deal with a cast_df object
 
       ozone       solar.r        wind         temp
 month mean  sd    mean    sd     mean   sd    mean  sd
 5     23.62 22.22 181.3   115.08 11.623 3.531 65.55 6.855
 6     29.44 18.21 190.2    92.88 10.267 3.769 79.10 6.599
 7     59.12 31.64 216.5    80.57  8.942 3.036 83.90 4.316
 8     59.96 39.68 171.9    76.83  8.794 3.226 83.97 6.585
 9     31.45 24.14 167.4    79.12 10.180 3.461 76.90 8.356

And in an Sweave document:

Here is an example for the Rnw file that produces the above table:
cast_df to tabular.Rnw

I will finish by saying that the tabular function offers more flexibility than the function I provided. If you find any bugs or have suggestions for improvement, you are invited to leave a comment here or inside the code on github.
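For readers who want to try tabular directly, here is a minimal sketch of a similar month-by-variable table written in the formula syntax of {tables}. It works on the original airquality data (with its original column names), and the helpers Mean and Sd are hypothetical names introduced here so that missing values are ignored. The call is skipped if {tables} is not installed.

```r
# Hypothetical NA-tolerant helpers; the names Mean and Sd are made up here.
Mean <- function(x) mean(x, na.rm = TRUE)
Sd   <- function(x) sd(x, na.rm = TRUE)

if (requireNamespace("tables", quietly = TRUE)) {
  library(tables)
  # Rows: one per month. Columns: each variable crossed with each statistic.
  tab <- tabular(Factor(Month) ~ (Ozone + Solar.R + Wind + Temp) * (Mean + Sd),
                 data = datasets::airquality)
  print(tab)
}
```

Writing the formula by hand is what gives the extra flexibility: rows, columns, and statistics can each be nested or combined in ways that the bridge function does not expose.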

(Link-tip goes to Tony Breyal for putting together a solution for sourcing r code from github.)

Exporting R output to MS-Word with R2wd (an example session)

UPDATE (2014-11-02): please note that this post is from 2010. These days, it is much simpler to create docx files from R using knitr+pandoc. Using pander (links: [1], [2]) can also help make the markdown output look nicer in the file.
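As a rough illustration of the newer route mentioned in the update, the sketch below writes a tiny R Markdown file and converts it to docx. It assumes the rmarkdown package (which drives knitr and pandoc) is installed and that pandoc is available. The update above does not name rmarkdown, so treat it as one possible way to do this. The file contents are made up for the example.

```r
# Write a tiny R Markdown source to a temporary file.
fence <- strrep("`", 3)  # code-chunk delimiter, built here to keep the example simple
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Example report\"",
  "---",
  "",
  "Some text, followed by R output:",
  "",
  paste0(fence, "{r}"),
  "summary(mtcars$mpg)",
  fence
), rmd)

# Render to a Word document only if the tools are available.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()
if (can_render) {
  out <- rmarkdown::render(rmd, output_format = "word_document", quiet = TRUE)
  print(out)  # path of the generated .docx file
}
```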

Creating reports is one of the basic tasks in data analysis. R provides numerous functions and packages to export its (beautiful) output and help compile it into a report.

In this post I will present one such (basic) solution for Windows users: exporting R output into Microsoft Word using the R2wd package. There are more ways and strategies for doing this, and if encouraged by comments, I will gladly write more on the subject.
* * *

R to Word using {R2wd}

The package R2wd (available through CRAN) relies on rcom. It is a wrapper that uses the statconnDCOM server to communicate with MS-Word via the COM interface.

R2wd can perform the basic tasks you would expect to need when creating a report from R. It allows you to:

  • Create a new Word file
  • Create headers and sub-headers
  • Move to a new page in the document
  • Write text
  • Insert tables (that is, “data.frame” and “matrix” objects)
  • Insert plots
  • Save and close the Word document
  • …(and more)

The current R2wd can still be seen as being in beta. Some features are not yet available, such as:

  • Choosing the text font (which means most of us will need to manually change the font in the document to “Courier New”, in order for the formatting to look good)
  • Inserting the output of complex objects (such as summary.lm, although in the example below I show how that can be achieved using a simple function)
  • Speed – inserting a table is somewhat slow, and I am not sure how it would scale to large documents
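One way to work around the complex-object limitation is to first reduce the object to a plain data.frame, which wdTable can handle. The sketch below is only an illustration of that idea. The helper lm_coef_table is a hypothetical name (it is not part of R2wd), and the wdTable call is left as a comment because it needs Windows with Word running.

```r
# Hypothetical helper (not part of R2wd): turn the coefficient table of a
# fitted lm into a plain data.frame with one row per model term.
lm_coef_table <- function(fit) {
  coefs <- coef(summary(fit))  # matrix: Estimate, Std. Error, t value, Pr(>|t|)
  out <- data.frame(term = rownames(coefs), coefs,
                    check.names = FALSE, stringsAsFactors = FALSE)
  rownames(out) <- NULL
  out
}

fit <- lm(mpg ~ wt, data = mtcars)
coef_table <- lm_coef_table(fit)
print(coef_table)

# With Word available, the table could then be inserted with:
# wdTable(format(coef_table))
```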

But from a (pleasant) correspondence with the package developer, I was assured the next release will supply us with more options and features.

The R2wd package developer, Christian Ritter, invites feedback from users. So if there are features you are missing in this package, I believe he would like to know about it (you can e-mail Christian at: christian.ritter <-at-> ridaco <-dot-> be)

Getting R2wd 1.3

The current version of R2wd is 1.1, and Christian Ritter (the package developer) says it is a “first idea” and that a more elaborate version will soon (around July) be available on CRAN. In the meantime, Christian was kind enough to send me a more recent version of the package, which (until it gets uploaded to CRAN) you are welcome to download from here:
R2wd 1.3 download link

How to use R2wd to create a report – a sample session

Being young doesn’t prevent R2wd from doing some nice things.

Here is the text from library(help=R2wd):

If Word is not already running, wdGet() opens a new Word document, otherwise, it establishes a COM handle to the instance which is already running. The functions wdTitle, wdHeader, wdBody, and wdParagraph can be used to inject text elements into Word. Moreover, bookmarks can be added via wdInsertBookmarks and wdGoToBookmark allows to navigate among the bookmarks which also exist. There is another set of convenience functions, wdSection, wdSubsection, and wdSubsubsection which insert headers of level 1, 2, or 3, start new ’Sections’ in Word, and add bookmarks.
Graphs and dataframes can be inserted into Word, by the wdPlot, wdTable commands. The wdTable command takes a dataframe or an array as arguments, creates a Word table of the appropriate dimensions and injects the content of the dataframe or array into it. It then formats the table in Word using elementary formating elements.
The functions wdApplyTheme and wdApplyTemplate allow to work with themes and templates.

Here is an example session to demonstrate some of what is said:

# install.packages("R2wd")
# library(help=R2wd)
require(R2wd)


wdGet(T)	# If no Word file is open, it will start a new one - can set it to have the file visible or not
wdNewDoc("c:\\This.doc")	# this creates a new file named "This.doc" (backslashes must be escaped in R strings)

wdApplyTemplate("c:\\This.dot")	# this applies a template


wdTitle("Examples of R2wd (a package to write Word documents from R)")	# adds a title to the file

wdSection("Example 1 - adding text", newpage = T) # This can also create a header

wdHeading(level = 2, "Header 2")
wdBody("This is the first example we will show")
wdBody("(Notice how, by using two different lines in wdBody, we got two different paragraphs)")
wdBody("(Notice how I can use this: '\\ n' (without the space), to \n go to the next line)")
wdBody("האם זה עובד בעברית ?")
wdBody("It doesn't work with Hebrew...")
wdBody("O.k, let's move to the next page (and the next example)")

wdSection("Example 2 - adding tables", newpage = T)
wdBody("Table using 'format'")
wdTable(format(head(mtcars)))
wdBody("Table without using 'format'")
wdTable(head(mtcars))


wdSection("Example 3 - adding lm summary", newpage = T)

## Example from  ?lm
ctl

Update:
Upon reading my post, Chris suggested that I also add a note here about SWORD, a tool written by Thomas Baier (the creator of the StatconnDCOM server) which allows including R code in a Sweave-like fashion in Word documents. Here is a link to the project: http://rcom.univie.ac.at