2010 | R-statistics blog

Helping the blind use R – by exporting R console to Word

Update (2016-01-30): This post is quite old (from 2010), these days it should be easier to have your R output readable by using the knitr package. It allows you to take an R script file and create an HTML output from it using the stitch_rhtml function.

You should also read the article by Jonathan R. Godfrey: Statistical Software from a Blind Person’s Perspective. And have a look at his BrailleR R package.

Preface – R seems a natural fit for the blind statistician

For blind people who wish to do statistics, R can be ideal. R command line interface offers straight forward statistical scripting in the form of question (what is the mean of x) followed by an answer (0.2). That is, instead of point-and-click dialog boxes with jumping windows of results that GUI statistical systems offer.

But there are still more hurdles to face before R can offer a perfect solution to the blind.
In this post I would like to address just one such problem – reading R console output.

Directing R console output to word – to allow blind people to easily navigate in it

Recently, a question was posed in the R-help mailing list by a guy names Faiz, a blind new user of R. Faiz wants to direct R output into word, to allow him to be able to read it. Here is what he wrote:

I would like to read the results of the commands type in the terminal window in Microsoft Word. As a blind user my options are somewhat limited and are time consuming if I want to see the results of the commands that I have type earlier. for example if my first two commands were
x<-c(1,2,3,4,5)
mean(x)
and I have typed ten more commands after the first two commands it is not easy for me to see that what was the result of mean(x)
but if I can somehow divert the results of the commands to Microsoft Word it is comparatively easy for me to see what was the result of mean(x) and what were the results of other commands. One another advantage of diverting R’s output to Microsoft Word for me is that from there they can be easily copied into assignments as well.

Faiz later elaborated more on his issue:

I am using Windows XP, and using a screen reader called JAWS. When I type something at the console, I hear once what I have typed, and then the focus is on the next line. Then if I press the up arrow key I get to hear the function I just typed, not its output. For example if I type mean(x) and then I press enter I will hear “[5]” if it is the mean of x. Then I will hear “>”. Now if I want to find out what was the mean of x by pressing the
up arrow key, I will only hear mean(x) and I will not hear [5].
My screen reader does provide options to use different cursors to read command lines.
but if I have typed median(x) sd(x) var(x) length(x) after typing mean(x), it takes a long time before I can move my cursor to the location where I can hear the mean of x. If the results of the commands can be diverted to MS Word it becomes comparatively easy for me to quickly move forward and backward in the document.
Any ideas and suggestions are appreciated.

Since recently I reviewed how one could export R output to MS-Word with R2wd, It was only fitting to try and implement R2wd for this problem.
I went looking on how to direct R console into a txt file, so I could later dump it into word. I found that two commands gave me half of what I wanted. sink() allows me to direct R output to a txt file, and savehistory() can save the command history into a txt file. But I needed something that combines the two and captures all of R console output into a file.
Failing to locate one, I turned to the R mailing list. Among the kind people trying to help (Thank you David Winsemius, Bert Gunter and Duncan Murdoch) Greg Snow came through in supplying the help (not surprisingly…).
Greg directed me to a function he wrote called txtStart() (from the TeachingDemos package), which operates in a similar way as sink(), only it also captures the R commands that where used – exactly what I was looking for!

Based on this, I devised two functions that can be used to redirect R output into word.

Here is how to use them:

# Step 1: reading the functions needed for this task, from the file I uploaded to www.r-statistics.com
source("https://www.r-statistics.com/wp-content/uploads/2010/05/R-console-to-word.r.txt")
# Example:
# Step 2 - start capturing
txtStart.2wd()	# start capturing text.  If you are missing any packages - this function will prompt you to install them
				# IF the installation fails - consider changing your mirror location
# Step 3 - run R code
	date()
	x

For me, this worked…

If you would like R to automatically run in the startup the code needed to get the two functions: txtStart.2wd and txtStop.2wd , you can run this in your R console: (once is enough)

# Start of code
Rprofile.site.loc

Bringing R to the blind: there is much more work a head!

Until this point, it didn’t cross my mind to ask how can R be used by the blind. But once this question was raised – it brings with it many more questions.
Can R be adjusted to easily be read by known aids to sight impaired people? (I am sure Linux users here will have much to add)
Can people in the community think of writing function to turn R output into a more easily read text for the blind?
For example – the summary() command is wonderful for me. But I am trying to imagine how it would look like in the “eyes” of a person who can’t see. Surly there could be some way to turn the wide summary format into a long format.
Perhaps there is room for a more general approach to the question of how to help blind people to be able to use R.
And is there a need? How many blind people choose to pursue studying statistics (or disciplines for which they would need to know statistics/R)?
I hope to read your thoughts on the matter.

On a personal note: My father was on the verge of blindness, prior to his cataract surgery. I saw first hand how the life of the sight-impaired can look like. Giving people in that situation help is a great MITZVA (a.k.a: “good deed” in Hebrew).

useR-2010 is looking for a T-shirt design

Katharine Mullen has just published on the R mailing list a call for designeRs who might be willing to design a T-shirt aRt design for the shirt that will be given in useR 2010.

I consider such contests as one of those good-for-the-community things, and hope regular useRs, R bloggers, and companies that are based on R – will consider spreading the word, participating in it (and maybe even offer more bonuses to the designers).

If you design something and put it on picasa or flickr, please tag it with “useR2010Tshirt” (and consider leaving a comment with a link to the design), so there could later be a follow up on your work. Even if you don’t “win” you will get positive “karma points” from the community 🙂 .

Here are the competition details, as published in the mailing list:
Continue reading “useR-2010 is looking for a T-shirt design”

Exporting R output to MS-Word with R2wd (an example session)

UPDATE (2014-11-02): please note that this post is from 2010. These days, it is much simpler to create docx files from R using knitr+pandoc. Using pander (links: [1], [2]) can also help make the markdown output look nicer in the file.

Creating reports is one of the basic tasks in data analysis. R provides numerous functions and packages to export it’s (beautiful) output and help compile it into a report.

In this post I will present one such (basic) solution for Windows OS users for exporting R output into Microsoft Word using the R2wd (package). There are more ways and strategies for doing this, and if encouraged by comments, I will gladly write more on the subject.
* * *

R to Word using {R2wd}

The package R2wd (available through CRAN) relies on rcom. It is a wrapper that uses the statconnDCOM server to communicate with MS-Word via the COM interface.

R2wd can perform the basic tasks you would expect to need when creating a report from R. It allows you to:

Create a new Word file
Create headers and sub-headers
Move to a new pages in the document
Write text
Insert tables (that is “data.frame” and “matrix”objects)
Insert plots
Save and close the Word document
…(and more)

The current R2wd can still be seen as being in BETA stages. Some features are not yet available, such as:

Choosing text font (which means most of us will need to manually change the font in the document to “couriers new…”, in order for the formatting to look good)
Inserting of complex object outputs (such as summery.lm, although in the example bellow I show how that can be achieved using a simple function)
Speed – the speed of inserting a table is somewhat slow, I am not sure how it would scale to large documents

But from a (pleasant) correspondence with the package developer, I was assured the next release will supply us with more options and features.

R2wd package developer, Christan Ritter, invites feedback from users. So if you have features you are missing in this packages, I believe he would like to know about it (you can e-mail Christan at: christian.ritter <-at-> ridaco <-dot-> be )

Getting R2wd 1.3

The current version of R2wd is 1.1 and Christan Ritter (the package developer), says it is a “first idea” and that a more elaborate version will soon (e.g: around July) be available on CRAN. In the meantime, Christan was so kind as to send me a more recent version of the package, which you (until it gets uploaded to CRAN), you are welcome to download from here:
R2wd 1.3 download link

How to use R2wd to create a report – a sample session

Being young doesn’t prevent from R2wd to do some nice things.

Here is the text from the library(help=R2wd) :

If Word is not already running, wdGet() opens a new Word document, otherwise, it establishes a COM handle to the instance which is already running. The functions wdTitle, wdHeader, wdBody, and wdParagraph can be used to inject text elements into Word. Moreover, bookmarks can be added via wdInsertBookmarks and wdGoToBookmark allows to navigate among the bookmarks which also exist. There is another set of convenience functions, wdSection, wdSubsection, and wdSubsubsection which insert headers of level 1, 2, or 3, start new ’Sections’ in Word, and add bookmarks.
Graphs and dataframes can be inserted intoWord, by the wdPlot, wdTable commands. The wdTable command takes a dataframe or an array as arguments, creates a Word table of the appropriate dimensions and injects the content of the dataframe or array into it. It then formats the table in Word using elementary formating elements.
The functions wdApplyTheme and wdApplyTemplate allow to work with themes and templates.

Here is an example sessions to demonstrate some of what is said:

# install.packages("R2wd")
# library(help=R2wd)
require(R2wd)


wdGet(T)	# If no word file is open, it will start a new one - can set if to have the file visiable or not
wdNewDoc("c:\This.doc")	# this creates a new file with "this.doc" name

wdApplyTemplate("c:\This.dot")	# this applies a template


wdTitle("Examples of R2wd (a package to write Word documents from R)")	# adds a title to the file

wdSection("Example 1 - adding text", newpage = T) # This can also create a header

wdHeading(level = 2, "Header 2")
wdBody("This is the first example we will show")
wdBody("(Notice how, by using two different lines in wdBody, we got two different paragraphs)")
wdBody("(Notice how I can use this: ' n' (without the space), to  n  go to the next
		line)")
wdBody("האם זה עובד בעברית ?")
wdBody("It doesn't work with Hebrew...")
wdBody("O.k, let's move to the next page (and the next example)")

wdSection("Example 2 - adding tables", newpage = T)
wdBody("Table using 'format'")
wdTable(format(head(mtcars)))
wdBody("Table without using 'format'")
wdTable(head(mtcars))


wdSection("Example 3 - adding lm summary", newpage = T)

## Example from  ?lm
ctl

Update:
Upon reading my post, Chris suggested that I’ll also add a note here about SWORD, a tool written by Thomas Baier (the creator of the StatconnDCOM server) which allows to include R-code in a Sweave-like fashion in Word documents. Here is a link to the project: http://rcom.univie.ac.at

The new GUI for ggplot2 (using Deducer) – the designer wants your opinion

After discovering that R is expected (this summer) to have a GUI for ggplot2 (through deducer), I later found Ian’s gsoc proposal for this GUI. Since the system is in it’s early stages of development, Ian has invited people to give comments, input and critique on his plans for the project.

For your convenience (and with Ian’s permission), I am reposting his proposal here. You are welcome to send him feedback by e-mailing him (at: [email protected]), or by leaving a comment here (and I will direct him to your comment).

Continue reading “The new GUI for ggplot2 (using Deducer) – the designer wants your opinion”

R is going to have a GUI to ggplot2! (by the end of this years google-summer-of-code)

I was delighted to see the following ~~e-mail~~ post from Dirk Eddelbuettel regarding the google-summer-of-code R google group:
* * *

Earlier today Google finalised student / mentor pairings and allocations for
the Google Summer of Code 2010 (GSoC 2010). The R Project is happy to
announce that the following students have been accepted:

Colin Rundel, “rgeos – an R wrapper for GEOS”, mentored by Roger Bivand of
the Norges Handelshoyskole, Norway

Ian Fellows, “A GUI for Graphics using ggplot2 and Deducer”, mentored by
Hadley Wickham of Rice University, USA

Chidambaram Annamalai, “rdx – Automatic Differentiation in R”, mentored by
John Nash of University of Ottawa, Canada

Yasuhisa Yoshida, “NoSQL interface for R”, mentored by Dirk Eddelbuettel,
Chicago, USA

Felix Schoenbrodt, “Social Relations Analyses in R”, mentored by Stefan
Schmukle, Universitaet Muenster, Germany

Details about all proposals are on the R Wiki page for the GSoC 2010 at
http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010

The R Project is honoured to have received its highest number of student
allocations yet, and looks forward to an exciting Summer of Code. Please
join me in welcoming our new students.

At this time, I would also like to thank all the other students who have
applied for working with R in this Summer of Code. With a limited number of
available slots, not all proposals can be accepted — but I hope that those
not lucky enough to have been granted a slot will continue to work with R and
towards making contributions within the R world.

I would also like to express my thanks to all other mentors who provided for
a record number of proposals. Without mentors and their project ideas we
would not have a Summer of Code — so hopefully we will see you again next
year.

Regards,

Dirk (acting as R/GSoC 2010 admin)

* * *

From all the projects, the one I am most excited about is:
Ian Fellows, “A GUI for Graphics using ggplot2 and Deducer”, mentored by Hadley Wickham of Rice University, USA

Deducer (text from the website) attempts to be a free easy to use alternative to proprietary data analysis software such as SPSS, JMP, and Minitab. It has a menu system to do common data manipulation and analysis tasks, and an excel-like spreadsheet in which to view and edit data frames. The goal of the project is to two-fold.

Provide an intuitive interface so that non-technical users can learn and perform analyses without programming getting in their way.
Increase the efficiency of expert R users when performing common tasks by replacing hundreds of keystrokes with a few mouse clicks. Also, as much as possible the GUI should not get in their way if they just want to do some programming.

Deducer is designed to be used with the Java based R console JGR, though it supports a number of other R environments (e.g. Windows RGUI and RTerm).

This combination (of Deducer and ggplot2) might finally provide the bridge to the layman-statistician that some people recently wrote to be one of R’s weak spots (while other bloogers wrote back that this is o.k., still no one refuted that R doesn’t compete with the point-and-click of softwares like SPSS or JMP.)
I came across Ian in the discussion forums, where he provided very kind help to his package “deducer”. Coupled with having Hadley as his mentor, I am very optimistic about the prospects of seeing this project reaching very high standards.
Very exciting development indeed!

Update: Ian’s proposal is available to view here.

p.s: for some intuition about how a GUI for ggplot2 can look like, have a look at this video of Jeroen Ooms’s ggplot2 web interface

How to upgrade R on windows XP – another strategy (and the R code to do it)

Update: This post has a follow-up for how to upgrade R on windows 7 explaining how to deal with permission issues.

Background – how I heard that there is more then one way to upgrade R

If you didn’t hear it by now – R 2.11.0 is out with a bunch of new features.

After Andrew Gelman recently lamented the lack of an easy upgrade process for R, a Stackoverflow thread (by JD Long) invited R users to share their strategies for easily upgrading R.

Upgrading strategy – moving to a global R library

In that thread, Dirk Eddelbuettel suggested another idea for upgrading R. His idea is of using a folder for R’s packages which is outside the standard directory tree of the installation (a different strategy then the one offered on the R FAQ).

The idea of this upgrading strategy is to save us steps in upgrading. So when you wish to upgrade R, instead of doing the following three steps:

download new R and install
copy the “library” content from the old R to the new R
upgrade all of the packages (in the library folder) to the new version of R.

You could instead just have steps 1 and 3, and skip step 2 (thus, saving us time…).

For example, under windows XP, you might have R installed on:
C:Program FilesRR-2.11.0
But (in this alternative model for upgrading) you will have your packages library on a “global library folder” (global in the sense of independent of a specific R version):
C:Program FilesRlibrary

So in order to use this strategy, you will need to do the following steps (all of them are performed in an R code provided later in the post)-

In the OLD R installation (in the first time you move to the new system of managing the upgrade):
1. Create a new global library folder (if it doesn’t exist)
2. Copy to the new “global library folder” all of your packages from the old R installation
3. After you move to this system – the steps 1 and 2 would not need to be repeated. (hence the advantage)
In the NEW R installation:
1. Create a new global library folder (if it doesn’t exist – in case this is your first R installation)
2. Premenantly point to the Global library folder whenever R starts
3. (Optional) Delete from the “Global library folder” all the packages that already exist in the local library folder of the new R install (no need to have doubles)
4. Update all packages. (notice that you picked a mirror where the packages are up-to-date, you sometimes need to choose another mirror)

Thanks to help from Dirk, David Winsemius and Uwe Ligges, I was able to write the following R code to perform all the tasks I described 🙂

So first you will need to run the following code:
Continue reading “How to upgrade R on windows XP – another strategy (and the R code to do it)”

The difference between "letters[c(1,NA)]" and "letters[c(NA,NA)]"

In David Smith’s latest blog post (which, in a sense, is a continued response to the latest public attack on R), there was a comment by Barry that caught my eye. Barry wrote:

Even I get caught out on R quirks after 20 years of using it. Compare letters[c(12,NA)] and letters[c(NA,NA)] for the most recent thing that made me bang my head against the wall.

So I did, and here’s the output:

> letters[c(12,NA)]
[1] "l" NA
>  letters[c(NA,NA)]
 [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
>

Interesting isn’t it?
I had no clue why this had happened but luckily for us, Barry gave a follow-up reply with an explanation. And here is what he wrote:
Continue reading “The difference between "letters[c(1,NA)]" and "letters[c(NA,NA)]"”

Parallel Multicore Processing with R (on Windows)

Parallel Processing backend for R under windows – installation tips and some examples.

This post offers simple example and installation tips for “doSMP” the ~~new~~ Parallel Processing backend package for R under windows.
* * *

Update:
The required packages are ~~not yet~~ now available on CRAN, but until they will get online, you can download them from here:
REvolution foreach windows bundle
(Simply unzip the folders inside your R library folder)

* * *

Recently, REvolution blog announced the release of “doSMP”, an R package which offers support for symmetric multicore processing (SMP) on Windows.
This means you can now speed up loops in R code running iterations in parallel on a multi-core or multi-processor machine, thus offering windows users what was until recently available for only Linux/Mac users through the doMC package.

Installation

For now, doSMP is not available on CRAN, so in order to get it you will need to download the REvolution R distribution “R Community 3.2” (they will ask you to supply your e-mail, but I trust REvolution won’t do anything too bad with it…)
If you already have R installed, and want to keep using it (and not the REvolution distribution, as was the case with me), you can navigate to the library folder inside the REvolution distribution it, and copy all the folders (package folders) from there to the library folder in your own R installation.

If you are using R 2.11.0, you will also need to download (and install) the revoIPC package from here:
revoIPC package – download link (required for running doSMP on windows)
(Thanks to Tao Shi for making this available!)

Usage

Once you got the folders in place, you can then load the packages and do something like this:

require(doSMP)
workers <- startWorkers(2) # My computer has 2 cores
registerDoSMP(workers)

# create a function to run in each itteration of the loop
check <-function(n) {
	for(i in 1:1000)
	{
		sme <- matrix(rnorm(100), 10,10)
		solve(sme)
	}
}


times <- 10	# times to run the loop

# comparing the running time for each loop
system.time(x <- foreach(j=1:times ) %dopar% check(j))  #  2.56 seconds  (notice that the first run would be slower, because of R's lazy loading)
system.time(for(j in 1:times ) x <- check(j))  #  4.82 seconds

# stop workers
stopWorkers(workers)

Points to notice:

You will only benefit from the parallelism if the body of the loop is performing time-consuming operations. Otherwise, R serial loops will be faster
Notice that on the first run, the foreach loop could be slow because of R's lazy loading of functions.
I am using startWorkers(2) because my computer has two cores, if your computer has more (for example 4) use more.
Lastly - if you want more examples on usage, look at the "ParallelR Lite User's Guide", included with REvolution R Community 3.2 installation in the "doc" folder

Updates

(15.5.10) :
~~The new R version (2.11.0) doesn't work with doSMP, and will return you with the following error:~~

Loading required package: revoIPC
Error: package 'revoIPC' was built for i386-pc-intel32

So far, a solution is not found, except using REvolution R distribution, or using R 2.10
A thread on the subject was started recently to report the problem. Updates will be given in case someone would come up with better solutions.

Thanks to Tao Shi, there is now a solution to the problem. You'll need to download the revoIPC package from here:
revoIPC package - download link (required for running doSMP on windows)
Install the package on your R distribution, and follow all of the other steps detailed earlier in this post. It will now work fine on R 2.11.0

Update 2: Notice that I added, in the beginning of the post, a download link to all the packages required for running parallel foreach with R 2.11.0 on windows. (That is until they will be uploaded to CRAN)

Update 3 (04.03.2011): doSMP is now officially on CRAN!

An article attacking R gets responses from the R blogosphere – some reflections

In this post I reflect on the current state of the R blogosphere, and share my hopes for the future

In this post I reflect on the current state of the R blogosphere, and share my hopes for it’s future.

* * *

Background

I am very grateful to Dr. AnnMaria De Mars for writing her post “The Next Big Thing”.
In her post, Dr. De Mars attacked R by accusing it of being “an epic fail” (in being user-friendly) and “NOT the next big thing”. Of course one should look at Dr. De Mars claims in their context. She is talking about particular aspects in which R fails (the lacking of a mature GUI for non-statisticians), and had her own (very legitimate) take on where to look for “the next big thing”. All in all, her post was decent, and worth contemplating upon respectfully (even if one, me for example, doesn’t agree with all of Dr. De Mars claims.)

R bloggers are becoming a community

But Dr. De Mars post is (very) important for a different reason. Not because her claims are true or false, but because her writing angered people who love and care for R (whether legitimately or not, it doesn’t matter). Anger, being a very powerful emotion, can reveal interesting things. In our case, it just showed that R bloggers are connected to each other.

So far there are 69 R bloggers who wrote in reply to Dr. De Mars post (some more kind then others), they are:

The inevitable R backlash by Matt Blackwell.
R is an Epic Fail? by Yihui Xie
The Next Big Thing: SAS and SPSS!…wait, what? By Drew Conway (Extra hat tip for also linking to other R bloggers who wrote about this)
R Is (Not) the Next Big Thing by Joe Dunn
A reply in Spanish
“The next big thing”, R, and Statistics in the cloud by Tal Galili (me)
R Command Line by Lee
R is an epic fail or is it just overhyped by Ajay Ohri
Statisticians and programming; languages and GUIs by Mike Smith

R and the Next Big Thing by David Smith

This is good news, since it shows that R has a community of people (not “just people”) who write about it.
In one of the posts, someone commented about how R current stage reminds him of how linux was in 1998, and how he believes R will grow to be amazingly dominant in the next 10 years.
In the same way, I feel the R blogosphere is just now starting to “wake up” and become aware that it exists. Already 6 bloggers found they can write not just about R code, but also reply to does who “attack” R (in their view). Imagine how the R blogosphere might look in a few years from now…

I would like to end with a more general note about the importance of R bloggers collaboration to the R ecosystem.

Continue reading “An article attacking R gets responses from the R blogosphere – some reflections”

"The next big thing", R, and Statistics in the cloud

A friend just e-mailed me about a blog post by Dr. AnnMaria De Mars titled “The Next Big Thing”.

In it Dr. De Mars wrote (I allowed myself to emphasize some parts of the text):

Contrary to what some people seem to think, R is definitely not the next big thing, either. I am always surprised when people ask me why I think that, because to my mind it is obvious. […]
for me personally and for most users, both individual and organizational, the much greater cost of software is the time it takes to install it, maintain it, learn it and document it. On that, R is an epic fail. It does NOT fit with the way the vast majority of people in the world use computers. The vast majority of people are NOT programmers. They are used to looking at things and clicking on things.

Here are my two cents on the subject:
Continue reading “"The next big thing", R, and Statistics in the cloud”