Tag Archives: R

Simpler R coding with pipes > the present and future of the magrittr package

Background

It has only been 7 months and a bit since my initial magrittr commit to GitHub on January 1st. It has had more success than I had anticipated, and it appears that I was not quite alone with a frustration which caused me to start the magrittr project. I am not easily frustrated with R, but after a few weeks working with F# at work, I felt it upon returning to R: I had gotten used to writing code in a different way — all nicely aligned with thought and order of execution. The forward pipe operator |> was so addictive that being unable to do something similar in R was more than mildly irritating. Reversing thought, deciphering nested function calls, and making excessive use of temporary variables almost became deal breakers! Surprisingly, I had never really noticed this before, but once I did my returning to R became a difficult crossing.

An amazing thing about R is that it is a very flexible language, and the problem could be solved. The |> operator in F# is indeed very simple: it is defined as let (|>) x f = f x. However, the usefulness of this simplicity relies heavily on a concept that is not available in R: partial application. Furthermore, functions in F# almost always adhere to certain design principles which make the simple definition sufficient. Suppose that f is a function of two arguments; then in F# you may apply f to only the first argument and obtain a new function as the result, namely a function of the second argument alone. This is partial application, and it works with any number of arguments, but application is always from left to right in the argument list. This is why the most important argument (and the one most likely to be a left-hand side object in the pipeline) is almost always the last argument, which in turn makes the simple definition of |> work. To illustrate, consider the following example:

some_value |> some_function other_value

Here, some_function is partially applied to other_value, creating a new function of a single argument, and by the simple definition of |>, this is applied to some_value.
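To make the contrast concrete, here is a minimal R sketch of the same idea. The operator name %|>% and the helpers partial() and scale_by() are hypothetical names invented for this illustration; they are not part of magrittr or base R, and the closure only emulates partial application.

```r
# A naive forward pipe, mirroring the F# definition `let (|>) x f = f x`.
# `%|>%` is a hypothetical name chosen for this sketch.
`%|>%` <- function(x, f) f(x)

# Emulate left-to-right partial application with a closure:
# fix the leading arguments now, supply the last argument later.
partial <- function(f, ...) {
  fixed <- list(...)
  function(x) do.call(f, c(fixed, list(x)))
}

# Illustrative function following the F# convention:
# the "most important" argument comes last.
scale_by <- function(factor, x) x * factor

# Equivalent to scale_by(10, c(1, 2, 3))
result <- c(1, 2, 3) %|>% partial(scale_by, 10)
print(result)
```

Most R functions take their main argument first rather than last, so a pipe defined this way would need a wrapper like partial() at nearly every step.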

It was clear to me that because R is lacking native partial application and conventions on argument order, no simple solution would be satisfactory, although definitely possible, see e.g. here or here. I wanted to make something that would feel natural in R, and which would serve the main purpose of improving cognitive performance of those writing the code, and of those reading the code.

It turned out that while I was working on magrittr’s %>% operator, Hadley Wickham and Romain Francois were implementing a similar %.% operator in their dplyr package, which they announced on January 17. However, it was not quite as flexible, and we thought that piping functionality was better placed in its own, more lightweight package. Hadley joined the magrittr project, and in dplyr 0.2 the %.% operator was deprecated; instead, %>% was imported from magrittr.
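For readers who have not yet used the operator, here is a small sketch of how %>% reads in practice. It assumes the magrittr package is installed; the input values are arbitrary and chosen only for illustration.

```r
# Assumes magrittr is installed: install.packages("magrittr")
library(magrittr)

x <- c(1, 4, 9)

# Nested call: has to be read from the inside out.
nested <- round(sum(sqrt(x)), 1)

# Piped call: reads left to right, in order of execution.
piped <- x %>% sqrt %>% sum %>% round(1)

# By default the left-hand side becomes the first argument;
# a dot places it somewhere else in the call.
placed <- 2 %>% seq(10, 20, by = .)

print(identical(nested, piped))
```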



R 3.1.1 is released (and how to quickly update it on Windows OS)

R 3.1.1 (codename “Sock it to Me”) was released today! You can get the latest binaries from here (or the .tar.gz source code from here). The full list of new features and bug fixes is provided below.

Upgrading to R 3.1.1 on Windows

If you are using Windows you can easily upgrade to the latest version of R using the installr package. Simply run the following code:

# installing/loading the latest installr package:
install.packages("installr"); require(installr) #load / install+load installr
 
updateR()

After running “updateR()”, the function will detect whether a newer version of R is available for you, and will then download and install it (etc.).
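If you want to check which version you are running before or after the upgrade, a minimal base-R sketch follows. This is not how updateR() works internally; it only compares the running version with the release discussed here, and the variable names are illustrative.

```r
# Base R only: inspect the running version before or after an upgrade.
current <- getRversion()   # a "numeric_version" object
target  <- "3.1.1"         # the release discussed in this post

print(R.version.string)

if (current < target) {
  message("Running R ", as.character(current),
          "; an upgrade to ", target, " is available.")
} else {
  message("Running R ", as.character(current),
          "; already at or beyond ", target, ".")
}
```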

Note that the latest installr version (0.15.3) was released to CRAN less than a month ago, and it is recommended to upgrade to it, since it has more up-to-date URLs for some software.
I try to keep the installr package updated and useful, so if you have any suggestions or remarks on the package – you are invited to leave a comment below.

If you use the global library system (as I do), you can run the following in the new version of R:

source("http://www.r-statistics.com/wp-content/uploads/2010/04/upgrading-R-on-windows.r.txt")
New.R.RunMe()

CHANGES IN R 3.1.1:

David Smith gave a nice summary of the features here. Here is the full list as well:

NEW FEATURES


R 3.1.0 is released!

R 3.1.0 (codename “Spring Dance”) was released today!

Photo credit: The Batsheva Dance Company in Ohad Naharin’s Hora. Photo by Gadi Dagon.

You can get the source code from
http://cran.r-project.org/src/base/R-3/R-3.1.0.tar.gz

or wait for it to be mirrored at a CRAN site nearer to you. Binaries for various platforms will appear in due course.

The full list of new features and bug fixes is provided below.

Upgrading to R 3.1.0

You can download the latest version from here.

If you are using Windows, it might take another 24 hours until you can update R. For convenience, you can upgrade to the latest version of R using the installr package. Simply run the following code:

# installing/loading the latest installr package:
install.packages("installr"); require(installr) #load / install+load installr
 
updateR()

After running “updateR()”, the function will detect whether a newer version of R is available for you, and will then download and install it (etc.).

Note that the latest installr version (0.14.0) was released to CRAN a week ago, and it is recommended to upgrade to it, since it is now more robust for various extreme cases of upgrading R.
I try to keep the installr package updated and useful, so if you have any suggestions or remarks on the package – you are invited to leave a comment below.

If you use the global library system (as I do), you can run the following in the new version of R:

source("http://www.r-statistics.com/wp-content/uploads/2010/04/upgrading-R-on-windows.r.txt")
New.R.RunMe()

CHANGES IN R 3.1.0:

NEW FEATURES


R-users.com: invite fellow R-users to Jobs, conferences, and R-projects

Dear R users,

I am happy to officially announce a new website called R-users.com. The idea of the site is that community members will invite other R users to join them in their R projects, conferences, and work places.


This site is a “job board” for R users, hosting various “calls to action” for R users, such as:

  1. Join an open-source or paid R programming project
  2. Send/give a presentation for conferences (on R, statistics, machine learning, data science, etc.)
  3. Apply to be a student/researcher in an academic institution
  4. And other “R jobs”

For example, I am the author of the R package “installr” for easily updating R on Windows. However, I would love for someone who is a Mac/Linux user to extend my package for non-Windows users. Hence, I created a new “job”, inviting help on this project, which you may see in this link.

If you also wish to post your own “R job” for other R-users to see, here is a very short presentation on how to do it:

The basic steps are:

  1. Register/login to the site (you can use your Facebook/Gmail account for one-click registration)
  2. Fill in your proposed project/job details
  3. That’s it!

I intend to promote this site on r-bloggers.com. Please help me promote it on Facebook and on your own websites, so that more of us will be able to work together.

Yours,
Tal Galili


Plotly Beta: Collaborative Plotting with R

(Guest post by Matt Sundquist on a lovely new service which is pro-actively supporting an API for R)

The Plotly R graphing library allows you to create and share interactive, publication-quality plots in your browser. Plotly is also built for working together, and makes it easy to post graphs and data publicly with a URL or privately to collaborators.

In this post, we’ll demo Plotly, make three graphs, and explain sharing. As we’re quite new and still in our beta, your help, feedback, and suggestions go a long way and are appreciated. We’re especially grateful for Tal’s help and the chance to post.

Installing Plotly

Sign-up and Install (more in documentation)

From within the R console:

install.packages("devtools")
library("devtools")

Next, install plotly (a big thanks to Hadley, who suggested the GitHub route):

devtools::install_github("plotly/R-api")
# ...
# * DONE (plotly)

Then sign-up like this or at https://plot.ly/:

> library(plotly)
> response = signup(username = 'username', email = 'youremail')
…
Thanks for signing up to plotly! 
 
Your username is: MattSundquist
 
Your temporary password is: pw. You use this to log into your plotly account at https://plot.ly/plot. Your API key is: “API_Key”. You use this to access your plotly account through the API.
 
To get started, initialize a plotly object with your username and api_key, e.g. 
>>> p <- plotly(username="MattSundquist", key="API_Key")
Then, make a graph!
>>> res <- p$plotly(c(1,2,3), c(4,2,1))

And we’re up and running! You can change and access your password and key in your homepage.

1. Overlaid Histograms:

Here is our first script.

library("plotly")
p <- plotly(username="USERNAME", key="API_Key")

# two normal samples, the second shifted by 1
x0 = rnorm(500)
x1 = rnorm(500)+1
data0 = list(x=x0,
             type='histogramx',
             opacity=0.8)
data1 = list(x=x1,
             type='histogramx',
             opacity=0.8)
layout = list(barmode='overlay')

response = p$plotly(data0, data1, kwargs=list(layout=layout))

# open the resulting graph in the browser
browseURL(response$url)

The script makes a graph. Use the RStudio viewer or add “browseURL(response$url)” to your script to avoid copy and paste routines of your URL and open the graph directly.



R 3.0.2 and RStudio 0.98 are released!

R 3.0.2 (codename “Frisbee Sailing”) was released yesterday. The full list of new features and bug fixes is provided below.

Also, RStudio v0.98 (in a “secret” preview) was announced two days ago with MANY new features, including:

Upgrading to R 3.0.2

You can download the latest version from here. Or, if you are using Windows, you can upgrade to the latest version using the installr package (also available on CRAN and github). Simply run the following code:

# installing/loading the package:
if(!require(installr)) {
  install.packages("installr"); require(installr)
} # load / install+load installr

# to_checkMD5sums = FALSE is used because of a slight bug in the MD5 file on R 3.0.2.
# This issue is already resolved in the installr version on github, and will be
# released to CRAN in about a month from now.
updateR(to_checkMD5sums = FALSE)

I try to keep the installr package updated and useful. If you have any suggestions or remarks on the package, you’re invited to leave a comment below.

If you use the global library system (as I do), you can run the following in the new version of R:

source("http://www.r-statistics.com/wp-content/uploads/2010/04/upgrading-R-on-windows.r.txt")
New.R.RunMe()

p.s: you can also use the installr package to quickly install the new RStudio by using:

# installing/loading the package:
if(!require(installr)) {
  install.packages("installr"); require(installr)
} # load / install+load installr
 
install.RStudio()


A speed test comparison of plyr, data.table, and dplyr


Guest post by Jake Russ

For a recent project I needed to make a simple sum calculation on a rather large data frame (0.8 GB, 4+ million rows, and ~80,000 groups). As an avid user of Hadley Wickham’s packages, my first thought was to use plyr. However, the job took plyr roughly 13 hours to complete.

plyr is extremely efficient and user friendly for most problems, so it was clear to me that I was using it for something it wasn’t meant to do, but I didn’t know of any alternative screwdrivers to use.

I asked for some help on the manipulatr Google group, and their feedback led me to data.table and dplyr, a new, and still in progress, package project by Hadley.

What follows is a speed comparison of these three packages incorporating all the feedback from the manipulatr folks. They found it informative, so Tal asked me to write it up as a reproducible example.
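To give a feel for the kind of task being timed, here is a minimal sketch of a grouped sum on simulated data. It is not the benchmark from the full post: the data, the sizes, and the column names group and value are made up for illustration, the baseline uses base R's rowsum(), and the data.table variant runs only if that package is installed.

```r
set.seed(42)
n_rows   <- 1e5   # far smaller than the 4+ million rows in the post
n_groups <- 1e3

# Simulated data: a grouping column and a numeric column to sum.
df <- data.frame(
  group = sample(n_groups, n_rows, replace = TRUE),
  value = runif(n_rows)
)

# Base R baseline: rowsum() returns one row of sums per group.
base_time <- system.time(
  base_sums <- rowsum(df$value, df$group)
)
print(base_time)

# data.table variant, run only when the package is available.
if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)
  dt <- as.data.table(df)
  dt_time <- system.time(
    dt_sums <- dt[, list(total = sum(value)), by = group]
  )
  print(dt_time)

  # Both approaches should agree on every group total.
  stopifnot(isTRUE(all.equal(
    as.numeric(base_sums[as.character(dt_sums$group), 1]),
    dt_sums$total
  )))
}
```

Timings from a single system.time() call are noisy, so a real comparison would repeat each run several times.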



Analyzing Your Data on the AWS Cloud (with R)

Guest post by Jonathan Rosenblatt

Disclaimer:
This post is not intended to be a comprehensive review, but more of a “getting started guide”. If I did not mention an important tool or package I apologize, and invite readers to contribute in the comments.

Introduction

I have recently had the delight to participate in a “Brain Hackathon” organized as part of the OHBM2013 conference. Being supported by Amazon, the hackathon participants were provided with Amazon credit in order to promote the analysis using Amazon’s Web Services (AWS). We badly needed this computing power, as we had 14*10^9 p-values to compute in order to localize genetic associations in the brain, leading to Figure 1.

Figure 1: Brain volumes significantly associated with genotype.

While imaging genetics is an interesting research topic, and the hackathon was a great idea by itself, it is the AWS I wish to present in this post. Starting with the conclusion: 

Storing your data and analyzing it on the cloud, be it AWS, Azure, Rackspace or others, is a quantum leap in analysis capabilities. I fell in love with my new cloud powers and I strongly recommend all statisticians and data scientists get friendly with these services. I will also note that if statisticians do not embrace these new-found powers, we should not be surprised if data analysis becomes synonymous with Machine Learning and not with Statistics (if you have no idea what I am talking about, read this excellent post by Larry Wasserman).

As motivation for analysis in the cloud consider:

  1. The ability to do your analysis from any device, be it a PC, tablet or even smartphone.
  2. The ability to instantaneously augment your CPU and memory to any imaginable configuration just by clicking a menu. Then scaling down to save costs once you are done.
  3. The ability to instantaneously switch between operating systems and system configurations.
  4. The ability to launch hundreds of machines creating your own cluster, parallelizing your massive job, and then shutting it down once done.
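As a small-scale illustration of the parallelization idea in the last item, here is a sketch that computes many p-values across local worker processes with the parallel package that ships with R. The simulated data, the one-sample t-test, and the choice of two workers are assumptions made for this example; it is not the hackathon's actual analysis.

```r
library(parallel)  # ships with R

set.seed(1)
n_tests <- 200
# Simulated data: one sample of 30 observations per test.
samples <- replicate(n_tests, rnorm(30), simplify = FALSE)

# One p-value per sample: a one-sample t-test against a mean of zero.
p_value <- function(x) t.test(x)$p.value

# Start two local workers; on a larger machine this number could be
# raised towards detectCores().
cl <- makeCluster(2)
p_values <- unlist(parLapply(cl, samples, p_value))
stopCluster(cl)

print(summary(p_values))
```

The same pattern scales from two local workers to many cores on a single large instance; spreading the work over several machines needs additional cluster setup that is not shown here.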

Here is a quick FAQ before going into the setup stages.

FAQ

Q: How does R fit in?
