The R Journal, Vol.2 Issue 2 is out

The second issue of the second volume of The R Journal is now available.

Download complete issue

Refereed articles may be downloaded individually using the links below. [Bibliography of refereed articles]

Table of Contents

Editorial

Contributed Research Articles

Solving Differential Equations in R
Karline Soetaert, Thomas Petzoldt and R. Woodrow Setzer

Source References
Duncan Murdoch

hglm: A Package for Fitting Hierarchical Generalized Linear Models
Lars Rönnegård, Xia Shen and Moudud Alam

dclone: Data Cloning in R
Péter Sólymos

stringr: modern, consistent string processing
Hadley Wickham

Bayesian Estimation of the GARCH(1,1) Model with Student-t Innovations
David Ardia and Lennart F. Hoogerheide

cudaBayesreg: Bayesian Computation in CUDA
Adelino Ferreira da Silva

binGroup: A Package for Group Testing
Christopher R. Bilder, Boan Zhang, Frank Schaarschmidt and Joshua M. Tebbs

The RecordLinkage Package: Detecting Errors in Data
Murat Sariyar and Andreas Borg

spikeslab: Prediction and Variable Selection Using Spike and Slab Regression
Hemant Ishwaran, Udaya B. Kogalur and J. Sunil Rao

From the Core

What’s New?

News and Notes

useR! 2010
Forthcoming Events: useR! 2011
Changes in R
Changes on CRAN
News from the Bioconductor Project
R Foundation News

New edition of “R Companion to Applied Regression” – by John Fox and Sandy Weisberg

Just two hours ago, Professor John Fox announced on the R-help mailing list a new (second) edition of his book “An R and S-PLUS Companion to Applied Regression”, now titled “An R Companion to Applied Regression, Second Edition”.

John Fox is (very) well known in the R community for his many contributions to R, including the car package (which anyone interested in performing type II and III SS repeated measures ANOVA in R is sure to come across), the Rcmdr package (one of the two major GUIs for R, the other being Deducer), sem (for structural equation models) and more. This is why I consider the release of a new edition of his book to be big news for the R community.

In this new edition, Professor Fox has teamed up with Professor Sandy Weisberg to refresh the original edition so that it covers the developments of the (nearly) 10 years since the first edition was written.

Here is what John Fox had to say:

Dear all,

Sandy Weisberg and I would like to announce the publication of the second
edition of An R Companion to Applied Regression (Sage, 2011).

As is immediately clear, the book now has two authors and S-PLUS is gone
from the title (and the book). The R Companion has also been thoroughly
rewritten, covering developments in the nearly 10 years since the first
edition was written and expanding coverage of topics such as R graphics and
R programming. As before, however, the R Companion provides a general
introduction to R in the context of applied regression analysis, broadly
construed. It is available from the publisher at (US) or (UK), and from Amazon (see here)

The book is augmented by a web site with data sets, appendices on a variety of topics, and more, and it is associated with the car package on CRAN, which has recently undergone an overhaul.

Regards,
John and Sandy


R GUI now offers interactive graphics – Deducer 0.4-2 connects with iplots

Earlier today, Ian Fellows announced the release of Deducer 0.4-2 and DeducerExtras 1.2 to CRAN (I copy his announcement here):

Deducer 0.4-2 contains a few bug fixes, and an interface to the iplots package. With the new iplots interface it is now possible to do interactive plots with Deducer. An introductory example screencast (by Ian) is available on YouTube.

DeducerExtras 1.2 contains a few new dialogs including ‘load data from package’, and ‘t-test power’.

Additionally, a new Windows R/JGR/Deducer installer is available which installs R-2.12.0, JGR with its launcher, Deducer, DeducerExtras, and DeducerPlugInScaling. It is available on the Deducer website:

http://www.deducer.org/pmwiki/pmwiki.php?n=Main.WindowsInstallation

WP-CodeBox: A better R syntax highlighter plugin for WordPress

Today I was informed of (what I believe is) a better WordPress plugin for R syntax highlighting, called WP-CodeBox.  This plugin doesn’t require any hacks to make it work (as opposed to the WP-Syntax plugin, which I wrote about in the past).  WP-CodeBox can be downloaded and installed on a WordPress site by searching for it in the “Add New” section of the plugins menu.

WP-CodeBox provides some nice features (some AJAX based) to the display of the code in the post:

  1. The code box in the post can now be folded (top right of the code box), so long code can be hidden and does not clutter the post.
  2. The code box gets another button (top left of the code box) that lets the reader view the code in a new window, which makes it easy to copy and paste the code.
  3. The plugin's options allow automatic line numbering of the code, control over tab width, and some other features.

p.s: Lastly, my thanks go to Guangchuang Yu, whose comment on my original post, and whose post on WP-CodeBox and R, introduced me to this better plugin.

p.p.s: in case you blog on WordPress.com, there is also a solution for R syntax highlighting for WordPress.com bloggers.

A competition to recommend “relevant” R packages – and the future of R

Update: the competition was just launched.
* * *

What is the competition about?

Drew Conway and John Myles White have collected data from (52) R users about the packages they have installed. The data is now available on GitHub for download and the contest will be run on the Kaggle platform.

For more details, head over to dataists.

And for fun, here is the dependency graph for R packages they have assembled so far:

A graphical visualization of packages’ “suggestion” relationships. Affectionately referred to as the R Flying Spaghetti Monster. More info below.
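For readers curious how such “suggestion” edges might be extracted, here is a minimal sketch using only base R. It reads the Suggests field of the packages installed locally, which is an assumption made for illustration and not necessarily how Drew and John assembled their graph. The helper name parse_field is hypothetical.

```r
# Matrix describing every package installed in the current library paths.
pkgs <- installed.packages()
installed <- rownames(pkgs)
suggests <- pkgs[, "Suggests"]

# Hypothetical helper: turn one comma-separated "Suggests" field into
# a character vector of package names, dropping version constraints.
parse_field <- function(field) {
  if (is.na(field)) return(character(0))
  parts <- strsplit(field, ",")[[1]]
  parts <- gsub("\\s+", "", parts)       # remove spaces and line breaks
  parts <- sub("\\(.*\\)", "", parts)    # remove e.g. "(>=1.0)"
  parts[nzchar(parts)]
}

# One row per "package A suggests package B" relationship.
edge_list <- lapply(seq_along(suggests), function(i) {
  to <- parse_field(suggests[[i]])
  data.frame(from = rep(installed[i], length(to)), to = to,
             stringsAsFactors = FALSE)
})
edges <- do.call(rbind, edge_list)

head(edges)
```

The resulting two-column edge list could then be handed to a graph package for plotting; which one to use is left open here.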

A tiny bit more on R bloggers virality


A new version of ff released (version 2.2.0)

A few hours ago, Jens Oehlschlägel announced on the R-help mailing list the release of a new version of the ff package.

The ff package provides data structures that are stored on disk but behave (almost) as if they were in RAM by transparently mapping only a section (pagesize) in main memory – the effective virtual memory consumption per ff object.
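To give a feel for what this looks like in practice, here is a minimal sketch. It assumes the ff package (version 2.2.0 or later, so that the sorting functions are present) is installed from CRAN; the vector size and variable names are only illustrative, and nothing here is a benchmark.

```r
# Minimal sketch; assumes the ff package (>= 2.2.0) is installed from CRAN.
library(ff)

set.seed(1)

# Create a disk-backed double vector from ordinary in-RAM data.
# The values live in a file; only a window of it is mapped into memory.
x <- ff(rnorm(1e5))

# Path of the file that backs the vector on disk.
filename(x)

# Subscripting works much like a regular vector and returns RAM objects.
head_values <- x[1:5]

# Sort the on-disk vector; the result is again an ff vector.
s <- ffsort(x)

# Compute the ordering permutation (an ff integer vector).
o <- fforder(x)

# Pull the permutation into RAM with o[] and use it as a subscript.
reordered <- x[o[]]
```

Here `s[]` and `reordered` should hold the same sorted values, one obtained by sorting directly and the other by applying the ordering.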

Here are the new features of ff, as Jens wrote in his announcement:

----
Dear R community,

The next release of package ff is available on CRAN. With kind help of Brian Ripley it now supports the Win64 and Sun versions of R. It has three major functional enhancements:

a) new fast in-memory sorting and ordering functions (single-threaded)
b) ff now supports on-disk sorting and ordering of ff vectors and ffdf dataframes
c) ff integer vectors now can be used as subscripts of ff vectors and ffdf dataframes

a) is achieved by careful implementation of NA-handling and exploiting context information
b) although permanently stored, sorting and ordering of ff objects can be faster than the standard routines in R
c) applying an order to ff vectors and ffdf dataframes is substantially slower than in pure R because it involves disk-access AND sorting index positions (to avoid random access).

There is still room for improvement, however, the current status should already be useful. I run some comparisons with SAS (see end of mail):
- both could sort German census size (81e6 rows) on a 3GB notebook
- ff sorts and orders faster on single columns
- sorting big multicolumn-tables is faster in SAS


Managing a statistical analysis project – guidelines and best practices

In the past two years, a growing community of R users (and statisticians in general) have been participating in two major Question-and-Answer websites:

  1. The R tag page on Stackoverflow, and
  2. Stat over flow (which will soon move to a new domain, no worries, I’ll write about it once it happens)

In that time, several long (and fascinating) discussion threads were started, reflecting on tips and best practices for managing a statistical analysis project.  They are:

In the last thread in the list, the user chl has started trying to compile all the tips and suggestions together.  With his permission, I am now republishing it here.  I encourage you to contribute from your own experience (either in the comments, or by answering in any of the threads I’ve linked to).


R syntax highlighting for bloggers on WordPress.com

Good news for R bloggers who are using WordPress.com to host their blog.

This week, the good people running WordPress.com (special thanks go to Yoav Farhi) added the ability for all users of the WordPress.com platform to highlight their R code inside posts.

Basically you’ll need to wrap the code in your post like this:

[sourcecode language="r"]
test.function = function(r) {
    return(pi * r^2)
}
test.function(1)
[/sourcecode]

(Which will then look like this: [image: R syntax highlighted code example])

Further details (and other supported languages) can be read about on this WordPress.com support page.

This new feature was made possible thanks to the work of Yihui Xie (who created the famous animation package for R), who created an R syntax brush for the SyntaxHighlighter WordPress plugin (the plugin used by WordPress.com for syntax highlighting). Thanks should also go to Andrew Redd, the creator of NppToR (which connects Notepad++ to R). He both made some good suggestions and was game to take on the brush creation in case there were problems, which thankfully so far there haven't been.

p.s: If you are a WordPress.org user (i.e., you have a self-hosted WordPress blog) and want to enable R syntax highlighting for your blog, I would recommend the use of the WP-Syntax plugin (enhanced with GeSHi version 1.0.8.6) which can be downloaded here.