Tag Archives: statistics

Generation of E-Learning Exams in R for Moodle, OLAT, etc.

(Guest post by Achim Zeileis)
Development of the R package exams for automatic generation of (statistical) exams in R started in 2006 and version 1 was published in JSS by Grün and Zeileis (2009). It was based on standalone Sweave exercises, that can be combined into exams, and then rendered into different kinds of PDF output (exams, solutions, self-study materials, etc.). Now, a major revision of the package has been released that extends the capabilities and adds support for learning management systems. It is still based on the same type of
Sweave files for each exercise but can also render them into output formats like HTML (with various options for displaying mathematical content) and XML specifications for online exams in learning management systems such as Moodle or OLAT. Supplementary files such as graphics or data are
handled automatically. Here, I give a brief overview of the new capabilities. A detailed discussion is in the working paper by Zeileis, Umlauf, and Leisch (2012) that is also contained in the package as a vignette.
Continue reading Generation of E-Learning Exams in R for Moodle, OLAT, etc.

Diagram for a Bernoulli process (using R)

A Bernoulli process is a sequence of Bernoulli trials (the realization of n binary random variables), taking two values (0/1, Heads/Tails, Boy/Girl, etc…). It is often used in teaching introductory probability/statistics classes about the binomial distribution.

When visualizing a Bernoulli process, it is common to use a binary tree diagram in order to show the progression of the process, as well as the various consequences of the trial. We might also include the number of “successes”, and the probability for reaching a specific terminal node.

I wanted to be able to create such a diagram using R. For this purpose I composed some code which uses the {diagram} R package. The final function should allow one to create different sizes of diagrams, while allowing flexibility with regards to the text which is used in the tree.

Here is an example of the simplest use of the function:

source("http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game_.r.txt") # loading the function
binary.tree.for.binomial.game(2) # creating a tree for B(2,0.5)

The resulting diagram will look like this:

The same can be done for creating larger trees. For example, here is the code for a 4 stage Bernoulli process:

source("http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game_.r.txt") # loading the function
binary.tree.for.binomial.game(4) # creating a tree for B(4,0.5)

The resulting diagram will look like this:

The function can also be tweaked in order to describe a more specific story. For example, the following code describes a 3 stage Bernoulli process where an unfair coin is tossed 3 times (with probability of it giving heads being 0.8):

source("http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game_.r.txt") # loading the function
 
binary.tree.for.binomial.game(3, 0.8, first_box_text = c("Tossing an unfair coin", "(3 times)"), left_branch_text = c("Failure", "Playing again"), right_branch_text = c("Success", "Playing again"),
    left_leaf_text = c("Failure", "Game ends"), right_leaf_text = c("Success",
        "Game ends"), cex = 0.8, rescale_radx = 1.2, rescale_rady = 1.2,
    box_color = "lightgrey", shadow_color = "darkgrey", left_arrow_text = c("Tails n(P = 0.2)"),
    right_arrow_text = c("Heads n(P = 0.8)"), distance_from_arrow = 0.04)

The resulting diagram is:

If you make up neat examples of using the code (or happen to find a bug), or for any other reason – you are welcome to leave a comment.

(note: the images above are licensed under CC BY-SA)

Article about plyr published in JSS, and the citation was added to the new plyr (version 1.5)

The plyr package (by Hadley Wickham) is one of the few R packages for which I can claim to have used for all of my statistical projects. So whenever a new version of plyr comes out I tend to be excited about it (as was when version 1.2 came out with support for parallel processing)

So it is no surprise that the new release of plyr 1.5 got me curious. While going through the news file with the new features and bug fixes, I noticed how (quietly) Hadley has also released (6 days ago) another version of plyr prior to 1.5 which was numbered 1.4.1. That version included only one more function, but a very important one – a new citation reference for when using the plyr package. Here is how to use it:

install.packages("plyr") # so to upgrade to the latest release
citation("plyr")

The output gives both a simple text version as well as a BibTeX entry for LaTeX users. Here it is (notice the download link for yourself to read):

To cite plyr in publications use:
Hadley Wickham (2011). The Split-Apply-Combine Strategy for Data
Analysis. Journal of Statistical Software, 40(1), 1-29. URL
http://www.jstatsoft.org/v40/i01/.

I hope to see more R contributers and users will make use of the ?citation() function in the future.

Book review: 25 Recipes for Getting Started with R

Recently I was asked by O’Reilly publishing to give a book review for Paul Teetor new introductory book to R.  After giving the book some attention and appreciating it’s delivery of the material, I was happy to write and post this review.  Also, I’m very happy to see how a major publishing house like O’Reilly is producing more and more R books, great news indeed.

And now for the book review:

Executive summary: a book that offers a well designed gentle introduction for people with some background in statistics wishing to learn how to get common (basic) tasks done with R.

Information

By: Paul Teetor
Publisher:O’Reilly
MediaReleased: January 2011
Pages: 58 (est.)

Format

The book “25 Recipes for Getting Started with R” offers an interesting take on how to bring R to the general (statistically oriented) public.

Continue reading Book review: 25 Recipes for Getting Started with R

The R Journal, Vol.2 Issue 2 is out

The second issue of the second volume of The R Journal is now available .

Download complete issue

Refereed articles may be downloaded individually using the links below. [Bibliography of refereed articles]

Table of Contents

Editorial 3

Contributed Research Articles

Solving Differential Equations in R
Karline Soetaert, Thomas Petzoldt and R. Woodrow Setzer
5
Source References
Duncan Murdoch
16
hglm: A Package for Fitting Hierarchical Generalized Linear Models
Lars Rönnegård, Xia Shen and Moudud Alam
20
dclone: Data Cloning in R
Péter Sólymos
29
stringr: modern, consistent string processing
Hadley Wickham
38
Bayesian Estimation of the GARCH(1,1) Model with Student-t Innovations
David Ardia and Lennart F. Hoogerheide
41
cudaBayesreg: Bayesian Computation in CUDA
Adelino Ferreira da Silva
48
binGroup: A Package for Group Testing
Christopher R. Bilder, Boan Zhang, Frank Schaarschmidt and Joshua M. Tebbs
56
The RecordLinkage Package: Detecting Errors in Data
Murat Sariyar and Andreas Borg
61
spikeslab: Prediction and Variable Selection Using Spike and Slab Regression
Hemant Ishwaran, Udaya B. Kogalur and J. Sunil Rao
68

From the Core

What’s New? 74

News and Notes

useR! 2010 77
Forthcoming Events: useR! 2011 79
Changes in R 81
Changes on CRAN 90
News from the Bioconductor Project 101
R Foundation News 102

New edition of “R Companion to Applied Regression” – by John Fox and Sandy Weisberg

Just two hours ago, Professor John Fox has announced on the R-help mailing list of a new (second) edition to his book “An R and S Plus Companion to Applied Regression”, now title . “An R Companion to Applied Regression, Second Edition”.

John Fox is (very) well known in the R community for many contributions to R, including the car package (which any one who is interested in performing SS type II and III repeated measures anova in R, is sure to come by), the Rcmdr pacakge (one of the two major GUI’s for R, the second one is Deducer), sem (for Structural Equation Models) and more.  These might explain why I think having him release a new edition for his book to be big news for the R community of users.

In this new edition, Professor Fox has teamed with Professor Sandy Weisberg, to refresh the original edition so to cover the development gained in the (nearly) 10 years since the first edition was written.

Here is what John Fox had to say:

Dear all,

Sandy Weisberg and I would like to announce the publication of the second
edition of An R Companion to Applied Regression (Sage, 2011).

As is immediately clear, the book now has two authors and S-PLUS is gone
from the title (and the book). The R Companion has also been thoroughly
rewritten, covering developments in the nearly 10 years since the first
edition was written and expanding coverage of topics such as R graphics and
R programming. As before, however, the R Companion provides a general
introduction to R in the context of applied regression analysis, broadly
construed. It is available from the publisher at (US) or (UK), and from Amazon (see here)

The book is augmented by a web site with data sets, appendices on a variety of topics, and more, and it associated with the car package on CRAN, which has recently undergone an overhaul.

Regards,
John and Sandy

Continue reading New edition of “R Companion to Applied Regression” – by John Fox and Sandy Weisberg

Tips for the R beginner (a 5 page overview)

In this post I publish a PDF document titled “A collection of tips for R in Finance”.
It is a basic 5 page introduction to R in finances by Arnaud Amsellem (linked in profile).

The article offers tips related to the following points:

  • Code Editor
  • Organizing R code
  • Update packages
  • Getting external data into R
  • Communicating with external applications
  • Optimizing R code

This article is well articulated, and offers a perspective of someone who is experienced in the field and touches points that I can imagine beginners might otherwise overlook. I hope publishing it here will be of use to some readers out there.

Update: as some readers have noted to me (by e-mail, and by commenting), this document touches very lightly on the topic of “finances” in R. I therefore decided to update the title from “R in finance – some tips for beginners”, to it’s current form.

Lastly: if you (a reader of this blog) feel you have an article (“post”) to contribute, but don’t feel like starting your own blog, feel welcome to contact me, and I’ll be glad to post what you have to say on my blog (and subsequently, also on R bloggers).

Here is the article:
Continue reading Tips for the R beginner (a 5 page overview)

ggplot2 plot builder is now on CRAN! (through Deducer 0.4 GUI for R)

Ian fellows, a hard working contributer to the R community (and a cool guy), has announced today the release of Deducer (0.4) to CRAN (scheduled to update in the next day or so).
This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality.

Following is the e-mail he sent out with all the details and demo videos.

Continue reading ggplot2 plot builder is now on CRAN! (through Deducer 0.4 GUI for R)