Posts Tagged ‘tutorial’

data.frame objects in R (via “R in Action”)

Posted in R on December 18th, 2011 by Tal Galili – 4 Comments

The followings introductory post is intended for new users of R.  It deals with R data frames: what they are, and how to create, view, and update them.

This is a guest article by Dr. Robert I. Kabacoff, the founder of (one of) the first online R tutorials websites: Quick-R.  Kabacoff has recently published the book ”R in Action“, providing a detailed walk-through for the R language based on various examples for illustrating R’s features (data manipulation, statistical methods, graphics, and so on…)

For readers of this blog, there is a 38% discount off the “R in Action” book (as well as all other eBooks, pBooks and MEAPs at Manning publishing house), simply by using the code rblogg38 when reaching checkout.

Let us now talk about data frames:
read more »

Book review: 25 Recipes for Getting Started with R

Posted in R, statistics on February 24th, 2011 by Tal Galili – 1 Comment

Recently I was asked by O’Reilly publishing to give a book review for Paul Teetor new introductory book to R.  After giving the book some attention and appreciating it’s delivery of the material, I was happy to write and post this review.  Also, I’m very happy to see how a major publishing house like O’Reilly is producing more and more R books, great news indeed.

And now for the book review:

Executive summary: a book that offers a well designed gentle introduction for people with some background in statistics wishing to learn how to get common (basic) tasks done with R.

Information

By: Paul Teetor
Publisher:O’Reilly
MediaReleased: January 2011
Pages: 58 (est.)

Format

The book “25 Recipes for Getting Started with R” offers an interesting take on how to bring R to the general (statistically oriented) public.

read more »

Tips for the R beginner (a 5 page overview)

Posted in R, statistics on August 23rd, 2010 by Tal Galili – 4 Comments

In this post I publish a PDF document titled “A collection of tips for R in Finance”.
It is a basic 5 page introduction to R in finances by Arnaud Amsellem (linked in profile).

The article offers tips related to the following points:

  • Code Editor
  • Organizing R code
  • Update packages
  • Getting external data into R
  • Communicating with external applications
  • Optimizing R code

This article is well articulated, and offers a perspective of someone who is experienced in the field and touches points that I can imagine beginners might otherwise overlook. I hope publishing it here will be of use to some readers out there.

Update: as some readers have noted to me (by e-mail, and by commenting), this document touches very lightly on the topic of “finances” in R. I therefore decided to update the title from “R in finance – some tips for beginners”, to it’s current form.

Lastly: if you (a reader of this blog) feel you have an article (“post”) to contribute, but don’t feel like starting your own blog, feel welcome to contact me, and I’ll be glad to post what you have to say on my blog (and subsequently, also on R bloggers).

Here is the article:
read more »

Rose plot using Deducers ggplot2 plot builder

Posted in R, visualization on August 16th, 2010 by Tal Galili – Be the first to comment

The (excellent!) LearnR blog had a post today about making a rose plot in
ggplot2.

Following today’s announcement, by Ian Fellows, regarding the release of the new version of Deducer (0.4) offering a strong support for ggplot2 using a GUI plot builder, Ian also sent an e-mail where he shows how to create a rose plot using the new ggplot2 GUI included in the latest version of Deducer. After the template is made, the plot can be generated with 4 clicks of the mouse.

Here is a video tutorial (Ian published) to show how this can be used:

The generated template file is available at:
http://neolab.stat.ucla.edu/cranstats/rose.ggtmpl

I am excited about the work Ian is doing, and hope to see more people publish use cases with Deducer.

ggplot2 plot builder is now on CRAN! (through Deducer 0.4 GUI for R)

Posted in R, statistics, visualization on August 16th, 2010 by Tal Galili – 9 Comments

Ian fellows, a hard working contributer to the R community (and a cool guy), has announced today the release of Deducer (0.4) to CRAN (scheduled to update in the next day or so).
This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality.

Following is the e-mail he sent out with all the details and demo videos.

read more »

Repeated measures ANOVA with R (functions and tutorials)

Posted in R, R links, statistics on April 13th, 2010 by Tal Galili – 10 Comments

Repeated measures ANOVA is a common task for the data analyst.

There are (at least) two ways of performing “repeated measures ANOVA” using R but none is really trivial, and each way has it’s own complication/pitfalls (explanation/solution to which I was usually able to find through searching in the R-help mailing list).

So for future reference, I am starting this page to document links I find to tutorials, explanations (and troubleshooting) of “repeated measure ANOVA” done with R

Functions and packages

(I suggest using the tutorials supplied bellow for how to use these functions)

  • aov {stats} – offers SS type I repeated measures anova, by a call to lm for each stratum. A short example is given in the ?aov help file
  • Anova {car} – Calculates type-II or type-III analysis-of-variance tables for model objects produced by lm, and for various other object. The ?Anova help file offers an example for how to use this for repeated measures
  • ezANOVA {ez} – This function provides easy analysis of data from factorial experiments, including purely within-Ss designs (a.k.a. “repeated measures”), purely between-Ss designs, and mixed within-and-between-Ss designs, yielding ANOVA results and assumption checks. It is a wrapper of the Anova {car} function, and is easier to use. The ez package also offers the functions ezPlot and ezStats to give plot and statistics of the ANOVA analysis. The ?ezANOVA help file gives a good demonstration for the functions use (My thanks goes to Matthew Finkbe for letting me know about this cool package)
  • friedman.test {stats} – Performs a Friedman rank sum test with unreplicated blocked data. That is, a non-parametric one-way repeated measures anova. I also wrote a wrapper function to perform and plot a post-hoc analysis on the friedman test results
  • Non parametric multi way repeated measures anova – I believe such a function could be developed based on the Proportional Odds Model, maybe using the {repolr} or the {ordinal} packages. But I still didn’t come across any function that implements these models (if you do – please let me know in the comments).
  • Repeated measures, non-parametric, multivariate analysis of variance – as far as I know, such a method is not currently available in R.  There is, however, the Analysis of similarities (ANOSIM) analysis which provides a way to test statistically whether there is a significantdifference between two or more groups of sampling units.  Is is available in the {vegan} package through the “anosim” function.  There is also a tutorial and a relevant published paper.

Good Tutorials

Troubelshooting

Unbalanced design
Unbalanced design doesn’t work when doing repeated measures ANOVA with aov, it just doesn’t. This situation occurs if there are missing values in the data or that the data is not from a fully balanced design. The way this will show up in your output is that you will see the between subject section showing withing subject variables.

A solution for this might be to use the Anova function from library car with parameter type=”III”. But before doing that, first make sure you understand the difference between SS type I, II and III. Here is a good tutorial for helping you out with that.
By the way, these links are also useful in case you want to do a simple two way ANOVA for unbalanced design

I will “later” add R-help mailing list discussions that I found helpful on the subject.

If you come across good resources, please let me know about them in the comments.

Correlation scatter-plot matrix for ordered-categorical data

Posted in R, statistics, visualization on April 7th, 2010 by Tal Galili – 9 Comments

When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire item’s (for example: two ordered categorical vectors ranging from 1 to 5).

When dealing with several such Likert variable’s, a clear presentation of all the pairwise relation’s between our variable can be achieved by inspecting the (Spearman) correlation matrix (easily achieved in R by using the “cor.test” command on a matrix of variables).
Yet, a challenge appears once we wish to plot this correlation matrix. The challenge stems from the fact that the classic presentation for a correlation matrix is a scatter plot matrix – but scatter plots don’t (usually) work well for ordered categorical vectors since the dots on the scatter plot often overlap each other.

There are four solution for the point-overlap problem that I know of:

  1. Jitter the data a bit to give a sense of the “density” of the points
  2. Use a color spectrum to represent when a point actually represent “many points”
  3. Use different points sizes to represent when there are “many points” in the location of that point
  4. Add a LOWESS (or LOESS) line to the scatter plot – to show the trend of the data

In this post I will offer the code for the  a solution that uses solution 3-4 (and possibly 2, please read this post comments). Here is the output (click to see a larger image):

And here is the code to produce this plot:

read more »

A nice link: “Some hints for the R beginner”

Posted in R, R links on March 7th, 2010 by Tal Galili – Be the first to comment

Patrick Burns just posted to the mailing list the following massage:

There is now a document called “Some hints for the R beginner” whose purpose is to get people up and running with R as quickly as possible.

Direct access to it is:
http://www.burns-stat.com/pages/Tutor/hints_R_begin.html

JRR Tolkien wrote a story (sans hobbits) called ‘Leaf by Niggle’ that has always resonated with me. I offer you an imperfect, incomplete tree (but my roof is intact).

Suggestions for improvements are encouraged.

And here is the link tree for the document (for your easy reviewing of the offered content) :

read more »

Web Development with R – an HD video tutorial of Jeroen Ooms talk

Posted in R, R and the web, visualization on February 3rd, 2010 by Tal Galili – 2 Comments

Here is a HD version of a video tutorial on web development with R, a lecture that was given by Jeroen Ooms (the guy who made A web application for R’s ggplot2). This talk was given at the Bay Area UseR Group meeting on R-Powered Web Apps.

You can also view the slides for his talk and view (great) examples for: stockplotlme4, and gpplot2.

Thanks again to Jeroen for sharing his knowledge and experience!