<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>R-statistics blog &#187; tutorial</title>
	<atom:link href="http://www.r-statistics.com/tag/tutorial/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.r-statistics.com</link>
	<description>Writing about statistics with R, and open source stuff (software, data, community)</description>
	<lastBuildDate>Mon, 30 Jan 2012 07:45:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>data.frame objects in R (via &#8220;R in Action&#8221;)</title>
		<link>http://www.r-statistics.com/2011/12/data-frame-objects-in-r-via-r-in-action/</link>
		<comments>http://www.r-statistics.com/2011/12/data-frame-objects-in-r-via-r-in-action/#comments</comments>
		<pubDate>Sun, 18 Dec 2011 22:02:04 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[data frames]]></category>
		<category><![CDATA[data.frame]]></category>
		<category><![CDATA[R book]]></category>
		<category><![CDATA[R classes]]></category>
		<category><![CDATA[R in action]]></category>
		<category><![CDATA[R objects]]></category>
		<category><![CDATA[R tutorial]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=865</guid>
		<description><![CDATA[The following introductory post is intended for new users of R.  It deals with R data frames: what they are, and how to create, view, and update them. This is a guest article by Dr. Robert I. Kabacoff, the founder of one of the first online R tutorial websites: Quick-R.  Kabacoff has recently published the book &#8220;R [...]]]></description>
			<content:encoded><![CDATA[<p><strong>The following introductory post is intended for new users of R.  It deals with R data frames: what they are, and how to create, view, and update them.</strong></p>
<p>This is a guest article by Dr. <a href="http://www.statmethods.net/about/author.html">Robert I. Kabacoff</a>, the founder of one of the first online R tutorial websites: <a href="http://www.statmethods.net/interface/index.html">Quick-R</a>.  Kabacoff has recently published the book &#8220;<strong><a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&#038;url=21">R in Action</a></strong>&#8221;, a detailed walk-through of the R language that uses a range of examples to illustrate R’s features (data manipulation, statistical methods, graphics, and so on).</p>
<p><a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&#038;url=21"><img src="http://www.r-statistics.com/wp-content/uploads/2011/12/kabacoff_cover150.jpg" alt="" title="R in Action cover image" width="150" height="188" class="alignleft size-full wp-image-874" /></a></p>
<p>For readers of this blog, there is a <strong>38% discount</strong> off <a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&#038;url=21">the &#8220;R in Action&#8221; book</a> (as well as all other eBooks, pBooks and MEAPs at <a href="http://affiliate.manning.com/idevaffiliate.php?id=1205">Manning publishing house</a>), simply by using the code <em><strong>rblogg38</strong></em> when reaching checkout.</p>
<p>Let us now talk about data frames:</p>
<p><span id="more-865"></span></p>
<h3>Data Frames</h3>
<p>A data frame is more general than a matrix in that different columns can contain different modes of data (numeric, character, and so on). It’s similar to the datasets you’d typically see in SAS, SPSS, and Stata. Data frames are the most common data structure you’ll deal with in R.</p>
<p>The patient dataset in Table 1 consists of numeric and character data.</p>
<p><em>Table 1: A patient dataset</em></p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<th valign="top" width="67">PatientID</th>
<th valign="top" width="78">AdmDate</th>
<th valign="top" width="42">Age</th>
<th valign="top" width="60">Diabetes</th>
<th valign="top" width="66">Status</th>
</tr>
<tr>
<td valign="top" width="67">1</td>
<td valign="top" width="78">10/15/2009</td>
<td valign="top" width="42">25</td>
<td valign="top" width="60">Type1</td>
<td valign="top" width="66">Poor</td>
</tr>
<tr>
<td valign="top" width="67">2</td>
<td valign="top" width="78">11/01/2009</td>
<td valign="top" width="42">34</td>
<td valign="top" width="60">Type2</td>
<td valign="top" width="66">Improved</td>
</tr>
<tr>
<td valign="top" width="67">3</td>
<td valign="top" width="78">10/21/2009</td>
<td valign="top" width="42">28</td>
<td valign="top" width="60">Type1</td>
<td valign="top" width="66">Excellent</td>
</tr>
<tr>
<td valign="top" width="67">4</td>
<td valign="top" width="78">10/28/2009</td>
<td valign="top" width="42">52</td>
<td valign="top" width="60">Type1</td>
<td valign="top" width="66">Poor</td>
</tr>
</tbody>
</table>
<p>Because there are multiple modes of data, you can’t store this dataset in a matrix. In this case, a data frame is the structure of choice.</p>
<p>A data frame is created with the data.frame() function:</p>

<div class="wp_codebox"><table><tr id="p86512"><td class="code" id="p865code12"><pre class="rsplus" style="font-family:monospace;">mydata <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>col1, col2, col3,…<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>where <em>col1, col2, col3,</em> … are column vectors of any type (such as character, numeric, or logical). Names for each column can be provided with the names() function.</p>
<p>The following listing makes this clear.</p>
<p><strong>Listing 1 Creating a data frame</strong></p>

<div class="wp_codebox"><table><tr id="p86513"><td class="code" id="p865code13"><pre class="rsplus" style="font-family:monospace;"><span style="color: #080;">&gt;</span> patientID <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>, <span style="color: #ff0000;">2</span>, <span style="color: #ff0000;">3</span>, <span style="color: #ff0000;">4</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&gt;</span> age <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">25</span>, <span style="color: #ff0000;">34</span>, <span style="color: #ff0000;">28</span>, <span style="color: #ff0000;">52</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&gt;</span> diabetes <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Type1&quot;</span>, <span style="color: #ff0000;">&quot;Type2&quot;</span>, <span style="color: #ff0000;">&quot;Type1&quot;</span>, <span style="color: #ff0000;">&quot;Type1&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&gt;</span> status <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Poor&quot;</span>, <span style="color: #ff0000;">&quot;Improved&quot;</span>, <span style="color: #ff0000;">&quot;Excellent&quot;</span>, <span style="color: #ff0000;">&quot;Poor&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&gt;</span> patientdata <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>patientID, age, diabetes, status<span style="color: #080;">&#41;</span>
<span style="color: #080;">&gt;</span> patientdata
  patientID age diabetes    status
<span style="color: #ff0000;">1</span>         <span style="color: #ff0000;">1</span>  <span style="color: #ff0000;">25</span>    Type1      Poor
<span style="color: #ff0000;">2</span>         <span style="color: #ff0000;">2</span>  <span style="color: #ff0000;">34</span>    Type2  Improved
<span style="color: #ff0000;">3</span>         <span style="color: #ff0000;">3</span>  <span style="color: #ff0000;">28</span>    Type1 Excellent
<span style="color: #ff0000;">4</span>         <span style="color: #ff0000;">4</span>  <span style="color: #ff0000;">52</span>    Type1      Poor</pre></td></tr></table></div>

<p>Each column must have only one mode, but you can put columns of different modes together to form the data frame. Because data frames are close to what analysts typically think of as datasets, we’ll use the terms columns and variables interchangeably when discussing data frames.</p>
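<p>To check the one-mode-per-column rule on a concrete object, a short sketch along the following lines may help. It is not part of the book excerpt: it rebuilds the data frame from Listing 1 so that it runs on its own, and the names renamed and ageYears are made up purely for illustration. Note that the class reported for diabetes and status depends on your R version, because before R 4.0.0 data.frame() converted character vectors to factors by default.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Rebuild the vectors from Listing 1 so this sketch is self-contained
patientID &lt;- c(1, 2, 3, 4)
age &lt;- c(25, 34, 28, 52)
diabetes &lt;- c("Type1", "Type2", "Type1", "Type1")
status &lt;- c("Poor", "Improved", "Excellent", "Poor")
patientdata &lt;- data.frame(patientID, age, diabetes, status)

# One class per column; the data frame as a whole mixes them
sapply(patientdata, class)

# Compact overview of dimensions, column names, and column types
str(patientdata)

# names() both reads and sets the column names
names(patientdata)
renamed &lt;- patientdata              # work on a copy (hypothetical object name)
names(renamed)[2] &lt;- "ageYears"     # hypothetical new column name
names(renamed)</pre></td></tr></table></div>

<p>Renaming a copy keeps patientdata itself unchanged, so the listings that follow still apply as written.</p>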
<p>There are several ways to identify the elements of a data frame. You can use the subscript notation or you can specify column names. Using the patientdata data frame created earlier, the following listing demonstrates these approaches.</p>
<p><strong>Listing 2 Specifying elements of a data frame</strong></p>

<div class="wp_codebox"><table><tr id="p86514"><td class="code" id="p865code14"><pre class="rsplus" style="font-family:monospace;"><span style="color: #080;">&gt;</span> patientdata<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>
  patientID age
<span style="color: #ff0000;">1</span>         <span style="color: #ff0000;">1</span>  <span style="color: #ff0000;">25</span>
<span style="color: #ff0000;">2</span>         <span style="color: #ff0000;">2</span>  <span style="color: #ff0000;">34</span>
<span style="color: #ff0000;">3</span>         <span style="color: #ff0000;">3</span>  <span style="color: #ff0000;">28</span>
<span style="color: #ff0000;">4</span>         <span style="color: #ff0000;">4</span>  <span style="color: #ff0000;">52</span>
<span style="color: #080;">&gt;</span> patientdata<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;diabetes&quot;</span>, <span style="color: #ff0000;">&quot;status&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>
  diabetes    status
<span style="color: #ff0000;">1</span>    Type1      Poor
<span style="color: #ff0000;">2</span>    Type2  Improved
<span style="color: #ff0000;">3</span>    Type1 Excellent
<span style="color: #ff0000;">4</span>    Type1      Poor
<span style="color: #080;">&gt;</span> patientdata$age    <span style="color: #228B22;">#age variable in the patient data frame</span>
<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span> <span style="color: #ff0000;">25</span> <span style="color: #ff0000;">34</span> <span style="color: #ff0000;">28</span> <span style="color: #ff0000;">52</span></pre></td></tr></table></div>

<p>The $ notation in the third example is used to indicate a particular variable from a given data frame. For example, if you want to cross-tabulate diabetes type by status, you could use the following code:</p>

<div class="wp_codebox"><table><tr id="p86515"><td class="code" id="p865code15"><pre class="rsplus" style="font-family:monospace;"><span style="color: #080;">&gt;</span> <span style="color: #0000FF; font-weight: bold;">table</span><span style="color: #080;">&#40;</span>patientdata$diabetes, patientdata$status<span style="color: #080;">&#41;</span>
&nbsp;
        Excellent Improved Poor
  Type1         <span style="color: #ff0000;">1</span>        <span style="color: #ff0000;">0</span>    <span style="color: #ff0000;">2</span>
  Type2         <span style="color: #ff0000;">0</span>        <span style="color: #ff0000;">1</span>    <span style="color: #ff0000;">0</span></pre></td></tr></table></div>
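<p>Beyond selecting whole columns, the subscript notation also accepts a row subscript before a comma. The following is a minimal sketch of that form, not taken from the book; it recreates the patient data inline so that it can be run by itself.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Same values as Listing 1, built in a single call
patientdata &lt;- data.frame(
  patientID = c(1, 2, 3, 4),
  age       = c(25, 34, 28, 52),
  diabetes  = c("Type1", "Type2", "Type1", "Type1"),
  status    = c("Poor", "Improved", "Excellent", "Poor")
)

patientdata[2, ]                        # second row, all columns
patientdata[patientdata$age &gt; 30, ]     # rows where age exceeds 30
patientdata[, "age"]                    # one column, returned as a vector
patientdata[["age"]]                    # the same vector, via double brackets
patientdata["age"]                      # single brackets keep a one-column data frame</pre></td></tr></table></div>

<p>The difference between the last three lines matters in practice: some functions expect a plain vector, while others expect a data frame.</p>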

<p>It can get tiresome typing patientdata$ at the beginning of every variable name, so shortcuts are available. You can use either the attach() and detach() or with() functions to simplify your code.</p>
<h3>attach, detach, and with</h3>
<p>The attach() function adds the data frame to the R search path. When a variable name is encountered, data frames in the search path are checked in order to locate the variable. Using a sample (mtcars) data frame, you could use the following code to obtain summary statistics for automobile mileage (mpg), and plot this variable against engine displacement (disp), and weight (wt):</p>

<div class="wp_codebox"><table><tr id="p86516"><td class="code" id="p865code16"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">summary</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span>$mpg<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span>$mpg, <span style="color: #CC9900; font-weight: bold;">mtcars</span>$disp<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span>$mpg, <span style="color: #CC9900; font-weight: bold;">mtcars</span>$wt<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>This could also be written as</p>

<div class="wp_codebox"><table><tr id="p86517"><td class="code" id="p865code17"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">attach</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">summary</span><span style="color: #080;">&#40;</span>mpg<span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span>mpg, disp<span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span>mpg, wt<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">detach</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>The detach() function removes the data frame from the search path. Note that detach() does nothing to the data frame itself. The statement is optional but is good programming practice and should be included routinely.</p>
<p>The limitations with this approach are evident when more than one object can have the same name. Consider the following code:</p>

<div class="wp_codebox"><table><tr id="p86518"><td class="code" id="p865code18"><pre class="rsplus" style="font-family:monospace;"><span style="color: #080;">&gt;</span> mpg <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">25</span>, <span style="color: #ff0000;">36</span>, <span style="color: #ff0000;">47</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&gt;</span> <span style="color: #0000FF; font-weight: bold;">attach</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#41;</span>
&nbsp;
The following object<span style="color: #080;">&#40;</span>s<span style="color: #080;">&#41;</span> are masked _by_ ‘.<span style="">GlobalEnv</span>’<span style="color: #080;">:</span> mpg
<span style="color: #080;">&gt;</span> <span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span>mpg, wt<span style="color: #080;">&#41;</span>
Error <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #0000FF; font-weight: bold;">xy.<span style="">coords</span></span><span style="color: #080;">&#40;</span>x, y, xlabel, ylabel, <span style="color: #0000FF; font-weight: bold;">log</span><span style="color: #080;">&#41;</span> <span style="color: #080;">:</span>
  ‘x’ and ‘y’ lengths differ
<span style="color: #080;">&gt;</span> mpg
<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span> <span style="color: #ff0000;">25</span> <span style="color: #ff0000;">36</span> <span style="color: #ff0000;">47</span></pre></td></tr></table></div>

<p>Here we already have an object named mpg in our environment when the mtcars data frame is attached. In such cases, the original object takes precedence, which isn’t what you want. The plot statement fails because mpg has 3 elements and wt has 32 elements. The attach() and detach() functions are best used when you’re analyzing a single data frame and you’re unlikely to have multiple objects with the same name. In any case, be vigilant for warnings that say that objects are being masked.</p>
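<p>One way to spot such conflicts before attaching is to compare the names in your workspace with the column names of the data frame. The snippet below is only a sketch of that idea; conflicts_found is a hypothetical variable name, and the fallback simply uses explicit $ references.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">mpg &lt;- c(25, 36, 47)   # an unrelated object that happens to share a column name

# Names in the current workspace that are also columns of mtcars
conflicts_found &lt;- intersect(ls(), names(mtcars))
conflicts_found

if (length(conflicts_found) == 0) {
  # Nothing would be masked, so attaching is reasonably safe
  attach(mtcars)
  plot(mpg, wt)
  detach(mtcars)
} else {
  # Fall back to explicit references, which cannot be masked
  plot(mtcars$mpg, mtcars$wt)
}</pre></td></tr></table></div>

<p>The check only covers objects in the workspace at the moment it runs, so it complements the masking warnings and does not replace them.</p>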
<p>An alternative approach is to use the with() function. You could write the previous example as</p>

<div class="wp_codebox"><table><tr id="p86519"><td class="code" id="p865code19"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">with</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span>, <span style="color: #080;">&#123;</span>
  <span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">summary</span><span style="color: #080;">&#40;</span>mpg<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>    <span style="color: #228B22;">#print() is needed because results inside {} are not auto-printed</span>
  <span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span>mpg, disp<span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span>mpg, wt<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>In this case, the statements within the {} brackets are evaluated with reference to the mtcars data frame. You don’t have to worry about name conflicts here. If there’s only one statement (for example, summary(mpg)), the {} brackets are optional.</p>
<p>The limitation of the with() function is that assignments will only exist within the function brackets. Consider the following:</p>

<div class="wp_codebox"><table><tr id="p86520"><td class="code" id="p865code20"><pre class="rsplus" style="font-family:monospace;"><span style="color: #080;">&gt;</span> <span style="color: #0000FF; font-weight: bold;">with</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span>, <span style="color: #080;">&#123;</span>
   stats <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">summary</span><span style="color: #080;">&#40;</span>mpg<span style="color: #080;">&#41;</span>
   stats
  <span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span>
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  <span style="color: #ff0000;">10.40</span>   <span style="color: #ff0000;">15.43</span>   <span style="color: #ff0000;">19.20</span>   <span style="color: #ff0000;">20.09</span>   <span style="color: #ff0000;">22.80</span>   <span style="color: #ff0000;">33.90</span>
<span style="color: #080;">&gt;</span> stats
Error<span style="color: #080;">:</span> object ‘stats’ not found</pre></td></tr></table></div>

<p>If you need to create objects that will exist outside of the with() construct, use the special assignment operator &lt;&lt;- instead of the standard one (&lt;-). When with() is called from the top level, as it is here, this saves the object to the global environment, outside of the with() call. This can be demonstrated with the following code:</p>

<div class="wp_codebox"><table><tr id="p86521"><td class="code" id="p865code21"><pre class="rsplus" style="font-family:monospace;"><span style="color: #080;">&gt;</span> <span style="color: #0000FF; font-weight: bold;">with</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span>, <span style="color: #080;">&#123;</span>
   nokeepstats <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">summary</span><span style="color: #080;">&#40;</span>mpg<span style="color: #080;">&#41;</span>
   keepstats <span style="color: #080;">&lt;&lt;-</span> <span style="color: #0000FF; font-weight: bold;">summary</span><span style="color: #080;">&#40;</span>mpg<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&gt;</span> nokeepstats
Error<span style="color: #080;">:</span> object ‘nokeepstats’ not found
<span style="color: #080;">&gt;</span> keepstats
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  <span style="color: #ff0000;">10.40</span>   <span style="color: #ff0000;">15.43</span>   <span style="color: #ff0000;">19.20</span>   <span style="color: #ff0000;">20.09</span>   <span style="color: #ff0000;">22.80</span>   <span style="color: #ff0000;">33.90</span></pre></td></tr></table></div>

<p>Most books on R recommend using with() over attach(). I think that ultimately the choice is a matter of preference and should be based on what you’re trying to achieve and your understanding of the implications.</p>
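<p>If you do settle on with(), it is worth knowing that it returns the value of the last expression it evaluates, so an ordinary assignment on the outside is often enough and &lt;&lt;- can be avoided. A minimal sketch, in which mpg_range, lo, and hi are illustrative names only:</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Assign the value returned by with() using the standard operator
keepstats &lt;- with(mtcars, summary(mpg))
keepstats

# The same idea with several statements inside braces
mpg_range &lt;- with(mtcars, {
  lo &lt;- min(mpg)     # lo and hi exist only while with() is being evaluated
  hi &lt;- max(mpg)
  c(lo, hi)          # last expression: this is the value with() returns
})
mpg_range</pre></td></tr></table></div>

<p>Written this way, the temporary objects stay local to the with() call and only the result you name is kept.</p>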
<h3>Case identifiers</h3>
<p>In the patient data example, patientID is used to identify individuals in the dataset. In R, case identifiers can be specified with the row.names option of the data.frame() function. For example, the statement</p>

<div class="wp_codebox"><table><tr id="p86522"><td class="code" id="p865code22"><pre class="rsplus" style="font-family:monospace;">patientdata <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>patientID, age, diabetes, status,
   <span style="color: #0000FF; font-weight: bold;">row.<span style="">names</span></span><span style="color: #080;">=</span>patientID<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>specifies patientID as the variable to use in labeling cases on various printouts and graphs produced by R.</p>
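<p>Once row names are set, they can also be used to look up individual cases with the bracket notation. The following small sketch assumes the vectors from Listing 1 and repeats them so that it runs on its own:</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">patientID &lt;- c(1, 2, 3, 4)
age &lt;- c(25, 34, 28, 52)
diabetes &lt;- c("Type1", "Type2", "Type1", "Type1")
status &lt;- c("Poor", "Improved", "Excellent", "Poor")
patientdata &lt;- data.frame(patientID, age, diabetes, status,
   row.names = patientID)

rownames(patientdata)          # row names are returned as character strings
patientdata["3", ]             # look up a case by its row name
patientdata["3", "status"]     # a single value for that case</pre></td></tr></table></div>

<p>Here the identifiers happen to coincide with the row positions. With identifiers that are not consecutive integers, the quoted form is what distinguishes a lookup by name from a lookup by position.</p>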
<h3>Summary</h3>
<p>One of the most challenging tasks in data analysis is data preparation. R provides various structures for holding data and many methods for importing data from both keyboard and external sources. One of those structures is data frames, which we covered here. Your ability to specify elements of these structures via the bracket notation is particularly important in selecting, subsetting, and transforming data.</p>
<p>R offers a wealth of functions for accessing external data. This includes data from flat files, web files, statistical packages, spreadsheets, and databases. Note that you can also export data from R into these external formats. We showed you how to use either the attach() and detach() or with() functions to simplify your code.</p>
<p><em>This article first appeared as section 2.2.4 of the &#8220;<a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&#038;url=21">R in Action</a>&#8221; book, and is published with permission from <a href="http://affiliate.manning.com/idevaffiliate.php?id=1205">Manning publishing house</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2011/12/data-frame-objects-in-r-via-r-in-action/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Book review: 25 Recipes for Getting Started with R</title>
		<link>http://www.r-statistics.com/2011/02/book-review-25-recipes-for-getting-started-with-r/</link>
		<comments>http://www.r-statistics.com/2011/02/book-review-25-recipes-for-getting-started-with-r/#comments</comments>
		<pubDate>Thu, 24 Feb 2011 13:45:29 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[book review]]></category>
		<category><![CDATA[introduction]]></category>
		<category><![CDATA[O’Reilly]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=646</guid>
		<description><![CDATA[Recently I was asked by O&#8217;Reilly publishing to give a book review for Paul Teetor&#8217;s new introductory book on R.  After giving the book some attention and appreciating its delivery of the material, I was happy to write and post this review.  Also, I&#8217;m very happy to see how a major publishing house like O&#8217;Reilly is producing more and [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I was asked by O&#8217;Reilly publishing to give a book review for Paul Teetor&#8217;s new introductory book on R.  After giving the book some attention and appreciating its delivery of the material, I was happy to write and post this review.  Also, I&#8217;m very happy to see how a major publishing house like O&#8217;Reilly is producing more and more R books, great news indeed.</p>
<p>And now for the book review:</p>
<p><strong>Executive summary:</strong> a book that offers a well-designed, gentle introduction for people with some background in statistics who wish to learn how to get common (basic) tasks done with R.</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/02/Getting-started-with-R-Book-cover.gif"><img class="size-full wp-image-648 alignright" title="Getting started with R - Book cover" src="http://www.r-statistics.com/wp-content/uploads/2011/02/Getting-started-with-R-Book-cover.gif" alt="" width="180" height="236" /></a></p>
<h3>Information</h3>
<p>By: Paul Teetor<br />
Publisher: O&#8217;Reilly Media<br />
Released: January 2011<br />
Pages: 58 (est.)</p>
<h3>Format</h3>
<p>The book &#8220;25 Recipes for Getting Started with R&#8221; offers an interesting take on how to bring R to the general (statistically oriented) public.</p>
<p><span id="more-646"></span></p>
<p>Instead of teaching R (or topics in statistics) in a systematic way, the author chose to assemble a likely set of cheat-sheet-like how-to tasks (&#8220;R recipes&#8221;) that a new user of R is assumed to encounter in their first steps of using R.  Tasks like: Installing R, finding help, reading data, selecting data, basic summary statistics, plotting some graphs, loading packages, and performing/diagnosing OLS regression.</p>
<p>These recipes were taken from the &#8220;R Cookbook&#8221; (O’Reilly) which contains over 200 such recipes.</p>
<p>Each of the 25 &#8220;R recipes&#8221; consists of four sections:</p>
<ul>
<li><strong>Problem</strong> &#8211; stating in one sentence the task we wish to accomplish.</li>
<li><strong>Solution</strong> &#8211; a direct solution to the problem, presented in very few paragraphs (ranging from one paragraph up to a page).</li>
<li><strong>Discussion</strong> &#8211; an extension of the solution, offering several pages of variations and common pitfalls.</li>
<li><strong>See also</strong> &#8211; references for further information (not always present).</li>
</ul>
<p>The book is modest in its presumptions of scope (which I appreciate) and tries only to offer a bird&#8217;s eye view for statistically oriented, first-time (short on time) users who want to feel they can get &#8220;something&#8221; done using R.</p>
<h3>Audience</h3>
<p>I can imagine a first-year student (or an IT professional with some stats background) benefiting from such a book if they have learned their stats with another package (like Stata, <a title="SAS news" href="http://sas-x.com/">SAS</a>, SPSS and so on).</p>
<p>The book&#8217;s scope is both an advantage and a disadvantage, depending on the target audience.  I would find it surprising if experienced R users had much (or anything) to gain from it, and it cannot serve as a reference.  This might be different with the extended &#8220;R Cookbook&#8221; (which I hope to get my hands on at one point or another, since I enjoyed the author&#8217;s writing).</p>
<p>Lastly, I should mention that someone who is already well versed in SAS or SPSS would probably prefer Robert Muenchen&#8217;s superb book &#8220;<a href="http://www.springer.com/statistics/computanional+statistics/book/978-0-387-09417-5">R for SAS and SPSS Users</a>&#8221; in order to make the transition to R smoother.</p>
<h3>Content outline (with some notes)</h3>
<p>I added some notes to the chapter names.  I&#8217;d like to state again that my general impression of the book is good.  The points I make are mostly subtle; they are meant as a guide in case you give the book to a friend as a gift and wish to point out some things that the book does not mention.</p>
<p>The book&#8217;s content includes:</p>
<ul>
<li>Downloading and Installing R</li>
<li>Getting Help on a Function</li>
<li>Viewing the Supplied Documentation</li>
<li>Searching the Web for Help &#8211; credit goes to the author for mentioning <a href="http://stats.stackexchange.com/">stats.stackexchange.com</a> and stackoverflow.com, while highlighting the use of <a href="http://stackoverflow.com/questions/tagged/r">the R tag on stackoverflow</a>, although I wish he had mentioned <a title="R news from blogs" href="http://www.r-bloggers.com/">R-bloggers</a> (edit: after corresponding with the author, he wrote to me that: <em>F.Y.I., I do mention R-bloggers in the full R Cookbook. The 25 Recipes book cannot contain as much useful information. In the Cookbook, I recommend that readers follow R-bloggers as a way to keep up with developments in the R community.</em>).</li>
<li>Reading Tabular Datafiles &#8211; the author makes proper distinctions in how to manage factors vs characters.</li>
<li>Reading from CSV Files</li>
<li>Creating a Vector</li>
<li>Computing Basic Statistics &#8211; the author gives proper room for handling missing values.</li>
<li>Initializing a Data Frame from Column Data</li>
<li>Selecting Data Frame Columns by Position</li>
<li>Selecting Data Frame Columns by Name</li>
<li>Forming a Confidence Interval for a Mean</li>
<li>Forming a Confidence Interval for a Proportion</li>
<li>Comparing the Means of Two Samples</li>
<li>Testing a Correlation for Significance</li>
<li>Creating a Scatter Plot &#8211; I wish more attention had been given to lattice (which is mentioned twice in the book) and ggplot2 (in the &#8220;see also&#8221;, the discussion or the preface).  The same could be said about many other procedures, but I think graphics in R are a special case: it should be clear to the reader that R packages can extend its statistical procedures, but the reader may not notice that there are R packages that extend its graphical capabilities as well.</li>
<li>Creating a Bar Chart</li>
<li>Creating a Box Plot</li>
<li>Creating a Histogram</li>
<li>Performing Simple Linear Regression</li>
<li>Performing Multiple Linear Regression &#8211; there might have been room to mention the existence of &#8220;I&#8221; (for example: y~x+I(x^2)) and interactions (&#8220;*&#8221;).</li>
<li>Getting Regression Statistics</li>
<li>Diagnosing a Linear Regression &#8211; this section includes the command outlier.test, which comes from the car package (and is not in base R).  It would probably have been clearer if the author had directed the reader to the section on using packages in the &#8220;see also&#8221;, instead of only talking about install.packages (which wasn&#8217;t the place for it, IMHO).</li>
<li>Predicting New Values &#8211; I would have highlighted the importance of keeping the same column names in the new data.frame, since failing to do so results in a (quite common) failure of the function.</li>
<li>Accessing the Functions in a Package &#8211; I think this section should have been referenced more often, and that the installation of new packages could have been covered here.</li>
</ul>
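<p>A minimal sketch of the two regression notes above, on simulated data.  The names used here (dat, x, z, y, new_dat, bad_dat) are made up for illustration and do not come from the book.  It shows &#8220;I&#8221; and &#8220;*&#8221; inside a formula, and why the new data.frame given to predict should reuse the predictor names.</p>

```r
set.seed(1)                          # simulated data, for illustration only
dat <- data.frame(x = runif(50, 0, 10), z = rnorm(50))
dat$y <- 1 + 2 * dat$x - 0.3 * dat$x^2 + 0.5 * dat$z + rnorm(50)

# I() protects arithmetic inside a formula, so x^2 really means "x squared"
fit_quad <- lm(y ~ x + I(x^2) + z, data = dat)

# "*" expands to both main effects plus their interaction (x + z + x:z)
fit_int <- lm(y ~ x * z, data = dat)

# predict() looks up predictors by name, so new data must use the same names
new_dat <- data.frame(x = c(2, 5), z = c(0, 1))
pred <- predict(fit_quad, newdata = new_dat)

# With a misnamed column (X instead of x) predict() fails here, because
# no object called x can be found; the error is caught for demonstration
bad_dat <- data.frame(X = c(2, 5), z = c(0, 1))
bad <- tryCatch(predict(fit_quad, newdata = bad_dat),
                error = function(e) "error")
```

<p>If an object named x happened to exist in the workspace, the last call might not fail but quietly use that object instead, which is arguably worse.</p>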
<p>* * *</p>
<p>If you have had a look at the book, I&#8217;d be very curious to read your thoughts about it in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2011/02/book-review-25-recipes-for-getting-started-with-r/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Tips for the R beginner (a 5 page overview)</title>
		<link>http://www.r-statistics.com/2010/08/tips-for-the-r-beginner-5-pages-overview/</link>
		<comments>http://www.r-statistics.com/2010/08/tips-for-the-r-beginner-5-pages-overview/#comments</comments>
		<pubDate>Mon, 23 Aug 2010 20:24:34 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[begginers]]></category>
		<category><![CDATA[Finance]]></category>
		<category><![CDATA[Finances]]></category>
		<category><![CDATA[tips]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[tutorials]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=521</guid>
		<description><![CDATA[In this post I publish a PDF document titled &#8220;A collection of tips for R in Finance&#8221;. It is a basic 5 page introduction to R in finances by Arnaud Amsellem (linked in profile). The article offers tips related to the following points: Code Editor Organizing R code Update packages Getting external data into R [...]]]></description>
			<content:encoded><![CDATA[<p>In this post I publish a PDF document titled &#8220;A collection of tips for R in Finance&#8221;.<br />
It is a basic 5-page introduction to R in finance by Arnaud Amsellem (<a href="http://fr.linkedin.com/pub/arnaud-amsellem/9/33b/20a">LinkedIn profile</a>).</p>
<p>The article offers tips related to the following points:</p>
<ul>
<li>Code Editor</li>
<li>Organizing R code</li>
<li>Update packages</li>
<li>Getting external data into R</li>
<li>Communicating with external applications</li>
<li>Optimizing R code</li>
</ul>
<p>The article is well written and offers the perspective of someone who is experienced in the field.  It touches on points that I imagine beginners might otherwise overlook.  I hope publishing it here will be of use to some readers out there.</p>
<p>Update: as some readers have pointed out to me (by e-mail, and in the comments), this document touches only very lightly on the topic of &#8220;finance&#8221; in R.  I therefore decided to change the title from &#8220;R in finance &#8211; some tips for beginners&#8221; to its current form.</p>
<p><strong>Lastly</strong>: if you (a reader of this blog) feel you have an article (&#8220;post&#8221;) to contribute, but don&#8217;t feel like <a href="http://www.r-statistics.com/2010/07/blogging-about-r-presentation-and-audio/">starting your own blog</a>, feel welcome to <a href="http://www.r-statistics.com/contact-me/">contact me</a>, and I&#8217;ll be glad to post what you have to say on my blog (and subsequently, also on <a href="http://www.r-bloggers.com/">R bloggers</a>).</p>
<p>Here is the article:<br />
<span id="more-521"></span><br />
<p class="gde-text"><a href="http://www.r-statistics.com/wp-content/uploads/2010/08/A-collection-of-tips-for-R-in-Finance.pdf" target="_blank" class="gde-link">Download (PDF, 418.09KB)</a></p>
<iframe src="http://docs.google.com/viewer?url=http%3A%2F%2Fwww.r-statistics.com%2Fwp-content%2Fuploads%2F2010%2F08%2FA-collection-of-tips-for-R-in-Finance.pdf&hl=en_US&embedded=true" class="gde-frame" style="width:500px; height:700px; border: none;" scrolling="no"></iframe>

</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/08/tips-for-the-r-beginner-5-pages-overview/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Rose plot using Deducer&#8217;s ggplot2 plot builder</title>
		<link>http://www.r-statistics.com/2010/08/rose-plot-using-deducers-ggplot2-plot-builder/</link>
		<comments>http://www.r-statistics.com/2010/08/rose-plot-using-deducers-ggplot2-plot-builder/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 22:35:52 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[deducer]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[GUI]]></category>
		<category><![CDATA[Hadley Wickham]]></category>
		<category><![CDATA[Ian fellows]]></category>
		<category><![CDATA[interfaces]]></category>
		<category><![CDATA[plot builder]]></category>
		<category><![CDATA[R GUI]]></category>
		<category><![CDATA[SPSS]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[tutorials]]></category>
		<category><![CDATA[videos]]></category>
		<category><![CDATA[youtube]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=517</guid>
		<description><![CDATA[The (excellent!) LearnR blog had a post today about making a rose plot in ggplot2. Following today&#8217;s announcement, by Ian Fellows, regarding the release of the new version of Deducer (0.4) offering a strong support for ggplot2 using a GUI plot builder, Ian also sent an e-mail where he shows how to create a rose [...]]]></description>
			<content:encoded><![CDATA[<p>The (excellent!) <a href="http://learnr.wordpress.com/2010/08/16/consultants-chart-in-ggplot2/">LearnR blog had a post today</a> about making a rose plot in<br />
<a href="http://had.co.nz/ggplot2/">ggplot2</a>.</p>
<p>Following today&#8217;s announcement by <a href="http://www.deducer.org/pmwiki/index.php/">Ian Fellows</a> of <a href="http://www.r-statistics.com/2010/08/ggplot2-plot-builder-is-now-available-on-cran-through-deducer-0-4-gui-for-r/">the release of the new version of Deducer (0.4)</a>, which offers strong support for ggplot2 through a GUI plot builder, Ian also sent an e-mail showing how to create a rose plot using the new ggplot2 GUI included in the latest version of Deducer.  After the template is made, the plot can be generated with 4 clicks of the mouse.</p>
<p>Here is a video tutorial, published by Ian, showing how this can be used:</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/CHYATHLM5sY?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/CHYATHLM5sY?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>The generated template file is available at:<br />
<a href="http://neolab.stat.ucla.edu/cranstats/rose.ggtmpl">http://neolab.stat.ucla.edu/cranstats/rose.ggtmpl</a></p>
<p>I am excited about the work Ian is doing, and hope to see more people publish use cases with Deducer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/08/rose-plot-using-deducers-ggplot2-plot-builder/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ggplot2 plot builder is now on CRAN! (through Deducer 0.4 GUI for R)</title>
		<link>http://www.r-statistics.com/2010/08/ggplot2-plot-builder-is-now-available-on-cran-through-deducer-0-4-gui-for-r/</link>
		<comments>http://www.r-statistics.com/2010/08/ggplot2-plot-builder-is-now-available-on-cran-through-deducer-0-4-gui-for-r/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 18:53:03 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[deducer]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[google summer of code]]></category>
		<category><![CDATA[GUI]]></category>
		<category><![CDATA[Hadley Wickham]]></category>
		<category><![CDATA[Ian fellows]]></category>
		<category><![CDATA[interfaces]]></category>
		<category><![CDATA[plot builder]]></category>
		<category><![CDATA[R GUI]]></category>
		<category><![CDATA[SPSS]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[tutorials]]></category>
		<category><![CDATA[videos]]></category>
		<category><![CDATA[youtube]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=507</guid>
		<description><![CDATA[Ian Fellows, a hard-working contributor to the R community (and a cool guy), announced today the release of Deducer (0.4) to CRAN (scheduled to update in the next day or so). This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality. Following is the e-mail [...]]]></description>
			<content:encoded><![CDATA[<p>Ian Fellows, a hard-working contributor to the R community (and a cool guy), announced today the release of <a href="http://www.deducer.org/pmwiki/pmwiki.php?n=Main.DeducerManual">Deducer</a> (0.4) to <a href="http://cran.r-project.org/web/packages/Deducer/index.html">CRAN</a> (scheduled to update in the next day or so).<br />
This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality.</p>
<p>Following is the e-mail he sent out with all the details and demo videos.</p>
<p><span id="more-507"></span></p>
<h3>Deducer</h3>
<p>Deducer is designed to be a free, easy-to-use alternative to proprietary data analysis software such as SPSS, JMP, and Minitab. It has a menu system for common data manipulation and analysis tasks, and an Excel-like spreadsheet in which to view and edit data frames. The goal of the project is twofold:</p>
<ol>
<li>Provide an intuitive interface so that non-technical users can learn and perform analyses without programming getting in their way.</li>
<li>Increase the efficiency of expert R users when performing common tasks by replacing hundreds of keystrokes with a few mouse clicks. Also, as much as possible, the GUI should not get in their way if they just want to do some programming.</li>
</ol>
<p>Deducer is designed to be used with the Java-based R console JGR, though it supports a number of other R environments (e.g. Windows RGUI and RTerm).</p>
<p>For those not familiar with Deducer, an online manual is available at: <a href="http://www.deducer.org/pmwiki/pmwiki.php?n=Main.DeducerManual">http://www.deducer.org/pmwiki/pmwiki.php?n=Main.DeducerManual</a></p>
<p>An introductory tour of Deducer (4.5 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/iZ857h2j6wA?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/iZ857h2j6wA?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>There is also an &#8220;expert users introduction&#8221; (8 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/AjLToyuluSM?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/AjLToyuluSM?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<h3>ggplot2 Plot Builder</h3>
<p>The major change to Deducer is the inclusion of a new plotting GUI built on the ggplot2 package. This Google Summer of Code project provides an easy-to-use system for making anything from simple histograms to custom publication-ready graphics. Feel free to check out the video introduction:</p>
<p>Part 1 (6 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/-Rym6Ucraes?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/-Rym6Ucraes?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Part 2 (6 min): </p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/k6elEgB3OCE?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/k6elEgB3OCE?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Additional videos:<br />
Templates (5 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/ktdifzqbLW8?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/ktdifzqbLW8?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Extending the Builder (4 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/RsxOo0jx0II?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/RsxOo0jx0II?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<h3>Deducer Extras</h3>
<p>The DeducerExtras package is an add-on package containing a variety of additional analysis dialogs. These include:</p>
<ul>
<li>Distribution quantiles</li>
<li>Single/multiple sample proportion tests</li>
<li>Paired t-test, and Wilcoxon signed-rank test</li>
<li>Levene&#8217;s test and Bartlett&#8217;s test</li>
<li>K-means clustering</li>
<li>Hierarchical clustering</li>
<li>Factor analysis</li>
<li>Multi-dimensional scaling</li>
</ul>
<p>Introduction to Deducer Extras (~2 min): </p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/UCrhxB8tSJY?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/UCrhxB8tSJY?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<h3>Final thanks</h3>
<p>I would like to take this opportunity to thank the R community for choosing this project for a Google Summer of Code grant, and for the support and encouragement. In particular I would like to thank Hadley Wickham for mentoring the Plot Builder GUI, and Dirk Eddelbuettel for his organization of students and mentors.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/08/ggplot2-plot-builder-is-now-available-on-cran-through-deducer-0-4-gui-for-r/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Repeated measures ANOVA with R (functions and tutorials)</title>
		<link>http://www.r-statistics.com/2010/04/repeated-measures-anova-with-r-tutorials/</link>
		<comments>http://www.r-statistics.com/2010/04/repeated-measures-anova-with-r-tutorials/#comments</comments>
		<pubDate>Tue, 13 Apr 2010 08:10:51 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R links]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[ANOVA]]></category>
		<category><![CDATA[aov]]></category>
		<category><![CDATA[car]]></category>
		<category><![CDATA[ez]]></category>
		<category><![CDATA[ezANOVA]]></category>
		<category><![CDATA[friedman]]></category>
		<category><![CDATA[friedman test]]></category>
		<category><![CDATA[friedman's test]]></category>
		<category><![CDATA[repeated measures]]></category>
		<category><![CDATA[repeated measures anova]]></category>
		<category><![CDATA[Repeated measuresANOVA]]></category>
		<category><![CDATA[Repeatedmeasures]]></category>
		<category><![CDATA[SS]]></category>
		<category><![CDATA[SS type I]]></category>
		<category><![CDATA[SS type I error]]></category>
		<category><![CDATA[SS type III]]></category>
		<category><![CDATA[SS type III error]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[Unbalanced design]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=236</guid>
		<description><![CDATA[Repeated measures ANOVA is a common task for the data analyst. There are (at least) two ways of performing &#8220;repeated measures ANOVA&#8221; using R but none is really trivial, and each way has its own complications/pitfalls (explanation/solution to which I was usually able to find through searching in the R-help mailing list). So for future [...]]]></description>
			<content:encoded><![CDATA[<p>Repeated measures ANOVA is a common task for the data analyst.</p>
<p>There are (at least) two ways of performing &#8220;repeated measures ANOVA&#8221; using R, but none of them is really trivial, and each has its own complications and pitfalls (explanations and solutions for which I was usually able to find by searching the R-help mailing list).</p>
<p>So for future reference, I am starting this page to document links I find to tutorials, explanations (and troubleshooting) of &#8220;repeated measures ANOVA&#8221; done with R.</p>
<h3>Functions and packages</h3>
<p>(I suggest using the tutorials supplied below to learn how to use these functions.)</p>
<ul>
<li>aov {stats} &#8211; offers SS type I repeated measures ANOVA, by a call to lm for each stratum.  A short example is given in the ?aov help file.</li>
<li>Anova {<a href="http://cran.r-project.org/web/packages/car/index.html">car</a>} &#8211; calculates type-II or type-III analysis-of-variance tables for model objects produced by lm, and for various other objects.  The ?Anova help file offers an example of how to use this for repeated measures.</li>
<li>ezANOVA {<a href="http://cran.r-project.org/web/packages/ez/index.html">ez</a>} &#8211; this function provides easy analysis of data from factorial experiments, including purely within-Ss designs (a.k.a. &#8220;repeated measures&#8221;), purely between-Ss designs, and mixed within-and-between-Ss designs, yielding ANOVA results and assumption checks.  It is a wrapper of the Anova {car} function, and is easier to use.  The ez package also offers the functions ezPlot and ezStats to give plots and statistics of the ANOVA analysis.  The ?ezANOVA help file gives a good demonstration of the function&#8217;s use (my thanks go to Matthew Finkbe for letting me know about this cool package).</li>
<li>friedman.test {stats} &#8211; performs a Friedman rank sum test with unreplicated blocked data, that is, a non-parametric one-way repeated measures ANOVA.  I also wrote a wrapper function to perform and plot <a href="http://www.r-statistics.com/2010/02/post-hoc-analysis-for-friedmans-test-r-code/">a post-hoc analysis on the Friedman test results</a>.</li>
<li>Non-parametric multi-way repeated measures ANOVA &#8211; I believe such a function could be developed based on the Proportional Odds Model, maybe using the {repolr} or the {ordinal} packages.  But I have not yet come across any function that implements these models (if you do &#8211; please let me know in the comments).</li>
<li>Repeated measures, non-parametric, multivariate analysis of variance &#8211; as far as I know, such a method is not currently available in R.  There is, however, the analysis of similarities (ANOSIM), which provides a way to test statistically whether there is a significant difference between two or more groups of sampling units.  It is available in the {<a href="http://cran.r-project.org/web/packages/vegan/vegan.pdf">vegan</a>} package through the &#8220;anosim&#8221; function.  There is also a <a href="http://cc.oulu.fi/~jarioksa/opetus/metodi/vegantutor.pdf">tutorial</a> and <a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1442-9993.2001.01070.pp.x/full">a relevant published paper</a>.</li>
</ul>
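<p>To make the first and fourth entries of the list above concrete, here is a minimal base-R sketch on simulated, fully balanced data.  The data frame d and its columns (subject, condition, score) are made up for illustration; the Error() term and the formula interface of friedman.test are the ones documented in ?aov and ?friedman.test.</p>

```r
set.seed(42)                                   # simulated data, illustration only
n_subj <- 8
d <- expand.grid(subject   = factor(1:n_subj),
                 condition = factor(c("a", "b", "c")))
subj_effect <- rnorm(n_subj)[as.integer(d$subject)]   # one offset per subject
d$score <- 10 + subj_effect + as.numeric(d$condition) +
  rnorm(nrow(d), sd = 0.5)

# Parametric: Error() tells aov() that condition varies within each subject
fit <- aov(score ~ condition + Error(subject / condition), data = d)
summary(fit)

# Non-parametric counterpart: Friedman test on the same unreplicated blocks
ft <- friedman.test(score ~ condition | subject, data = d)
ft$p.value
```

<p>Both calls assume exactly one observation per subject and condition; with missing cells, see the troubleshooting notes further down.</p>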
<h3>Good Tutorials</h3>
<ul>
<li>A basic tutorial about ANOVA with R (only the last bit holds some example of repeated measures) on <a href="http://www.personality-project.org/R/r.anova.html">personality-project</a></li>
<li>A thorough tutorial on <a href="http://gribblelab.org/2009/03/09/repeated-measures-anova-using-r/">motor control lab</a></li>
<li>A thorough tutorial on <a href="http://www.ats.ucla.edu/stat/R/seminars/Repeated_Measures/repeated_measures.htm">UCLA seminar page</a></li>
<li><a href="http://www.psych.upenn.edu/~baron/rpsych/rpsych.html#htoc60">Another good tutorial</a> by<br />
Jonathan Baron and Yuelin Li on <a href="http://www.psych.upenn.edu/~baron/rpsych/rpsych.html">&#8220;Notes on the use of R for psychology experiments and questionnaires&#8221;</a></li>
</ul>
<h3>Troubleshooting</h3>
<p><strong>Unbalanced design</strong><br />
Unbalanced designs do not work when doing repeated measures ANOVA with aov.  This situation occurs when there are missing values in the data, or when the data do not come from a fully balanced design.  It shows up in your output as within-subject variables appearing in the between-subject section.</p>
<p>A solution for this might be to use the <a href="http://finzi.psych.upenn.edu/R/library/car/html/Anova.html">Anova</a> function from the car package with the parameter type="III".  But before doing that, make sure you understand the difference between SS type I, II and III. <a href="http://prometheus.scp.rochester.edu/zlab/sites/default/files/InteractionsAndTypesOfSS.pdf">Here is a good tutorial</a> to help you out with that.<br />
By the way, these links are also useful in case you want to do a simple two-way ANOVA for an unbalanced design.</p>
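<p>Why the type of SS matters for unbalanced data can be seen in a small base-R sketch of a simple two-way (not repeated measures) design.  The data and the names d, A, B and y are simulated and purely illustrative.  With sequential (type I) SS, the sum of squares attributed to a factor depends on the order in which the terms enter the model, which is what the type II and type III tables from Anova {car} are meant to avoid.</p>

```r
set.seed(7)                                   # simulated data, illustration only
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(12, 6))),
  B = factor(c(rep(c("b1", "b2"), times = c(9, 3)),
               rep(c("b1", "b2"), times = c(2, 4))))
)
d$y <- 5 + 2 * (d$A == "a2") + 1.5 * (d$B == "b2") + rnorm(nrow(d))

table(d$A, d$B)                               # unequal cell counts: unbalanced

# Type I (sequential) SS: each term is adjusted only for the terms before it
ss_ab <- anova(lm(y ~ A + B, data = d))
ss_ba <- anova(lm(y ~ B + A, data = d))

ss_ab["A", "Sum Sq"]   # A entered first: not adjusted for B
ss_ba["A", "Sum Sq"]   # A entered last: adjusted for B, so the value differs
```

<p>In a balanced design the two values would coincide, which is why the distinction is easy to overlook until data go missing.</p>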
<p>I will &#8220;later&#8221; add R-help mailing list discussions that I found helpful on the subject.</p>
<p>If you come across good resources, please let me know about them in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/04/repeated-measures-anova-with-r-tutorials/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Correlation scatter-plot matrix for ordered-categorical data</title>
		<link>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/</link>
		<comments>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 21:37:26 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[correlation]]></category>
		<category><![CDATA[correlation matrix]]></category>
		<category><![CDATA[correlation scatter plot]]></category>
		<category><![CDATA[non-parametric]]></category>
		<category><![CDATA[non-parametric test]]></category>
		<category><![CDATA[nonparametric]]></category>
		<category><![CDATA[nonparametric test]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[scatter plot]]></category>
		<category><![CDATA[scatter plot matrix]]></category>
		<category><![CDATA[spearman correlation]]></category>
		<category><![CDATA[spearman test]]></category>
		<category><![CDATA[stackoverflow]]></category>
		<category><![CDATA[survey]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=256</guid>
		<description><![CDATA[When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire items (for example: two ordered categorical vectors ranging from 1 to 5). When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation [...]]]></description>
			<content:encoded><![CDATA[<p>When analyzing a questionnaire, one often wants to view the correlation between two or more <a href="http://en.wikipedia.org/wiki/Likert_scale">Likert questionnaire</a> items (for example: two ordered categorical vectors ranging from 1 to 5).</p>
<p>When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation matrix (easily obtained in R by using the &#8220;cor&#8221; command with method="spearman" on a matrix of variables; &#8220;cor.test&#8221; handles only one pair of variables at a time).<br />
Yet, a challenge appears once we wish to plot this correlation matrix.  The challenge stems from the fact that the classic presentation for a correlation matrix is a <strong>scatter plot matrix</strong> &#8211; but scatter plots don&#8217;t (usually) work well for ordered categorical vectors since the dots on the scatter plot often overlap each other.</p>
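<p>As a small illustration of the Spearman correlation matrix mentioned above, here is a sketch on simulated Likert-style answers.  The item names q1, q2 and q3 and the way they are generated are assumptions made for the example only.</p>

```r
set.seed(123)                                  # simulated answers, illustration only
n  <- 100
q1 <- sample(1:5, n, replace = TRUE)
q2 <- pmin(pmax(q1 + sample(-1:1, n, replace = TRUE), 1), 5)  # related to q1
q3 <- sample(1:5, n, replace = TRUE)                          # unrelated item
items <- data.frame(q1, q2, q3)

# cor() accepts a whole matrix or data frame and returns all pairwise values
rho <- cor(items, method = "spearman")
round(rho, 2)

# cor.test() tests one pair at a time; exact = FALSE because ties are expected
ct <- cor.test(items$q1, items$q2, method = "spearman", exact = FALSE)
ct$estimate
```

<p>The estimate reported by cor.test for a pair is the same number that appears in the corresponding cell of the matrix.</p>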
<p>There are four solutions for the point-overlap problem that I know of:</p>
<ol>
<li>Jitter the data a bit to give a sense of the &#8220;density&#8221; of the points</li>
<li>Use a color spectrum to show when a point actually represents &#8220;many points&#8221;</li>
<li>Use different point sizes to show when there are &#8220;many points&#8221; in the location of that point</li>
<li>Add a LOWESS (or LOESS) line to the scatter plot &#8211; to show the trend of the data</li>
</ol>
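<p>Solutions 1, 3 and 4 from the list above can be sketched for a single pair of items in a few lines of base R.  This is only a rough illustration on simulated data (the vectors x and y and the size scaling are made up), not the panel functions given later in this post.</p>

```r
set.seed(99)                                   # simulated answers, illustration only
x <- sample(1:5, 200, replace = TRUE)
y <- pmin(pmax(x + sample(-1:1, 200, replace = TRUE), 1), 5)

# Solution 1: jitter adds small random noise so overlapping dots separate
plot(jitter(x), jitter(y), xlab = "item 1", ylab = "item 2")

# Solution 3: one dot per distinct (x, y) pair, sized by how often it occurs
counts <- as.data.frame(table(x = x, y = y), stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]
counts$x <- as.numeric(counts$x)
counts$y <- as.numeric(counts$y)
plot(counts$x, counts$y, cex = 1 + 3 * counts$Freq / max(counts$Freq),
     pch = 19, xlab = "item 1", ylab = "item 2")

# Solution 4: add a LOWESS trend line on top of the sized dots
lines(lowess(x, y), col = "red")
```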
<p>In this post I will offer the code for an approach that combines solutions 3 and 4 (and possibly 2; please read the comments on this post). Here is the output (click to see a larger image):</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/04/scatter-plot-correlation-matrix.png"><img class="alignnone size-full wp-image-257" title="scatter plot correlation matrix" src="http://www.r-statistics.com/wp-content/uploads/2010/04/scatter-plot-correlation-matrix.png" alt="" width="550"/></a></p>
<p>And here is the code to produce this plot:</p>
<p><span id="more-256"></span></p>
<h3>R code for producing a Correlation scatter-plot matrix &#8211; for ordered-categorical data</h3>
<p><strong>Note</strong> that this code will also work fine for continuous data points (although I would suggest increasing the &#8220;point.size.rescale&#8221; parameter to something bigger than 1.5 in the &#8220;panel.smooth.ordered.categorical&#8221; function).</p>

<div class="wp_codebox"><table><tr id="p25624"><td class="code" id="p256code24"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;"># -----------------</span>
<span style="color: #228B22;"># Functions</span>
<span style="color: #228B22;"># -----------------</span>
&nbsp;
panel.<span style="">cor</span>.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x, y, digits<span style="color: #080;">=</span><span style="color: #ff0000;">2</span>, prefix<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;&quot;</span>, cex.<span style="">cor</span><span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#123;</span>
&nbsp;
    usr <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;usr&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">;</span> <span style="color: #0000FF; font-weight: bold;">on.<span style="">exit</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
    <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1</span>, <span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
&nbsp;
    r <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">abs</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">cor</span><span style="color: #080;">&#40;</span>x, y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notive we use spearman, non parametric correlation here</span>
    r.<span style="">no</span>.<span style="">abs</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cor</span><span style="color: #080;">&#40;</span>x, y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
    txt <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">format</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>r.<span style="">no</span>.<span style="">abs</span> , <span style="color: #ff0000;">0.123456789</span><span style="color: #080;">&#41;</span>, digits<span style="color: #080;">=</span>digits<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span> 
    txt <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span>prefix, txt, sep<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span> 
    cex <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">missing</span><span style="color: #080;">&#40;</span>cex.<span style="">cor</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #ff0000;">0.8</span><span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">strwidth</span><span style="color: #080;">&#40;</span>txt<span style="color: #080;">&#41;</span> <span style="color: #0000FF; font-weight: bold;">else</span> cex.<span style="">cor</span> <span style="color: #228B22;"># use cex.cor when it is supplied</span>
&nbsp;
    test <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cor.<span style="">test</span></span><span style="color: #080;">&#40;</span>x,y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span> 
    <span style="color: #228B22;"># borrowed from printCoefmat</span>
    Signif <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">symnum</span><span style="color: #080;">&#40;</span>test$p.<span style="">value</span>, corr <span style="color: #080;">=</span> FALSE, na <span style="color: #080;">=</span> FALSE, 
                  cutpoints <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">0.001</span>, <span style="color: #ff0000;">0.01</span>, <span style="color: #ff0000;">0.05</span>, <span style="color: #ff0000;">0.1</span>, <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>,
                  <span style="color: #0000FF; font-weight: bold;">symbols</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;***&quot;</span>, <span style="color: #ff0000;">&quot;**&quot;</span>, <span style="color: #ff0000;">&quot;*&quot;</span>, <span style="color: #ff0000;">&quot;.&quot;</span>, <span style="color: #ff0000;">&quot; &quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">text</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0.5</span>, <span style="color: #ff0000;">0.5</span>, txt, cex <span style="color: #080;">=</span> cex <span style="color: #080;">*</span> r<span style="color: #080;">&#41;</span> 
    <span style="color: #0000FF; font-weight: bold;">text</span><span style="color: #080;">&#40;</span>.8, .8, Signif, cex<span style="color: #080;">=</span>cex, <span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
panel.<span style="">smooth</span>.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span> <span style="color: #080;">&#40;</span>x, y, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;col&quot;</span><span style="color: #080;">&#41;</span>, bg <span style="color: #080;">=</span> NA, pch <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;pch&quot;</span><span style="color: #080;">&#41;</span>, 
												cex <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span>, col.<span style="">smooth</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;red&quot;</span>, span <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">/</span><span style="color: #ff0000;">3</span>, iter <span style="color: #080;">=</span> <span style="color: #ff0000;">3</span>, 
												point.<span style="">size</span>.<span style="">rescale</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span>, ...<span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;">#require(colorspace)</span>
    <span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#41;</span>
    z <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">merge</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span>, melt<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">table</span><span style="color: #080;">&#40;</span>x ,y<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">sort</span> <span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">F</span><span style="color: #080;">&#41;</span>$value
    <span style="color: #228B22;">#the.col &lt;- heat_hcl(length(x))[z]</span>
    z <span style="color: #080;">&lt;-</span> point.<span style="">size</span>.<span style="">rescale</span><span style="color: #080;">*</span>z<span style="color: #080;">/</span> <span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice how we rescale the dots accourding to the maximum z could have gotten</span>
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">symbols</span><span style="color: #080;">&#40;</span> x, y,  circles <span style="color: #080;">=</span> z,<span style="color: #228B22;">#rep(0.1, length(x)), #sample(1:2, length(x), replace = T) ,</span>
			inches<span style="color: #080;">=</span>FALSE, bg<span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;grey&quot;</span>,<span style="color: #228B22;">#the.col ,</span>
			fg <span style="color: #080;">=</span> bg, add <span style="color: #080;">=</span> TRUE<span style="color: #080;">&#41;</span>
&nbsp;
    <span style="color: #228B22;"># points(x, y, pch = pch, col = col, bg = bg, cex = cex)</span>
    ok <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">finite</span></span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> <span style="color: #080;">&amp;</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">finite</span></span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">if</span> <span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">any</span><span style="color: #080;">&#40;</span>ok<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
        <span style="color: #0000FF; font-weight: bold;">lines</span><span style="color: #080;">&#40;</span>stats<span style="color: #080;">::</span><span style="color: #0000FF; font-weight: bold;">lowess</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span>ok<span style="color: #080;">&#93;</span>, y<span style="color: #080;">&#91;</span>ok<span style="color: #080;">&#93;</span>, f <span style="color: #080;">=</span> span, iter <span style="color: #080;">=</span> iter<span style="color: #080;">&#41;</span>, 
            <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> col.<span style="">smooth</span>, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
panel.<span style="">hist</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
    usr <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;usr&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">;</span> <span style="color: #0000FF; font-weight: bold;">on.<span style="">exit</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>usr<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1.5</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
    h <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">hist</span><span style="color: #080;">&#40;</span>x, <span style="color: #0000FF; font-weight: bold;">plot</span> <span style="color: #080;">=</span> FALSE, br <span style="color: #080;">=</span> <span style="color: #ff0000;">20</span><span style="color: #080;">&#41;</span>
    breaks <span style="color: #080;">&lt;-</span> h$breaks<span style="color: #080;">;</span> nB <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>breaks<span style="color: #080;">&#41;</span>
    y <span style="color: #080;">&lt;-</span> h$counts<span style="color: #080;">;</span> y <span style="color: #080;">&lt;-</span> y<span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">rect</span><span style="color: #080;">&#40;</span>breaks<span style="color: #080;">&#91;</span><span style="color: #080;">-</span>nB<span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">0</span>, breaks<span style="color: #080;">&#91;</span><span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>, y, <span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;orange&quot;</span>, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
pairs.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>xx,...<span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#123;</span>
			<span style="color: #0000FF; font-weight: bold;">pairs</span><span style="color: #080;">&#40;</span>xx , 
					diag.<span style="">panel</span> <span style="color: #080;">=</span> panel.<span style="">hist</span> ,
					lower.<span style="">panel</span><span style="color: #080;">=</span>panel.<span style="">smooth</span>.<span style="">ordered</span>.<span style="">categorical</span>,
					upper.<span style="">panel</span><span style="color: #080;">=</span>panel.<span style="">cor</span>.<span style="">ordered</span>.<span style="">categorical</span>,
					cex.<span style="">labels</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span>, ...<span style="color: #080;">&#41;</span> 
		<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
<span style="color: #228B22;"># -----------------</span>
<span style="color: #228B22;"># Example</span>
<span style="color: #228B22;"># -----------------</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">666</span><span style="color: #080;">&#41;</span>
a1 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span>, <span style="color: #ff0000;">100</span>, <span style="color: #0000FF; font-weight: bold;">replace</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
a2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span>, <span style="color: #ff0000;">100</span>, <span style="color: #0000FF; font-weight: bold;">replace</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
a3 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">jitter</span><span style="color: #080;">&#40;</span>a2, <span style="color: #ff0000;">7</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
	a3<span style="color: #080;">&#91;</span>a3 <span style="color: #080;">&lt;</span> <span style="color: #ff0000;">1</span> <span style="color: #080;">|</span> a3 <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">3</span>
a4 <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">6</span><span style="color: #080;">-</span><span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">jitter</span><span style="color: #080;">&#40;</span>a1, <span style="color: #ff0000;">7</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
	a4<span style="color: #080;">&#91;</span>a4 <span style="color: #080;">&lt;</span> <span style="color: #ff0000;">1</span> <span style="color: #080;">|</span> a4 <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">3</span>
&nbsp;
aa <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>a1,a2,a3, a4<span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># plotting :)		</span>
pairs.<span style="">ordered</span>.<span style="">categorical</span><span style="color: #080;">&#40;</span>aa<span style="color: #080;">&#41;</span></pre></td></tr></table></div>
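<p>The point sizes in the code above come from merging the data with a melted frequency table. The documentation of <code>merge</code> notes that with <code>sort = FALSE</code> the rows are returned in an unspecified order. As a hedged alternative sketch in base R only (the helper name <code>count_overlaps</code> is hypothetical and not part of the original code), the number of observations sharing each (x, y) location can be looked up directly from <code>table(x, y)</code>, which keeps the counts in the same order as the input vectors:</p>

```r
# Hypothetical helper: for every observation, count how many observations
# share its (x, y) location. Assumes x and y contain no missing values.
count_overlaps <- function(x, y) {
  tab <- table(x, y)
  # Look up each (x, y) pair by its dimnames, in input order
  idx <- cbind(as.character(x), as.character(y))
  as.vector(tab[idx])
}

x <- c(1, 1, 2, 1, 3)
y <- c(5, 5, 4, 5, 3)
counts <- count_overlaps(x, y)
print(counts)
```

<p>The resulting vector could then be rescaled and passed as the <code>circles</code> argument of <code>symbols</code>, in the same way <code>z</code> is used in the function above.</p>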

<h3> Credits: </h3>
<ul>
<li>The original R code for the correlation matrix plot was taken from <a href="http://addictedtor.free.fr/graphiques/graphcode.php?graph=137">R Graph Gallery</a>. The differences are: 1) the use of Spearman correlation; 2) the addition of a histogram panel; and 3) varying the point sizes.</li>
<li>The idea to use symbols for changing the point sizes was <a href="http://stackoverflow.com/questions/2593643/correlation-scatter-matrix-plot-with-different-point-size-in-r">offered</a> by <a href="http://www.linkedin.com/pub/doug-y-barbo/2/356/416">Doug Y&#8217;barbo</a>.<br />
Thanks also to <a href="http://dirk.eddelbuettel.com/">Dirk Eddelbuettel</a> for suggesting the use of cex (although I ended up not using that).</li>
</ul>
<p>If you have ideas on how to improve this code (or on reproducing it with ggplot2 or lattice), please share them in the comments (or on your own blog, but be sure to let me know <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />   )</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>A nice link: &#8220;Some hints for the R beginner&#8221;</title>
		<link>http://www.r-statistics.com/2010/03/nice-link-some-hints-for-the-r-beginner/</link>
		<comments>http://www.r-statistics.com/2010/03/nice-link-some-hints-for-the-r-beginner/#comments</comments>
		<pubDate>Sun, 07 Mar 2010 20:47:31 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R links]]></category>
		<category><![CDATA[hints]]></category>
		<category><![CDATA[link]]></category>
		<category><![CDATA[r help]]></category>
		<category><![CDATA[r tips]]></category>
		<category><![CDATA[R tutorial]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=189</guid>
		<description><![CDATA[Patrick Burns just posted to the mailing list the following message: There is now a document called &#8220;Some hints for the R beginner&#8221; whose purpose is to get people up and running with R as quickly as possible. Direct access to it is: http://www.burns-stat.com/pages/Tutor/hints_R_begin.html JRR Tolkien wrote a story (sans hobbits) called &#8216;Leaf by Niggle&#8217; [...]]]></description>
			<content:encoded><![CDATA[<p>Patrick Burns just posted the following message to the mailing list:</p>
<blockquote><p>There is now a document called &#8220;Some hints for the R beginner&#8221; whose purpose is to get people up and running with R as quickly as possible.</p>
<p>Direct access to it is:<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html">http://www.burns-stat.com/pages/Tutor/hints_R_begin.html</a></p>
<p>JRR Tolkien wrote a story (sans hobbits) called &#8216;Leaf by Niggle&#8217; that has always resonated with me.  I offer you an imperfect, incomplete tree (but my roof is intact).</p>
<p>Suggestions for improvements are encouraged.</p></blockquote>
<p>And here is the link tree for the document (for easy reviewing of the offered content):</p>
<p><span id="more-189"></span></p>
<p><strong>This page has several sections, which can be grouped into four categories: General, Objects, Actions, Help.</strong></p>
<p><strong>General<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#intro">Introduction </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#blankscreen">Blank screen syndrome </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#langs">Misconceptions because of a previous language </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#compenv">Helpful computer environments </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#jargon">R vocabulary </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#epilogue">Epilogue </a></strong></p>
<p><strong>Objects<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#keyobjects">Key objects </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#readingdata">Reading data into R </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#seeingobjects">Seeing objects </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#savingobjects">Saving objects </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#magic">Magic functions, magic objects </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#filetypes">Some file types </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#packages">Packages </a></strong></p>
<p><strong>Actions<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#startup">What happens at R startup </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#keyactions">Key actions </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#errors">Errors and such </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#graphics">Graphics </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#vectorize">Vectorization </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#makemistakes">Make mistakes on purpose </a></strong></p>
<p><strong>Help<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#readhelp">How to read a help file </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#search">Searching for functionality </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#documents">Some other documents </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#Rhelp">R-help mailing list </a></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/03/nice-link-some-hints-for-the-r-beginner/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Web Development with R &#8211; an HD video tutorial of Jeroen Ooms talk</title>
		<link>http://www.r-statistics.com/2010/02/web-development-with-r-an-hd-video-tutorial-of-jeroen-ooms-talk/</link>
		<comments>http://www.r-statistics.com/2010/02/web-development-with-r-an-hd-video-tutorial-of-jeroen-ooms-talk/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 07:35:20 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R and the web]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[jeroen ooms]]></category>
		<category><![CDATA[lecture]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[video]]></category>
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=73</guid>
		<description><![CDATA[Here is an HD version of a video tutorial on web development with R, a lecture that was given by Jeroen Ooms (the guy who made A web application for R’s ggplot2). This talk was given at the Bay Area UseR Group meeting on R-Powered Web Apps. You can also view the slides for his talk and [...]]]></description>
			<content:encoded><![CDATA[<p>Here is an HD version of <strong>a video tutorial on web development with R</strong>, a lecture that was given by <a href="http://www.stat.ucla.edu/~jeroen/">Jeroen Ooms</a> (the guy who made <a title="A web application for R’s ggplot2" href="http://www.r-statistics.com/2009/12/a-web-application-of-rs-ggplot2/">A web application for R’s ggplot2</a>). This talk was given at <a href="http://blog.revolution-computing.com/2010/01/quick-thoughts-on-rpowered-web-apps.html">the Bay Area UseR Group meeting</a> on R-Powered Web Apps.</p>
<p><object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/7x0UdUghANI&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/7x0UdUghANI&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object></p>
<p>You can also view the <a href="http://www.stat.ucla.edu/~jeroen/files/barug2010.pdf">slides</a> for his talk and view (great) examples for: <a href="http://www.stat.ucla.edu/~jeroen/stockplot.html">stockplot</a>, <a href="http://www.stat.ucla.edu/~jeroen/lme4.html">lme4</a>, and <a href="http://www.stat.ucla.edu/~jeroen/ggplot2.html">ggplot2</a>.</p>
<p>Thanks again to Jeroen for sharing his knowledge and experience!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/02/web-development-with-r-an-hd-video-tutorial-of-jeroen-ooms-talk/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

