<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>R-statistics blog &#187; visualization</title>
	<atom:link href="http://www.r-statistics.com/on/visualization/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.r-statistics.com</link>
	<description>Writing about statistics with R, and open source stuff (software, data, community)</description>
	<lastBuildDate>Mon, 30 Jan 2012 07:45:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Interactive Graphics with the iplots Package (from &#8220;R in Action&#8221;)</title>
		<link>http://www.r-statistics.com/2012/01/interactive-graphics-with-the-iplots-package-from-r-in-action/</link>
		<comments>http://www.r-statistics.com/2012/01/interactive-graphics-with-the-iplots-package-from-r-in-action/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 12:29:38 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[ihist]]></category>
		<category><![CDATA[interaction visualization]]></category>
		<category><![CDATA[iplot]]></category>
		<category><![CDATA[iplots]]></category>
		<category><![CDATA[manning]]></category>
		<category><![CDATA[R in action]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=913</guid>
		<description><![CDATA[The following introductory post is intended for new users of R.  It deals with interactive visualization using R through the iplots package. This is a guest article by Dr. Robert I. Kabacoff, the founder of one of the first online R tutorial websites: Quick-R. Kabacoff has recently published the book &#8220;R in Action&#8221;, providing a detailed walk-through for the [...]]]></description>
			<content:encoded><![CDATA[<p><strong>The following introductory post is intended for new users of R. It deals with interactive visualization using R through the iplots package.</strong></p>
<p>This is a guest article by Dr. <a href="http://www.statmethods.net/about/author.html">Robert I. Kabacoff</a>, the founder of one of the first online R tutorial websites: <a href="http://www.statmethods.net/interface/index.html">Quick-R</a>. Kabacoff has recently published the book &#8220;<strong><a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&amp;url=21">R in Action</a></strong>&#8221;, which provides a detailed walk-through of the R language, using a variety of examples to illustrate R’s features (data manipulation, statistical methods, graphics, and so on). In <a href="http://www.r-statistics.com/2011/12/data-frame-objects-in-r-via-r-in-action/">previous guest posts</a> by Kabacoff we introduced <a href="http://www.r-statistics.com/2011/12/data-frame-objects-in-r-via-r-in-action/">data.frame objects in R</a> and dealt with the <a href="http://www.r-statistics.com/2012/01/aggregation-and-restructuring-data-from-r-in-action/">aggregation and restructuring of data</a> (using base R functions and the reshape package).</p>
<p><a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&amp;url=21"><img class="alignright" title="R in Action cover image" src="http://www.r-statistics.com/wp-content/uploads/2011/12/kabacoff_cover150.jpg" alt="" width="150" height="188" /></a>For readers of this blog, there is a <strong>38% discount</strong> off <a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&amp;url=21">the &#8220;R in Action&#8221; book</a> (as well as all other eBooks, pBooks and MEAPs at <a href="http://affiliate.manning.com/idevaffiliate.php?id=1205">Manning publishing house</a>), simply by using the code <em><strong>rblogg38</strong></em> when reaching checkout.</p>
<p>Let us now talk about interactive graphics with the iplots package:</p>
<p><span id="more-913"></span></p>
<h3><span style="text-decoration: underline;">Interactive Graphics with the iplots Package<br />
</span></h3>
<p>The base installation of R provides limited interactivity with graphs. You can modify graphs by issuing additional program statements, but there’s little that you can do to modify them or gather new information from them using the mouse. However, there are contributed packages that greatly enhance your ability to interact with the graphs you create—playwith, latticist, iplots, and rggobi. In this article, we’ll focus on functions provided by the iplots package. Be sure to install it before first use.</p>
<p>While playwith and latticist allow you to interact with a single graph, the iplots package takes interaction in a different direction. This package provides interactive mosaic plots, bar plots, box plots, parallel plots, scatter plots, and histograms that can be linked together and color brushed. This means that you can select and identify observations using the mouse, and highlighting observations in one graph will automatically highlight the same observations in all other open graphs. You can also use the mouse to obtain information about graphic objects such as points, bars, lines, and box plots.</p>
<p>The iplots package is implemented through Java and the primary functions are listed in table 1.</p>
<p><em>Table 1 <strong>iplots</strong> functions</em></p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="top" width="73">
<div>
<p>Function</p>
</div>
</td>
<td valign="top" width="186">
<div>
<p>Description</p>
</div>
</td>
</tr>
<tr>
<td valign="bottom" width="73">ibar()</td>
<td valign="bottom" width="186">Interactive bar chart</td>
</tr>
<tr>
<td valign="bottom" width="73">ibox()</td>
<td valign="bottom" width="186">Interactive box plot</td>
</tr>
<tr>
<td valign="bottom" width="73">ihist()</td>
<td valign="bottom" width="186">Interactive histogram</td>
</tr>
<tr>
<td valign="bottom" width="73">imap()</td>
<td valign="bottom" width="186">Interactive map</td>
</tr>
<tr>
<td valign="bottom" width="73">imosaic()</td>
<td valign="bottom" width="186">Interactive mosaic plot</td>
</tr>
<tr>
<td valign="bottom" width="73">ipcp()</td>
<td valign="bottom" width="186">Interactive parallel coordinates plot</td>
</tr>
<tr>
<td valign="bottom" width="73">iplot()</td>
<td valign="bottom" width="186">Interactive scatter plot</td>
</tr>
</tbody>
</table>
<p>To understand how iplots works, execute the code provided in listing 1.</p>
<p><em>Listing 1 iplots demonstration<br />
</em></p>

<div class="wp_codebox"><table><tr id="p9132"><td class="code" id="p913code2"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span>iplots<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">attach</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#41;</span>
cylinders <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">factor</span><span style="color: #080;">&#40;</span>cyl<span style="color: #080;">&#41;</span>
gears <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">factor</span><span style="color: #080;">&#40;</span>gear<span style="color: #080;">&#41;</span>
transmission <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">factor</span><span style="color: #080;">&#40;</span>am<span style="color: #080;">&#41;</span>
ihist<span style="color: #080;">&#40;</span>mpg<span style="color: #080;">&#41;</span>
ibar<span style="color: #080;">&#40;</span>gears<span style="color: #080;">&#41;</span>
iplot<span style="color: #080;">&#40;</span>mpg, wt<span style="color: #080;">&#41;</span>
ibox<span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;mpg&quot;</span>, <span style="color: #ff0000;">&quot;wt&quot;</span>, <span style="color: #ff0000;">&quot;qsec&quot;</span>, <span style="color: #ff0000;">&quot;disp&quot;</span>, <span style="color: #ff0000;">&quot;hp&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
ipcp<span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;mpg&quot;</span>, <span style="color: #ff0000;">&quot;wt&quot;</span>, <span style="color: #ff0000;">&quot;qsec&quot;</span>, <span style="color: #ff0000;">&quot;disp&quot;</span>, <span style="color: #ff0000;">&quot;hp&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
imosaic<span style="color: #080;">&#40;</span>transmission, cylinders<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">detach</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>Six windows containing graphs will open. Rearrange them on the desktop so that each is visible (each can be resized if necessary). A portion of the display is provided in figure 1.</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2012/01/iplots-example.png"><img src="http://www.r-statistics.com/wp-content/uploads/2012/01/iplots-example-300x289.png" alt="" title="iplots-example" width="300" height="289" class="aligncenter size-medium wp-image-914" /></a></p>
<p><em>Figure 1 An <strong>iplots</strong> demonstration created by listing 1. Only four of the six windows are displayed to save room. In these graphs, the user has clicked on the three-gear bar in the bar chart window.</em></p>
<p>Now try the following:</p>
<ul>
<li>Click on the three-gear bar in the Barchart (gears) window. The bar will turn red. In addition, all cars with three forward gears will be highlighted in the other graph windows.</li>
<li>Mouse down and drag to select a rectangular region of points in the Scatter plot (wt vs mpg) window. These points will be highlighted and the corresponding observations in every other graph window will also turn red.</li>
<li>Hold down the Ctrl key and move the mouse pointer over a point, bar, box plot, or line in one of the graphs. Details about that object will appear in a pop-up window.</li>
<li>Right-click on any object and note the options that are offered in the context menu. For example, you can right-click on the Boxplot (mpg) window and change the graph to a parallel coordinates plot (PCP).</li>
<li>You can drag to select more than one object (point, bar, and so on) or use Shift-click to select noncontiguous objects. Try selecting both the three- and five-gear bars in the Barchart (gears) window.</li>
</ul>
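<p>For comparison, the effect of the first step above can be mimicked, without any interactivity, in base R. This is only a static sketch of what the linked highlighting shows, not a description of how iplots works internally; the variable name three_gear is made up for the illustration.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Flag the cars with three forward gears (the "three-gear bar")
three_gear &lt;- mtcars$gear == 3

# Static scatter plot of mpg against wt, with the flagged cars in red
plot(mtcars$mpg, mtcars$wt,
     col = ifelse(three_gear, "red", "black"), pch = 19,
     xlab = "mpg", ylab = "wt")

# The same flag picks out the matching rows of the data
mtcars[three_gear, c("mpg", "wt", "gear")]</pre></td></tr></table></div>

<p>With iplots the flag is set by clicking rather than by code, and every open window is redrawn at once, which is what makes the exploration fast.</p>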
<p>The functions in the iplots package allow you to explore the variable distributions and relationships among variables in subgroups of observations that you select interactively. This can provide insights that would be difficult and time-consuming to obtain in other ways. For more information on the iplots package, visit the project website at <a href="http://rosuda.org/iplots/">http://rosuda.org/iplots/</a>.</p>
<h3>Summary</h3>
<p>In this article, we explored one of the several packages for dynamically interacting with graphs, iplots. This package allows you to interact directly with data in graphs, leading to a greater intimacy with your data and expanded opportunities for developing insights.</p>
<p><em>This article first appeared as section 16.4.4 of the &#8220;<a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&amp;url=21">R in Action</a>&#8221; book, and is published with permission from <a href="http://affiliate.manning.com/idevaffiliate.php?id=1205">Manning publishing house</a>.  Other books in this series that you might be interested in are (see the beginning of this post for a discount code):</em></p>
<ul>
<li><a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&amp;url=22">Machine Learning in Action</a> by Peter Harrington</li>
<li><a href="http://affiliate.manning.com/idevaffiliate.php?id=1205&amp;url=23">Gnuplot in Action</a> (Understanding Data with Graphs) by Philipp K. Janert</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2012/01/interactive-graphics-with-the-iplots-package-from-r-in-action/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Diagram for a Bernoulli process (using R)</title>
		<link>http://www.r-statistics.com/2011/11/diagram-for-a-bernoulli-process-using-r/</link>
		<comments>http://www.r-statistics.com/2011/11/diagram-for-a-bernoulli-process-using-r/#comments</comments>
		<pubDate>Thu, 10 Nov 2011 12:44:41 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[Bernoulli process]]></category>
		<category><![CDATA[binomial distribution]]></category>
		<category><![CDATA[distribution]]></category>
		<category><![CDATA[statistical distribution]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=829</guid>
		<description><![CDATA[A Bernoulli process is a sequence of Bernoulli trials (the realization of n binary random variables), taking two values (0/1, Heads/Tails, Boy/Girl, etc&#8230;). It is often used in teaching introductory probability/statistics classes about the binomial distribution. When visualizing a Bernoulli process, it is common to use a binary tree diagram in order to show the [...]]]></description>
			<content:encoded><![CDATA[<p>A Bernoulli process is a sequence of Bernoulli trials (the realization of n binary random variables), each taking one of two values (0/1, Heads/Tails, Boy/Girl, etc.).  It is often used in introductory probability and statistics classes to teach the binomial distribution.</p>
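<p>As a small illustration of the definition, one realization of such a process can be simulated in base R. This is only a sketch: the seed, the number of trials and the success probability are arbitrary choices made for the example.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">set.seed(1)   # arbitrary seed, only to make the illustration repeatable
n &lt;- 10       # number of trials
p &lt;- 0.5      # probability of "success" (1) on each trial

# One realization of the process: a vector of n zeros and ones
trials &lt;- rbinom(n, size = 1, prob = p)
trials

# The number of successes is a single draw from the binomial B(n, p)
sum(trials)</pre></td></tr></table></div>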
<p>When visualizing a Bernoulli process, it is common to use a binary tree diagram to show the progression of the process, as well as the possible outcomes of each trial.  We might also include the number of &#8220;successes&#8221; and the probability of reaching a specific terminal node.</p>
<p>I wanted to be able to create such a diagram using R.  For this purpose I wrote some code that uses the {<a href="http://cran.r-project.org/web/packages/diagram/">diagram</a>} R package.  The resulting function lets you create diagrams of different sizes, while allowing flexibility with regard to the text used in the tree.</p>
<p>Here is an example of the simplest use of the function:</p>

<div class="wp_codebox"><table><tr id="p8296"><td class="code" id="p829code6"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game_.r.txt&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># loading the function</span>
binary.<span style="">tree</span>.<span style="">for</span>.<span style="">binomial</span>.<span style="">game</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># creating a tree for B(2,0.5)</span></pre></td></tr></table></div>

<p>The resulting diagram will look like this:</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game001.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game001-300x257.png" alt="" title="binary.tree.for.binomial.game001" width="300" height="257" class="alignnone size-medium wp-image-832" /></a></p>
<p>The same can be done to create larger trees.  For example, here is the code for a 4-stage Bernoulli process:</p>

<div class="wp_codebox"><table><tr id="p8297"><td class="code" id="p829code7"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game_.r.txt&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># loading the function</span>
binary.<span style="">tree</span>.<span style="">for</span>.<span style="">binomial</span>.<span style="">game</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">4</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># creating a tree for B(4,0.5)</span></pre></td></tr></table></div>

<p>The resulting diagram will look like this:</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game-BIG.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game-BIG-300x150.png" alt="" title="binary.tree.for.binomial.game - BIG" width="300" height="150" class="alignnone size-medium wp-image-830" /></a></p>
<p>The function can also be tweaked to tell a more specific story.  For example, the following code describes a 3-stage Bernoulli process in which an unfair coin is tossed 3 times (with the probability of heads being 0.8):</p>

<div class="wp_codebox"><table><tr id="p8298"><td class="code" id="p829code8"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game_.r.txt&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># loading the function</span>
binary.<span style="">tree</span>.<span style="">for</span>.<span style="">binomial</span>.<span style="">game</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">3</span>, <span style="color: #ff0000;">0.8</span>, first_box_text <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Tossing an unfair coin&quot;</span>, <span style="color: #ff0000;">&quot;(3 times)&quot;</span><span style="color: #080;">&#41;</span>, left_branch_text <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Failure&quot;</span>, <span style="color: #ff0000;">&quot;Playing again&quot;</span><span style="color: #080;">&#41;</span>, right_branch_text <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Success&quot;</span>, <span style="color: #ff0000;">&quot;Playing again&quot;</span><span style="color: #080;">&#41;</span>, 
    left_leaf_text <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Failure&quot;</span>, <span style="color: #ff0000;">&quot;Game ends&quot;</span><span style="color: #080;">&#41;</span>, right_leaf_text <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Success&quot;</span>, 
        <span style="color: #ff0000;">&quot;Game ends&quot;</span><span style="color: #080;">&#41;</span>, cex <span style="color: #080;">=</span> <span style="color: #ff0000;">0.8</span>, rescale_radx <span style="color: #080;">=</span> <span style="color: #ff0000;">1.2</span>, rescale_rady <span style="color: #080;">=</span> <span style="color: #ff0000;">1.2</span>, 
    box_color <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;lightgrey&quot;</span>, shadow_color <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;darkgrey&quot;</span>, left_arrow_text <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Tails <span style="color: #000099; font-weight: bold;">\n</span>(P = 0.2)&quot;</span><span style="color: #080;">&#41;</span>, 
    right_arrow_text <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Heads <span style="color: #000099; font-weight: bold;">\n</span>(P = 0.8)&quot;</span><span style="color: #080;">&#41;</span>, distance_from_arrow <span style="color: #080;">=</span> <span style="color: #ff0000;">0.04</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>The resulting diagram is:</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game002.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/11/binary.tree_.for_.binomial.game002-300x257.png" alt="" title="binary.tree.for.binomial.game002" width="300" height="257" class="alignnone size-medium wp-image-833" /></a></p>
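<p>The probabilities attached to the terminal nodes of such a tree can be checked with a few lines of base R. The sketch below does not use the function above; it simply enumerates every path of the unfair-coin example (3 tosses, probability of heads 0.8) and compares the result with dbinom(). The variable names are made up for the illustration.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">n &lt;- 3      # number of tosses (stages in the tree)
p &lt;- 0.8    # probability of heads ("success") on each toss

# Every root-to-leaf path is one sequence of n outcomes (1 = heads, 0 = tails)
paths &lt;- expand.grid(rep(list(c(0, 1)), n))
successes &lt;- rowSums(paths)

# Probability of reaching each leaf: p^k * (1 - p)^(n - k) for k successes
leaf_prob &lt;- p^successes * (1 - p)^(n - successes)

# Summing the leaves by number of successes recovers the binomial distribution
by_successes &lt;- tapply(leaf_prob, successes, sum)
print(by_successes)
print(dbinom(0:n, size = n, prob = p))</pre></td></tr></table></div>

<p>The two printed vectors should agree, which is the point the tree makes visually: the binomial probability of k successes is the sum over all paths that contain exactly k successes.</p>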
<p>If you come up with neat examples of using the code, happen to find a bug, or have any other feedback, you are <strong>welcome to leave a comment</strong>.</p>
<p>(note: the images above are licensed under CC BY-SA)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2011/11/diagram-for-a-bernoulli-process-using-r/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Engineering Data Analysis (with R and ggplot2) &#8211; a Google Tech Talk given by Hadley Wickham</title>
		<link>http://www.r-statistics.com/2011/06/engineering-data-analysis-with-r-and-ggplot2-a-google-tech-talk-given-by-hadley-wickham/</link>
		<comments>http://www.r-statistics.com/2011/06/engineering-data-analysis-with-r-and-ggplot2-a-google-tech-talk-given-by-hadley-wickham/#comments</comments>
		<pubDate>Fri, 17 Jun 2011 08:30:48 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R links]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[ggplot2 book]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[google tech talk]]></category>
		<category><![CDATA[Hadley Wickham]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=760</guid>
		<description><![CDATA[It appears that just days ago, Google Tech Talk released a new, one-hour-long video of a presentation (from June 6, 2011) made by one of the R community&#8217;s more influential contributors, Hadley Wickham. This seems to be one of the better talks to send a programmer friend who is interested in getting into R. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/06/YouTube-Engineering-Data-Analysis-with-R-and-ggplot2-Google-Chrome_2011-06-17_11-31-21.png"><img class="alignnone size-full wp-image-764" title="YouTube - Engineering Data Analysis (with R and ggplot2) - Google Chrome_2011-06-17_11-31-21" src="http://www.r-statistics.com/wp-content/uploads/2011/06/YouTube-Engineering-Data-Analysis-with-R-and-ggplot2-Google-Chrome_2011-06-17_11-31-21-e1308299835422.png" alt="" width="500" height="307" /></a></p>
<p>It appears that just days ago, Google Tech Talk released a new, one-hour-long video of a presentation (from June 6, 2011) made by one of the R community&#8217;s more influential contributors, <a href="http://had.co.nz/">Hadley Wickham</a>.</p>
<p>This seems to be one of the better talks to send a programmer friend who is interested in getting into <a href="http://www.r-project.org/">R</a>.</p>
<h3>Talk abstract</h3>
<p>Data analysis, the process of converting data into knowledge, insight and understanding, is a critical part of statistics, but there&#8217;s surprisingly little research on it. In this talk I&#8217;ll introduce some of my recent work, including a model of data analysis. I&#8217;m a passionate advocate of the idea that data analysis should be carried out using a programming language, and I&#8217;ll justify this by discussing some of the requirements of good data analysis (reproducibility, automation and communication). With these in mind, I&#8217;ll introduce you to a powerful set of tools for better understanding data: the statistical programming language R, and the ggplot2 domain specific language (DSL) for visualisation.</p>
<h3>The video</h3>
<p><object width="500" height="306"><param name="movie" value="http://www.youtube.com/v/TaxJwC_MP9Q?version=3"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/TaxJwC_MP9Q?version=3" type="application/x-shockwave-flash" width="500" height="306" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<h3>More resources</h3>
<ul>
<li><a href="http://had.co.nz/">Hadley&#8217;s homepage</a></li>
<li><a href="http://hadley.github.com/">More talks/presentations by Hadley</a></li>
<li><a href="http://had.co.nz/ggplot2/book/">The ggplot2 book (sample chapters)</a></li>
<li><a href="http://cran.r-project.org/web/packages/ggplot2/index.html">ggplot2 on CRAN</a></li>
<li>A hat tip goes to my good friend Eyal Sela, a <a href="http://productivewise.com/">social media, internet and productivity researcher</a>, for informing me about this talk.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2011/06/engineering-data-analysis-with-r-and-ggplot2-a-google-tech-talk-given-by-hadley-wickham/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Beeswarm Boxplot (and plotting it with R)</title>
		<link>http://www.r-statistics.com/2011/03/beeswarm-boxplot-and-plotting-it-with-r/</link>
		<comments>http://www.r-statistics.com/2011/03/beeswarm-boxplot-and-plotting-it-with-r/#comments</comments>
		<pubDate>Thu, 10 Mar 2011 07:40:49 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[Beeswarm]]></category>
		<category><![CDATA[boxplot]]></category>
		<category><![CDATA[charts]]></category>
		<category><![CDATA[graphs]]></category>
		<category><![CDATA[plots]]></category>
		<category><![CDATA[points]]></category>
		<category><![CDATA[scatterplot]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=655</guid>
		<description><![CDATA[(The image above is called a &#8220;Beeswarm Boxplot&#8221;; the code for producing this image is provided at the end of this post) The above plot is implemented under different names in different software packages. This &#8220;Scatter Dot Beeswarm Box Violin &#8211; plot&#8221; (for lack of an agreed-upon term) is a one-dimensional scatter plot [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/03/Beeswarm-Boxplot2.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/03/Beeswarm-Boxplot2.png" alt="" title="Beeswarm-Boxplot" width="450" height="394" class="alignnone size-full wp-image-699" /></a></p>
<p>(The image above is called a &#8220;Beeswarm Boxplot&#8221;; the code for producing this image is provided at the end of this post)</p>
<p>The above plot is implemented under different names in different software packages.  This &#8220;Scatter Dot Beeswarm Box Violin &#8211; plot&#8221; (for lack of an agreed-upon term) is a one-dimensional scatter plot that resembles a &#8220;stripchart&#8221;, but with closely packed, non-overlapping points; the positions of the points reflect the frequency of the values, much as in a violin plot.  A boxplot can be superimposed on the plot to give a very rich description of the underlying distribution.</p>
<p>This plot has been implemented in various statistical packages. In this post I list the few I have come across so far; if you know of an implementation I&#8217;ve missed, please tell me about it in the comments.</p>
<p><span id="more-655"></span></p>
<h3>Implementations in commercial statistical packages</h3>
<p><a href="http://graphpad.com/help/prism5/prism5help.html?what_is_a_frequency_distribution.htm">GraphPad implements this graph</a> under the name &#8220;<strong>column scatter plot</strong>&#8221; (with a line drawn at the mean), made from the &#8220;Frequency distribution&#8221; sample data.  So does <a href="http://www.originlab.com/www/products/GraphGallery.aspx?GID=104&#038;s=8&#038;lm=215">OriginLab</a>.<br />
(My thanks go to <a href="http://stats.stackexchange.com/users/582/nico">nico</a> for finding these examples.)</p>
<p>I imagine there is also something similar in the &#8220;big&#8221; packages (SAS, JMP, SPSS, etc.), but I have not yet found an example.</p>
<h3>Implementations in Free Open-Source statistical packages</h3>
<p>I&#8217;ve noticed that <a href="http://www.ggobi.org/">GGobi</a> has a &#8220;texture&#8221; 1D plot, which is a very similar implementation of this plot.  But the main focus of this post will (as you might expect) be R.</p>
<p>In the R web-ecosystem, several people have written and asked about this.<br />
In his blog &#8220;<a href="http://sas-and-r.blogspot.com">SAS and R</a>&#8221;, Ken Kleinman wrote <a href="http://sas-and-r.blogspot.com/2010/10/example-810-combination-dotplotboxplot.html">about the creation of a dot-box-plot</a> about half a year ago.<br />
He wrapped his code in a function, which can be run using the following commands:</p>

<div class="wp_codebox"><table><tr id="p65514"><td class="code" id="p655code14"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.math.smith.edu/sasr/examples/wild-helper.R&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># getting the boxpoints3 function</span>
ds <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">read.<span style="">csv</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.math.smith.edu/r/data/help.csv&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># getting some data</span>
female <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">subset</span><span style="color: #080;">&#40;</span>ds, female<span style="color: #080;">==</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">with</span><span style="color: #080;">&#40;</span>female,boxpoints3<span style="color: #080;">&#40;</span>pcs, homeless, <span style="color: #ff0000;">&quot;PCS&quot;</span>, <span style="color: #ff0000;">&quot;Homeless&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>  <span style="color: #228B22;"># plotting...</span></pre></td></tr></table></div>

<p>With the following pleasing output:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2011/03/boxploints-3.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/03/boxploints-3.png" alt="" title="boxploints 3" width="540" height="400" class="aligncenter size-full wp-image-658" /></a><br />
In a <a href="http://sas-and-r.blogspot.com/2010/10/reader-suggestions-on-alternative-ways.html">follow-up post</a>, Ken shared some suggestions he received from his readers on how to make the plot better (using other functions, including <a href="http://had.co.nz/ggplot2/">ggplot2</a> implementations).</p>
<p>On the <a href="http://r.789695.n4.nabble.com/A-plot-similar-to-violin-plot-td3341230.html">R help mailing list</a>, a question was recently asked on this topic (which is what led me to write this post), asking for:</p>
<blockquote><p>A band of dots on the plot are the data point. The density of dots and the &#8220;fatness&#8221; of the band present the frequency of a particular value in Y-axis. This property is similar to the violin plot: showing the probability density of the data at different values. Instead of showing a shape in violin plot, this plot shows the actual distribution of the data points. </p></blockquote>
<p>Joshua Wiley responded by pointing to some <a href="http://joshuawiley.com/R.aspx">R code he had worked on</a>, based on an algorithm from Leland Wilkinson.  However, it is not yet release-ready and does not handle multiple groups (though that is on his to-do list).<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2011/03/stacked-dot-plot.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/03/stacked-dot-plot.png" alt="" title="stacked dot plot" width="540" height="302" class="alignright size-full wp-image-659" /></a></p>
<p>Jim Lemon (the author of the wonderful <a href="http://cran.r-project.org/web/packages/plotrix/index.html">plotrix</a> R package) has also offered his solution to the problem:</p>

<div class="wp_codebox"><table><tr id="p65515"><td class="code" id="p655code15"><pre class="rsplus" style="font-family:monospace;">x<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">runif</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">90</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">runif</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">runif</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">80</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
dendroPlot<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x,breaks<span style="color: #080;">=</span>NA,nudge<span style="color: #080;">=</span>NA<span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
 <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">na</span></span><span style="color: #080;">&#40;</span>breaks<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
 breaks<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">seq</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">min</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">unlist</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>,na.<span style="">rm</span><span style="color: #080;">=</span>TRUE<span style="color: #080;">&#41;</span>,
 <span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">unlist</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>,na.<span style="">rm</span><span style="color: #080;">=</span>TRUE<span style="color: #080;">&#41;</span>,length.<span style="">out</span><span style="color: #080;">=</span><span style="color: #ff0000;">10</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,<span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">+</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">range</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">unlist</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,type<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;n&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">na</span></span><span style="color: #080;">&#40;</span>nudge<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> nudge<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">strwidth</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;o&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">2</span>
 <span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>list_element <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
 binvar<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">cut</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span><span style="color: #080;">&#91;</span>list_element<span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span>,breaks<span style="color: #080;">=</span>breaks,include.<span style="">lowest</span><span style="color: #080;">=</span>TRUE<span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>bin <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">levels</span><span style="color: #080;">&#40;</span>binvar<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
  thisbin<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">which</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span><span style="color: #080;">&#40;</span>binvar<span style="color: #080;">&#41;</span><span style="color: #080;">==</span>bin<span style="color: #080;">&#41;</span>
  offset<span style="color: #080;">&lt;-</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span><span style="color: #080;">&#91;</span>list_element<span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#91;</span>thisbin<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span><span style="color: #080;">*</span>nudge
  <span style="color: #0000FF; font-weight: bold;">offset</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">seq</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span>,<span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">offset</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">by</span><span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&lt;-</span>
   <span style="color: #080;">-</span><span style="color: #0000FF; font-weight: bold;">offset</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">seq</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span>,<span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">offset</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">by</span><span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>
  <span style="color: #0000FF; font-weight: bold;">points</span><span style="color: #080;">&#40;</span>list_element<span style="color: #080;">+</span><span style="color: #0000FF; font-weight: bold;">offset</span>,<span style="color: #0000FF; font-weight: bold;">sort</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span><span style="color: #080;">&#91;</span>list_element<span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#91;</span>thisbin<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
 <span style="color: #080;">&#125;</span>
 <span style="color: #080;">&#125;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
dendroPlot<span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/03/dendroplot.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/03/dendroplot.png" alt="" title="dendroplot" width="540" height="400" class="alignright size-full wp-image-660" /></a></p>
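<p>The core of the function above is the way it spreads the points of one bin sideways: the first point sits on the centre line and the following ones alternate to the right and to the left, one more <code>nudge</code> out each time. Here is that step on its own, as a minimal sketch. The helper name <code>bin_offsets</code> and the default <code>nudge</code> value are my own assumptions and are not part of Jim&#8217;s code; this version also returns a sensible result for bins holding zero or one point.</p>

```r
# Hypothetical helper (not from the original code): horizontal offsets for the
# n points that fall into a single bin.
bin_offsets <- function(n, nudge = 0.05) {
  if (n == 0) return(numeric(0))        # empty bin: nothing to place
  offsets <- (seq_len(n) - 1) * nudge   # 0, nudge, 2 * nudge, ...
  even <- seq_len(n) %% 2 == 0          # every second point ...
  offsets[even] <- -offsets[even]       # ... goes to the left of the centre
  offsets
}

bin_offsets(4, nudge = 0.1)
```

<p>Adding these offsets to the group position (1, 2, 3, ...) gives the x coordinates of the points in that bin.</p>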
<p>When I asked about this plot (almost half a year ago) on <a href="http://stats.stackexchange.com/questions/2271/how-to-plot-a-violin-scatter-boxplot-in-r">CrossValidated</a>, I was offered two wonderful answers.</p>
<p>The first one was by <a href="http://biomath.ugent.be/biomath/index.php">Joris Meys</a>, who wrote the following Make.Funny.Plot function (I ran it with rnorm(1000) and added an overlay of a boxplot):</p>

<div class="wp_codebox"><table><tr id="p65516"><td class="code" id="p655code16"><pre class="rsplus" style="font-family:monospace;">&nbsp;
&nbsp;
Make.<span style="">Funny</span>.<span style="">Plot</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
    unique.<span style="">vals</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">unique</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    N <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>
    N.<span style="">val</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">min</span><span style="color: #080;">&#40;</span>N<span style="color: #080;">/</span><span style="color: #ff0000;">20</span>,unique.<span style="">vals</span><span style="color: #080;">&#41;</span>
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>unique.<span style="">vals</span><span style="color: #080;">&gt;</span>N.<span style="">val</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
      x <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">ave</span><span style="color: #080;">&#40;</span>x,<span style="color: #0000FF; font-weight: bold;">cut</span><span style="color: #080;">&#40;</span>x,N.<span style="">val</span><span style="color: #080;">&#41;</span>,FUN<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">min</span><span style="color: #080;">&#41;</span>
      x <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">signif</span><span style="color: #080;">&#40;</span>x,<span style="color: #ff0000;">4</span><span style="color: #080;">&#41;</span>
    <span style="color: #080;">&#125;</span>
    <span style="color: #228B22;"># construct the outline of the plot</span>
    outline <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">as.<span style="">vector</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">table</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    outline <span style="color: #080;">&lt;-</span> outline<span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span>outline<span style="color: #080;">&#41;</span>
&nbsp;
    <span style="color: #228B22;"># determine some correction to make the V shape,</span>
    <span style="color: #228B22;"># based on the range</span>
    y.<span style="">corr</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">diff</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">range</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">*</span><span style="color: #ff0000;">0.05</span>
&nbsp;
    <span style="color: #228B22;"># Get the unique values</span>
    yval <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sort</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">unique</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #080;">-</span><span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">min</span><span style="color: #080;">&#40;</span>yval<span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span>yval<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
        type<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;n&quot;</span>,xaxt<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;n&quot;</span>,xlab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>yval<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
        n <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">==</span>yval<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
        x.<span style="">plot</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">seq</span><span style="color: #080;">&#40;</span><span style="color: #080;">-</span>outline<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span>,outline<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span>,<span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">=</span>n<span style="color: #080;">&#41;</span>
        y.<span style="">plot</span> <span style="color: #080;">&lt;-</span> yval<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span><span style="color: #080;">+</span><span style="color: #0000FF; font-weight: bold;">abs</span><span style="color: #080;">&#40;</span>x.<span style="">plot</span><span style="color: #080;">&#41;</span><span style="color: #080;">*</span>y.<span style="">corr</span>
        <span style="color: #0000FF; font-weight: bold;">points</span><span style="color: #080;">&#40;</span>x.<span style="">plot</span>,y.<span style="">plot</span>,pch<span style="color: #080;">=</span><span style="color: #ff0000;">19</span>,cex<span style="color: #080;">=</span><span style="color: #ff0000;">0.5</span><span style="color: #080;">&#41;</span>
    <span style="color: #080;">&#125;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
x <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1000</span><span style="color: #080;">&#41;</span>
Make.<span style="">Funny</span>.<span style="">Plot</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">boxplot</span><span style="color: #080;">&#40;</span>x, add <span style="color: #080;">=</span> TRUE, at <span style="color: #080;">=</span> <span style="color: #ff0000;">0</span>,  <span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;#0000ff22&quot;</span><span style="color: #080;">&#41;</span>  <span style="color: #228B22;"># my thanks go to Greg Snow for the tip on the transparency colour (from 2007): https://stat.ethz.ch/pipermail/r-help/2007-October/142934.html</span></pre></td></tr></table></div>

<p>And here is the output: </p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/03/funny-scatter-boxplot.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/03/funny-scatter-boxplot.png" alt="" title="funny scatter boxplot" width="540" height="400" class="alignright size-full wp-image-661" /></a></p>
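<p>A short note on the <code>"#0000ff22"</code> colour used for the box: it is a hex colour with a fourth (alpha) byte, so the box is blue but mostly see-through and the points behind it stay visible. The small sketch below shows how the same string can be built and decoded with base R; the variable names are just for illustration.</p>

```r
# Build the colour from its parts: no red, no green, full blue, alpha 34 of 255.
fill <- rgb(0, 0, 255, alpha = 34, maxColorValue = 255)
fill   # "#0000FF22"

# Decode it again: the last two hex digits are the alpha channel.
alpha_value <- strtoi("22", base = 16L)   # 34
opacity <- alpha_value / 255              # roughly 0.13, i.e. about 13% opaque
col2rgb("#0000ff22", alpha = TRUE)
```

<p>Raising the last two digits (for example to <code>55</code>) makes the box more opaque.</p>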
<p>Finally, I saved the best (IMHO) implementation for last: the <a href="http://cran.r-project.org/web/packages/beeswarm/index.html">beeswarm package</a>, written by <a href="http://www.cbs.dtu.dk/~eklund/beeswarm/">Aron Charles Eklund</a>, which looks like the most promising solution I have come across so far.  From the help page:</p>
<blockquote><p>A bee swarm plot is a one-dimensional scatter plot similar to &#8220;stripchart&#8221;, except that would-be overlapping points are separated such that each is visible.</p></blockquote>
<p>This function seems to offer the most options for customization, such as several methods for placing the points and control over the plotting characters and colors.  It is intended to be mostly compatible with calls to stripchart or boxplot, so code that works with these functions should work with beeswarm with minimal modification.</p>
<p>Here is an example of using the beeswarm function (many thanks go to <a href="http://www.statalgo.com/">Shane</a> for <a href="http://stats.stackexchange.com/questions/2271/how-to-plot-a-violin-scatter-boxplot-in-r/3608#3608">writing about this solution</a>!)</p>

<div class="wp_codebox"><table><tr id="p65517"><td class="code" id="p655code17"><pre class="rsplus" style="font-family:monospace;">&nbsp;
<span style="color: #228B22;"># install beeswarm if it is missing, then load it</span>
<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>beeswarm<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span> <span style="color: #0000FF; font-weight: bold;">install.<span style="">packages</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;beeswarm&quot;</span><span style="color: #080;">&#41;</span>; <span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span>beeswarm<span style="color: #080;">&#41;</span> <span style="color: #080;">&#125;</span>
<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span>breast<span style="color: #080;">&#41;</span>
beeswarm<span style="color: #080;">&#40;</span>time_survival ~ event_survival, <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">=</span> breast,
    method <span style="color: #080;">=</span> <span style="color: #ff0000;">'swarm'</span>,
    pch <span style="color: #080;">=</span> <span style="color: #ff0000;">16</span>, pwcol <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span><span style="color: #080;">&#40;</span>ER<span style="color: #080;">&#41;</span>,
    xlab <span style="color: #080;">=</span> <span style="color: #ff0000;">''</span>, ylab <span style="color: #080;">=</span> <span style="color: #ff0000;">'Follow-up time (months)'</span>,
    <span style="color: #0000FF; font-weight: bold;">labels</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'Censored'</span>, <span style="color: #ff0000;">'Metastasis'</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">legend</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'topright'</span>, <span style="color: #0000FF; font-weight: bold;">legend</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">levels</span><span style="color: #080;">&#40;</span>breast$ER<span style="color: #080;">&#41;</span>,
    <span style="color: #0000FF; font-weight: bold;">title</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">'ER'</span>, pch <span style="color: #080;">=</span> <span style="color: #ff0000;">16</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>And the output is the following:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2011/03/violin-scatter-boxplot.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/03/violin-scatter-boxplot.png" alt="" title="violin scatter boxplot" width="540" height="400" class="aligncenter size-full wp-image-657" /></a></p>
<p>In order to get the plot I presented at the beginning of the post, you&#8217;ll need to call the boxplot function (with add = TRUE) after running beeswarm:</p>

<div class="wp_codebox"><table><tr id="p65518"><td class="code" id="p655code18"><pre class="rsplus" style="font-family:monospace;">&nbsp;
<span style="color: #228B22;"># install beeswarm if it is missing, then load it</span>
<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>beeswarm<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span> <span style="color: #0000FF; font-weight: bold;">install.<span style="">packages</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;beeswarm&quot;</span><span style="color: #080;">&#41;</span>; <span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span>beeswarm<span style="color: #080;">&#41;</span> <span style="color: #080;">&#125;</span>
<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span>breast<span style="color: #080;">&#41;</span>
&nbsp;
beeswarm<span style="color: #080;">&#40;</span>time_survival ~ event_survival, <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">=</span> breast,
method <span style="color: #080;">=</span> <span style="color: #ff0000;">'swarm'</span>,
pch <span style="color: #080;">=</span> <span style="color: #ff0000;">16</span>, pwcol <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span><span style="color: #080;">&#40;</span>ER<span style="color: #080;">&#41;</span>,
xlab <span style="color: #080;">=</span> <span style="color: #ff0000;">''</span>, ylab <span style="color: #080;">=</span> <span style="color: #ff0000;">'Follow-up time (months)'</span>,
<span style="color: #0000FF; font-weight: bold;">labels</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'Censored'</span>, <span style="color: #ff0000;">'Metastasis'</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">boxplot</span><span style="color: #080;">&#40;</span>time_survival ~ event_survival, <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">=</span> breast, add <span style="color: #080;">=</span> TRUE, <span style="color: #0000FF; font-weight: bold;">names</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;&quot;</span>,<span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;#0000ff22&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># my thanks go to Greg Snow for the tip on the transparency colour (from 2007): https://stat.ethz.ch/pipermail/r-help/2007-October/142934.html</span></pre></td></tr></table></div>

<p>I hope you found this post useful. If you know of more ways to make such a plot, please let me (and others) know about it in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2011/03/beeswarm-boxplot-and-plotting-it-with-r/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>How to label all the outliers in a boxplot</title>
		<link>http://www.r-statistics.com/2011/01/how-to-label-all-the-outliers-in-a-boxplot/</link>
		<comments>http://www.r-statistics.com/2011/01/how-to-label-all-the-outliers-in-a-boxplot/#comments</comments>
		<pubDate>Thu, 27 Jan 2011 15:27:02 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[box plot]]></category>
		<category><![CDATA[box plot analysis]]></category>
		<category><![CDATA[boxplot]]></category>
		<category><![CDATA[boxplot help]]></category>
		<category><![CDATA[boxplot outlier]]></category>
		<category><![CDATA[boxplot r]]></category>
		<category><![CDATA[legend]]></category>
		<category><![CDATA[normal distribution]]></category>
		<category><![CDATA[outlier]]></category>
		<category><![CDATA[outlier number]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=634</guid>
		<description><![CDATA[In this post I offer an alternative function for boxplot, which will enable you to label outlier observations while handling complex uses of boxplot.]]></description>
			<content:encoded><![CDATA[<p>In this post I present a function that helps to label outlier observations when plotting a boxplot using R.</p>
<p>An outlier is an observation that is numerically distant from the rest of the data.  When reviewing a boxplot, an outlier is defined as a data point that is located outside the fences (&#8220;whiskers&#8221;) of the boxplot (e.g. more than 1.5 times the interquartile range above the upper quartile or below the lower quartile).</p>
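<p>As a rough illustration of that definition (a sketch that is not part of the original post), the fences can be computed by hand in base R and compared with &#8220;boxplot.stats&#8221;, the function &#8220;boxplot&#8221; relies on for its statistics.  The variable names are only illustrative:</p>

```r
set.seed(482)
y <- rnorm(100)

# boxplot() works with the "hinges" returned by fivenum(), which are
# approximately the lower and upper quartiles.
hinges <- fivenum(y)[c(2, 4)]
iqr    <- diff(hinges)

# Fences: 1.5 times the interquartile range beyond each hinge.
lower_fence <- hinges[1] - 1.5 * iqr
upper_fence <- hinges[2] + 1.5 * iqr

# Indices and values of the observations that fall outside the fences.
outlier_idx <- which(y < lower_fence | y > upper_fence)
outliers    <- y[outlier_idx]

# boxplot.stats() reports the same values in its "out" element.
stopifnot(identical(outliers, boxplot.stats(y)$out))
```

<p>Here &#8220;outlier_idx&#8221; holds the positions of the outlying observations, which is the information needed to attach a label to each of them.</p>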
<p>Identifying these points in R is very simple when dealing with only one boxplot and a few outliers.  That can easily be done using the &#8220;identify&#8221; function in R.  For example, running the code below will plot a boxplot of a hundred observations sampled from a normal distribution, and will then enable you to pick the outlier point and have its label (in this case, its index number) plotted beside the point:</p>

<div class="wp_codebox"><table><tr id="p63419"><td class="code" id="p634code19"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">482</span><span style="color: #080;">&#41;</span>
y <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">boxplot</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">identify</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>, y, <span style="color: #0000FF; font-weight: bold;">labels</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">seq_along</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>However, this solution is <strong>not </strong>scalable when dealing with:</p>
<ul>
<li>Many outliers</li>
<li>Overlapping data-points,  and</li>
<li>Multiple boxplots in the same graphic window</li>
</ul>
<p>For such cases I recently wrote the function &#8220;boxplot.with.outlier.label&#8221; (which you can <a href='http://www.r-statistics.com/wp-content/uploads/2011/01/boxplot-with-outlier-label-r.txt'><strong>download from here</strong></a>).  This function operates in a similar way to &#8220;boxplot&#8221; (with a formula), with the added option of defining &#8220;label_name&#8221;.  When outliers are present, the function will then proceed to mark all of them using the label_name variable.  This function can handle interaction terms and will also try to space the labels so that they won&#8217;t overlap (my thanks go to Greg Snow for his function &#8220;spread.labs&#8221; from the {TeachingDemos} package, and for helpful comments on the R-help mailing list).</p>
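<p>To make the idea concrete, here is a minimal base R sketch of labelling every outlier in a grouped boxplot.  It is not the code of &#8220;boxplot.with.outlier.label&#8221;: it does no label spreading and handles only a single grouping factor, and the data and variable names are invented for illustration:</p>

```r
set.seed(492)
y   <- rnorm(200)
g   <- factor(sample(letters[1:2], 200, replace = TRUE))
lab <- sample(LETTERS, 200, replace = TRUE)   # hypothetical labels

# Flag each observation that lies outside the 1.5 * IQR fences of its own group.
# ave() returns a numeric 0/1 vector here, so it is converted back to logical.
is_outlier <- as.logical(
  ave(y, g, FUN = function(v) v %in% boxplot.stats(v)$out)
)

boxplot(y ~ g)

# Box number i is drawn at x = i, so the integer code of the factor gives the
# horizontal position of each outlier.
if (any(is_outlier)) {
  text(x = as.integer(g)[is_outlier], y = y[is_outlier],
       labels = lab[is_outlier], pos = 4, cex = 0.8)
}
```

<p>With many outliers the labels drawn this way will overlap, which is the problem the label spreading in the full function is meant to address.</p>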
<p>Here is some example code you can try out for yourself:</p>

<div class="wp_codebox"><table><tr id="p63420"><td class="code" id="p634code20"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2011/01/boxplot-with-outlier-label-r.txt&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># Load the function</span>
<span style="color: #228B22;"># sample some points and labels for us:</span>
<span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">492</span><span style="color: #080;">&#41;</span>
y <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2000</span><span style="color: #080;">&#41;</span>
x1 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">letters</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">2000</span>, <span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span>
x2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">letters</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">2000</span>, <span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span>
lab_y <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">letters</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">4</span><span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">2000</span>, <span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># plot a boxplot with interactions:</span>
boxplot.<span style="">with</span>.<span style="">outlier</span>.<span style="">label</span><span style="color: #080;">&#40;</span>y~x2<span style="color: #080;">*</span>x1, lab_y<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>Here is the resulting graph:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2011/01/boxplot-identifiyed-all-outlier-points.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/01/boxplot-identifiyed-all-outlier-points-300x215.png" alt="" title="boxplot - identifiyed all outlier points (boxplot with interaction)" width="300" height="215" class="aligncenter size-medium wp-image-636" /></a></p>
<p>You can also try running the following code to see how it handles simpler cases:</p>

<div class="wp_codebox"><table><tr id="p63421"><td class="code" id="p634code21"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;"># plot a boxplot without interactions:</span>
boxplot.<span style="">with</span>.<span style="">outlier</span>.<span style="">label</span><span style="color: #080;">&#40;</span>y~x1, lab_y, ylim <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #080;">-</span><span style="color: #ff0000;">5</span>,<span style="color: #ff0000;">5</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># plot a boxplot of y only</span>
boxplot.<span style="">with</span>.<span style="">outlier</span>.<span style="">label</span><span style="color: #080;">&#40;</span>y, lab_y, ylim <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #080;">-</span><span style="color: #ff0000;">5</span>,<span style="color: #ff0000;">5</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
boxplot.<span style="">with</span>.<span style="">outlier</span>.<span style="">label</span><span style="color: #080;">&#40;</span>y, lab_y, spread_text <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">FALSE</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># here the labels will overlap (because I turned spread_text off)</span></pre></td></tr></table></div>

<p>Here is the output of the last example, showing how the plot looks when we allow for the text to overlap.</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2011/01/boxplot-with-one-group-and-identifiyed-outliers-allowing-label-overlap.png"><img src="http://www.r-statistics.com/wp-content/uploads/2011/01/boxplot-with-one-group-and-identifiyed-outliers-allowing-label-overlap-300x215.png" alt="" title="boxplot - with one group and identifiyed outliers (allowing label overlap)" width="300" height="215" class="aligncenter size-medium wp-image-638" /></a></p>
<p>Regarding package dependencies: notice that this function requires you to first install the packages {TeachingDemos} (by Greg Snow) and {plyr} (by <a href="http://had.co.nz/">Hadley Wickham</a>) </p>
<p><strong>Updates:</strong></p>
<ul>
<li>19.04.2011 &#8211; I&#8217;ve added support for the boxplot &#8220;names&#8221; and &#8220;at&#8221; parameters.</li>
<li>31.10.2011 &#8211; I&#8217;ve fixed <a href="http://stackoverflow.com/questions/7929542/boxplot-outlier-labeling-in-r/">a reported bug</a> (my thanks go to Josh O&#8217;Brien for the heads up).  There are now also two arguments that make it easy to change the distance of the labels/segments from the outliers.</li>
</ul>
<p>You are very much invited to leave your comments if you find a <strong>bug</strong>, think of ways to <strong>improve </strong>the function, or simply <strong>enjoyed</strong> it and would like to share it with me.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2011/01/how-to-label-all-the-outliers-in-a-boxplot/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>R GUI now offers interactive graphics &#8211; Deducer 0.4-2 connects with iplots</title>
		<link>http://www.r-statistics.com/2010/10/r-gui-now-offers-interactive-graphics-deducer-0-4-2-connects-with-iplots/</link>
		<comments>http://www.r-statistics.com/2010/10/r-gui-now-offers-interactive-graphics-deducer-0-4-2-connects-with-iplots/#comments</comments>
		<pubDate>Sun, 24 Oct 2010 09:09:58 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[deducer]]></category>
		<category><![CDATA[Ian Fwllows]]></category>
		<category><![CDATA[interactive graphics]]></category>
		<category><![CDATA[iplots]]></category>
		<category><![CDATA[JGR]]></category>
		<category><![CDATA[R GUI]]></category>
		<category><![CDATA[R packages]]></category>
		<category><![CDATA[rJava]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=577</guid>
		<description><![CDATA[Earlier today, Ian Fellows announced the release of Deducer 0.4-2 and DeducerExtras 1.2 to CRAN (I copy his announcement here): Deducer 0.4-2 contains a few bug fixes, and an interface to the iplots package. With the new iplots interface it is now possible to do interactive plots with Deducer. An introductory example screencast [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier today, Ian Fellows announced the release of Deducer 0.4-2 and DeducerExtras 1.2 to CRAN (I copy his announcement here):<br />
<a href="http://www.r-statistics.com/2010/10/r-gui-now-offers-interactive-graphics-deducer-0-4-2-connects-with-iplots/deducer-with-iplots-24-10-2010-11-02-46/" rel="attachment wp-att-579"><img src="http://www.r-statistics.com/wp-content/uploads/2010/10/deducer-with-iplots-24-10-2010-11-02-46-300x139.png" alt="" title="deducer with iplots - 24-10-2010 11-02-46" width="300" height="139" class="alignright size-medium wp-image-579" /></a><br />
<a href="http://cran.r-project.org/web/packages/Deducer/index.html">Deducer 0.4-2</a> contains a few bug fixes, and an interface to the <a href="http://rosuda.org/iPlots/iplots.html">iplots package</a>. With the new iplots interface it is now possible to do interactive plots with Deducer. An introductory example screencast (by Ian) is available on YouTube:</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/T6kOvlMaFCA?version=3"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/T6kOvlMaFCA?version=3" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>DeducerExtras 1.2 contains a few new dialogs including &#8216;load data from package&#8217;, and &#8216;t-test power&#8217;.</p>
<p>Additionally, a new Windows R/JGR/Deducer installer is available which installs R-2.12.0, JGR with its launcher, Deducer, DeducerExtras, and DeducerPlugInScaling. It is available on the Deducer website:</p>
<p>http://www.deducer.org/pmwiki/pmwiki.php?n=Main.WindowsInstallation</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/10/r-gui-now-offers-interactive-graphics-deducer-0-4-2-connects-with-iplots/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Rose plot using Deducer&#8217;s ggplot2 plot builder</title>
		<link>http://www.r-statistics.com/2010/08/rose-plot-using-deducers-ggplot2-plot-builder/</link>
		<comments>http://www.r-statistics.com/2010/08/rose-plot-using-deducers-ggplot2-plot-builder/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 22:35:52 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[deducer]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[GUI]]></category>
		<category><![CDATA[Hadley Wickham]]></category>
		<category><![CDATA[Ian fellows]]></category>
		<category><![CDATA[interfaces]]></category>
		<category><![CDATA[plot builder]]></category>
		<category><![CDATA[R GUI]]></category>
		<category><![CDATA[SPSS]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[tutorials]]></category>
		<category><![CDATA[videos]]></category>
		<category><![CDATA[youtube]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=517</guid>
		<description><![CDATA[The (excellent!) LearnR blog had a post today about making a rose plot in ggplot2. Following today&#8217;s announcement by Ian Fellows regarding the release of the new version of Deducer (0.4), which offers strong support for ggplot2 through a GUI plot builder, Ian also sent an e-mail in which he shows how to create a rose [...]]]></description>
			<content:encoded><![CDATA[<p>The (excellent!) <a href="http://learnr.wordpress.com/2010/08/16/consultants-chart-in-ggplot2/">LearnR blog had a post today</a> about making a rose plot in <a href="http://had.co.nz/ggplot2/">ggplot2</a>.</p>
<p>Following today&#8217;s announcement by <a href="http://www.deducer.org/pmwiki/index.php/">Ian Fellows</a> regarding <a href="http://www.r-statistics.com/2010/08/ggplot2-plot-builder-is-now-available-on-cran-through-deducer-0-4-gui-for-r/">the release of the new version of Deducer (0.4)</a>, which offers strong support for ggplot2 through a GUI plot builder, Ian also sent an e-mail in which he shows how to create a rose plot using the new ggplot2 GUI included in the latest version of Deducer.  After the template is made, the plot can be generated with 4 clicks of the mouse.</p>
<p>Here is a video tutorial (published by Ian) showing how this can be used:</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/CHYATHLM5sY?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/CHYATHLM5sY?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>The generated template file is available at:<br />
<a href="http://neolab.stat.ucla.edu/cranstats/rose.ggtmpl">http://neolab.stat.ucla.edu/cranstats/rose.ggtmpl</a></p>
<p>I am excited about the work Ian is doing, and hope to see more people publish use cases with Deducer.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/08/rose-plot-using-deducers-ggplot2-plot-builder/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ggplot2 plot builder is now on CRAN! (through Deducer 0.4 GUI for R)</title>
		<link>http://www.r-statistics.com/2010/08/ggplot2-plot-builder-is-now-available-on-cran-through-deducer-0-4-gui-for-r/</link>
		<comments>http://www.r-statistics.com/2010/08/ggplot2-plot-builder-is-now-available-on-cran-through-deducer-0-4-gui-for-r/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 18:53:03 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[deducer]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[google summer of code]]></category>
		<category><![CDATA[GUI]]></category>
		<category><![CDATA[Hadley Wickham]]></category>
		<category><![CDATA[Ian fellows]]></category>
		<category><![CDATA[interfaces]]></category>
		<category><![CDATA[plot builder]]></category>
		<category><![CDATA[R GUI]]></category>
		<category><![CDATA[SPSS]]></category>
		<category><![CDATA[tutorial]]></category>
		<category><![CDATA[tutorials]]></category>
		<category><![CDATA[videos]]></category>
		<category><![CDATA[youtube]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=507</guid>
		<description><![CDATA[Ian Fellows, a hard-working contributor to the R community (and a cool guy), announced today the release of Deducer (0.4) to CRAN (scheduled to update in the next day or so). This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality. Following is the e-mail [...]]]></description>
			<content:encoded><![CDATA[<p>Ian Fellows, a hard-working contributor to the R community (and a cool guy), announced today the release of <a href="http://www.deducer.org/pmwiki/pmwiki.php?n=Main.DeducerManual">Deducer</a> (0.4) to <a href="http://cran.r-project.org/web/packages/Deducer/index.html">CRAN</a> (scheduled to update in the next day or so).<br />
This major update also includes the release of a new plug-in package (DeducerExtras), containing additional dialogs and functionality.</p>
<p>Following is the e-mail he sent out with all the details and demo videos.</p>
<p><span id="more-507"></span></p>
<h3>Deducer</h3>
<p>Deducer is designed to be a free, easy-to-use alternative to proprietary data analysis software such as SPSS, JMP, and Minitab. It has a menu system to do common data manipulation and analysis tasks, and an Excel-like spreadsheet in which to view and edit data frames. The goal of the project is twofold:</p>
<ul>
<li>Provide an intuitive interface so that non-technical users can learn and perform analyses without programming getting in their way.</li>
<li>Increase the efficiency of expert R users when performing common tasks by replacing hundreds of keystrokes with a few mouse clicks. Also, as much as possible the GUI should not get in their way if they just want to do some programming.</li>
</ul>
<p>Deducer is designed to be used with the Java-based R console JGR, though it supports a number of other R environments (e.g. Windows RGUI and RTerm).</p>
<p>For those not familiar with Deducer, an online manual is available at: <a href="http://www.deducer.org/pmwiki/pmwiki.php?n=Main.DeducerManual">http://www.deducer.org/pmwiki/pmwiki.php?n=Main.DeducerManual</a></p>
<p>An introductory tour of Deducer (4.5 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/iZ857h2j6wA?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/iZ857h2j6wA?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>There is also an &#8220;expert users introduction&#8221; (8 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/AjLToyuluSM?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/AjLToyuluSM?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<h3>ggplot2 Plot Builder</h3>
<p>The major change to Deducer is the inclusion of a new plotting GUI built on the ggplot2 package. This Google Summer of Code project provides an easy-to-use system to make anything from simple histograms to custom publication-ready graphics. Feel free to check out the video introduction:</p>
<p>Part 1 (6 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/-Rym6Ucraes?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/-Rym6Ucraes?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Part 2 (6 min): </p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/k6elEgB3OCE?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/k6elEgB3OCE?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Additional videos:<br />
Templates (5 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/ktdifzqbLW8?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/ktdifzqbLW8?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Extending the Builder (4 min):</p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/RsxOo0jx0II?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/RsxOo0jx0II?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<h3>Deducer Extras</h3>
<p>The DeducerExtras package is an add-on package containing a variety of additional analysis dialogs. These include:</p>
<ul>
<li>Distribution quantiles</li>
<li>Single/multiple sample proportion tests</li>
<li>Paired t-test, and Wilcoxon signed rank test</li>
<li>Levene&#8217;s test and Bartlett&#8217;s test</li>
<li>K-means clustering</li>
<li>Hierarchical clustering</li>
<li>Factor analysis</li>
<li>Multi-dimensional scaling</li>
</ul>
<p>Introduction to Deducer Extras (~2 min): </p>
<p><object width="500" height="400"><param name="movie" value="http://www.youtube.com/v/UCrhxB8tSJY?fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/UCrhxB8tSJY?fs=1" type="application/x-shockwave-flash" width="500" height="400" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<h3>Final thanks</h3>
<p>I would like to take this opportunity to thank the R community for choosing this project for a Google Summer of Code grant, and for the support and encouragement. In particular I would like to thank Hadley Wickham for mentoring the Plot Builder GUI, and Dirk Eddelbuettel for his organization of students and mentors.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/08/ggplot2-plot-builder-is-now-available-on-cran-through-deducer-0-4-gui-for-r/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>New versions for ggplot2 (0.8.8) and plyr (1.0) were released today</title>
		<link>http://www.r-statistics.com/2010/07/released-today-new-versions-for-ggplot2-0-8-8-and-plyr-1-0/</link>
		<comments>http://www.r-statistics.com/2010/07/released-today-new-versions-for-ggplot2-0-8-8-and-plyr-1-0/#comments</comments>
		<pubDate>Tue, 06 Jul 2010 07:32:11 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[Hadley Wickham]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[plyr]]></category>
		<category><![CDATA[update]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=459</guid>
		<description><![CDATA[Among the many packages on CRAN, a few stand out for their widespread use (and quality). Hadley Wickham&#8217;s ggplot2 and plyr are two such packages. Today (through Twitter) Hadley updated the rest of us with the news: just released new versions of [...]]]></description>
			<content:encoded><![CDATA[<p>Among the many packages on CRAN, a few stand out for their widespread use (and quality). <a href="http://had.co.nz/">Hadley Wickham&#8217;s</a> <a href="http://had.co.nz/ggplot2/">ggplot2</a> and <a href="http://had.co.nz/plyr/">plyr</a> are two such packages.<br />
<img src="http://had.co.nz/plyr/pliers.jpg" alt="plyr image" /><br />
Today (<a href="http://twitter.com/hadleywickham/status/17814050267">through Twitter</a>) Hadley updated the rest of us with the news:</p>
<blockquote><p>just released new versions of plyr and ggplot2. source versions available on cran, compiled will follow soon #rstats</p></blockquote>
<p>Going to the CRAN website shows that plyr has gone through the more substantial update of the two, with the last update (before the current one) taking place on 2009-06-23.  Now, over a year later, we are presented with plyr version 1, which includes new functions, new features, some bug fixes, and much-anticipated speed improvements.<br />
ggplot2 has made a tiny leap from version 0.8.7 to 0.8.8, and was previously last updated on 2010-03-03.</p>
<p>I am very thankful (as, I am sure, are many R users) for the amazing work that Hadley Wickham is doing (both on his code, and in helping other useRs on the help lists).  So Hadley, <strong>thank you</strong>!</p>
<p>Here is the complete change-log list for both packages:<br />
<span id="more-459"></span></p>
<h3>plyr 1.0 (2010-07-02)</h3>
<p>(taken from <a href="http://cran.r-project.org/web/packages/plyr/NEWS">the CRAN website</a>)<br />
<strong> New functions:</strong></p>
<p>* arrange, a new helper method for reordering a data frame.<br />
* count, a version of table that returns data frames immediately and that is<br />
much much faster for high-dimensional data.<br />
* desc makes it easy to sort any vector in descending order<br />
* join, works like merge but can be much faster and has a somewhat simpler<br />
syntax drawing from SQL terminology<br />
* rbind.fill.matrix is like rbind.fill but works for matrices, code<br />
contributed by C. Beleites</p>
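<p>To give a feel for the new helpers, here is a minimal sketch (mine, not taken from the plyr documentation) that uses the built-in mtcars data. The lookup table cyl_labels is made up for this illustration.</p>

<pre class="rsplus" style="font-family:monospace;">library(plyr)

# Sort cars by cylinder count, then by descending fuel economy
sorted &lt;- arrange(mtcars, cyl, desc(mpg))

# Tabulate the number of cars per cylinder count, returned as a data frame
cyl_counts &lt;- count(mtcars, "cyl")

# Hypothetical lookup table, invented for this example
cyl_labels &lt;- data.frame(cyl = c(4, 6, 8),
                         size = c("small", "medium", "large"))

# SQL-style left join on the shared "cyl" column
joined &lt;- join(mtcars, cyl_labels, by = "cyl", type = "left")</pre>

<p>Note that arrange returns a plain data frame without the original row names, so keep the car names in a column first if you need them.</p>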
<p><strong>Speed improvements</strong></p>
<p>* experimental immutable data frame (idata.frame) that vastly speeds up<br />
subsetting &#8211; for large datasets with large numbers of groups, this can yield<br />
10-fold speed ups. See examples in ?idata.frame to see how to use it.<br />
* rbind.fill rewritten again to increase speed and work with more data types<br />
* d*ply now much faster with nested groups</p>
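<p>A rough way to try the immutable data frame yourself is sketched below, along the lines of the examples in ?idata.frame. It uses the baseball data that ships with plyr; the timings depend on your machine, so I am not quoting any numbers here.</p>

<pre class="rsplus" style="font-family:monospace;">library(plyr)

# Wrap the baseball data (shipped with plyr) in an immutable data frame
ibaseball &lt;- idata.frame(baseball)

# Run the same split-apply call on both versions and compare elapsed times
system.time(rows_df  &lt;- dlply(baseball,  "id", nrow))
system.time(rows_idf &lt;- dlply(ibaseball, "id", nrow))</pre>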
<p><strong>New features:</strong></p>
<p>* d*ply now accepts NULL for splitting variables, indicating that the data<br />
should not be split<br />
* plyr no longer exports internal functions, many of which were causing<br />
clashes with other packages<br />
* rbind.fill now works with data frame columns that are lists or matrices<br />
* test suite ensures that plyr behaviour is correct and will remain correct<br />
as I make future improvements.</p>
<p><strong>Bug fixes:</strong></p>
<p>* **ply: if zero splits, empty list(), data.frame() or logical() returned,<br />
as appropriate for the output type<br />
* **ply: leaving .fun as NULL now always returns list<br />
(thanks to Stavros Macrakis for the bug report)<br />
* a*ply: labels now respect options(stringsAsFactors)<br />
* each: scoping bug fixed, thanks to Yasuhisa Yoshida for the bug report<br />
* list_to_dataframe is more consistent when processing a single data frame<br />
* NAs preserved in more places<br />
* progress bars: guaranteed to terminate even if **ply prematurely terminates<br />
* progress bars: misspelling gives informative warning, instead of<br />
uninformative error<br />
* splitter_d: fixed ordering bug when .drop = FALSE</p>
<h3>ggplot2 0.8.8 (2010-07-02)</h3>
<p>(taken from <a href="http://cran.r-project.org/web/packages/ggplot2/NEWS">the CRAN website</a>)</p>
<p><strong>Bug fixes:</strong></p>
<p>* coord_equal finally works as expected (thanks to continued prompting from Jean-Olivier Irisson)<br />
* coord_equal renamed to coord_fixed to better represent capabilities<br />
* coord_polar and coord_polar: new munching system that uses distances (as defined by the coordinate system) to figure out how many pieces each segment should be broken in to (thanks to prompting from Jean-Olivier Irisson)<br />
* fix ordering bug in facet_wrap (thanks to bug report by Frank Davenport)<br />
* geom_errorh correctly responds to height parameter outside of aes<br />
* geom_hline and geom_vline will not impact legend when used for fixed intercepts<br />
* geom_hline/geom_vline: intercept values not set quite correctly which caused a problem in conjunction with transformed scales (reported by Seth Finnegan)<br />
* geom_line: can now stack lines again with position = &#8220;stack&#8221; (fixes #74)<br />
* geom_segment: arrows now preserved in non-Cartesian coordinate system (fixes #117)<br />
* geom_smooth now deals with missing values in the same way as geom_line (thanks to patch from Karsten Loesing)<br />
* guides: check all axis labels for expressions (reported by Benji Oswald)<br />
* guides: extra 0.5 line margin around legend (fixes #71)<br />
* guides: non-left legend positions now work once more (thanks to patch from Karsten Loesing)<br />
* label_bquote works with more expressions (factors now cast to characters, thanks to Baptiste Auguie for bug report)<br />
* scale_color: add missing US spellings<br />
* stat: panels with no non-missing values triggered errors with some statistics. (reported by Giovanni Dall&#8217;Olio)<br />
* stat: statistics now also respect layer parameter inherit.aes (thanks to bug report by Lorenzo Isella and investigation by Brian Diggs)<br />
* stat_bin no longer drops 0-count bins by default<br />
* stat_bin: fix small bug when dealing with single bin with NA position (reported by John Rauser)<br />
* stat_binhex: uses range of data from scales when computing binwidth so hexes are the same size in all facets (thanks to Nicholas Lewin-Koh for the bug report)<br />
* stat_qq has new dparam parameter for specifying distribution parameters (thanks to Yunfeng Zhang for the bug report)<br />
* stat_smooth now uses built-in confidence interval (with small sample correction) for linear models (thanks to suggestion by Ian Fellows)<br />
* sta</p>
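<p>As a small illustration of the coord_equal to coord_fixed rename listed above, here is a minimal sketch of my own on the built-in mtcars data. It assumes a ggplot2 version that already provides coord_fixed.</p>

<pre class="rsplus" style="font-family:monospace;">library(ggplot2)

# Fixed-ratio coordinates: one unit on the x axis is drawn
# with the same length as one unit on the y axis
p &lt;- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  coord_fixed(ratio = 1)
print(p)</pre>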
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/07/released-today-new-versions-for-ggplot2-0-8-8-and-plyr-1-0/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Visualization of regression coefficients (in R)</title>
		<link>http://www.r-statistics.com/2010/07/visualization-of-regression-coefficients-in-r/</link>
		<comments>http://www.r-statistics.com/2010/07/visualization-of-regression-coefficients-in-r/#comments</comments>
		<pubDate>Fri, 02 Jul 2010 19:46:56 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[coefficients]]></category>
		<category><![CDATA[Coefficients Visualization]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[plot]]></category>
		<category><![CDATA[regression]]></category>
		<category><![CDATA[regression plot]]></category>
		<category><![CDATA[regression Visualization]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=435</guid>
		<description><![CDATA[Update (07.07.10): The function in this post has a more mature version in the &#8220;arm&#8221; package.  (more details are available at the end of this post.) Update (04.01.12): There is a new package called Coefplot that offers a more general solution for plotting coeffs. (more details are available at the end of this post.) * * * * Imagine [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Update (07.07.10)</strong>: The function in this post has a more mature version in the &#8220;arm&#8221; package.  <em>(more details are available at the end of this post.)</em></p>
<p><strong>Update (04.01.12)</strong>: There is a new package called <a href="http://cran.r-project.org/web/packages/coefplot/" target="_self">coefplot</a> that offers a more general solution for plotting coefficients. <em>(more details are available at the end of this post.)</em><br />
* * * *</p>
<p>Imagine you want to give a presentation or report of your latest findings running some sort of regression analysis. How would you do it?</p>
<p>This was exactly the question Wincent Rong-gui HUANG recently asked <a href="http://r.789695.n4.nabble.com/Visualization-of-coefficients-tt2276010.html#none">on the R mailing list</a>.</p>
<p>One person, Bernd Weiss, responded by linking to the chapter &#8220;<a href="http://tables2graphs.com/doku.php?id=04_regression_coefficients">Plotting Regression Coefficients</a>&#8221; in an interesting online book (which I had never heard of before) called &#8220;<a href="http://tables2graphs.com/doku.php">Using Graphs Instead of Tables</a>&#8221; (I should add this link to the <a href="http://www.r-statistics.com/2009/10/free-statistics-e-books-for-download/">free statistics e-books list</a>&#8230;)</p>
<p>Later in the conversation, <a href="http://statmath.wu.ac.at/~zeileis/">Achim Zeileis</a> surprised us (well, me) by saying the following:</p>
<blockquote><p>I&#8217;ve thought about adding a plot() method for the coeftest() function in the <a href="http://cran.r-project.org/web/packages/lmtest/index.html">&#8220;lmtest&#8221; package</a>. Essentially, it relies on a coef() and a vcov() method being available &#8211; <strong>and that a central limit theorem holds</strong>. For releasing it as a general function in the package the code is still too raw, but maybe it&#8217;s useful for someone on the list. Hence,<strong> I&#8217;ve included it below</strong>.</p></blockquote>
<p>(I took the liberty of adding some <strong>bold</strong> emphasis to the text)</p>
<p>So for the convenience of all of us, I uploaded Achim&#8217;s code in a file for easy access. Here is an example of how to use it:</p>

<div class="wp_codebox"><table><tr id="p43524"><td class="code" id="p435code24"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/07/coefplot.r.txt&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Mroz&quot;</span>, package <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;car&quot;</span><span style="color: #080;">&#41;</span>
fm</pre></td></tr></table></div>

<p>Here is the resulting graph:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/07/regression-coefficient-plot.png"><img class="alignright size-full wp-image-437" title="regression coefficient plot" src="http://www.r-statistics.com/wp-content/uploads/2010/07/regression-coefficient-plot.png" alt="" width="550" /></a></p>
<p>I hope Achim gets around to improving the function so that he considers it worthy of joining his <a href="http://cran.r-project.org/web/packages/lmtest/index.html">&#8220;lmtest&#8221; package</a>. I am glad he shared his code so the rest of us have something to work with in the meantime <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
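<p>For readers who want to see the idea without sourcing anything, here is a minimal base-R sketch of my own (it is not Achim&#8217;s code). It relies only on coef() and vcov(), as described in the quote above, and draws approximate 95% intervals from the normal approximation. The model and the mtcars data are chosen purely for illustration.</p>

<pre class="rsplus" style="font-family:monospace;"># Fit a model on a built-in data set (illustrative choice)
fm &lt;- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Point estimates and standard errors, using only coef() and vcov()
est &lt;- coef(fm)[-1]                 # drop the intercept
se  &lt;- sqrt(diag(vcov(fm)))[-1]

# Approximate 95% intervals based on the normal (central limit) approximation
z  &lt;- qnorm(0.975)
lo &lt;- est - z * se
hi &lt;- est + z * se

# Dot-and-whisker plot: one row per coefficient
k &lt;- length(est)
plot(est, seq_len(k), xlim = range(lo, hi, 0), pch = 19, yaxt = "n",
     xlab = "Estimate (approx. 95% interval)", ylab = "")
segments(lo, seq_len(k), hi, seq_len(k))
axis(2, at = seq_len(k), labels = names(est), las = 1)
abline(v = 0, lty = 2)              # reference line at zero</pre>

<p>Because only coef() and vcov() are used, the same few lines should carry over to other model classes that provide both methods, as long as the normal approximation is reasonable for them.</p>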
<p>* * *</p>
<p><strong>Update (07.07.10)</strong>:<br />
Thanks to a comment by David Atkins, I found out there is a more mature version of this function (called <strong>coefplot</strong>) inside the {arm} package. This version offers many features, one of which is the ability to easily stack several confidence intervals one on top of the other.</p>
<p>It works for bayesglm, glm, lm and polr objects, and a default method is available which takes pre-computed coefficients and associated standard errors from any suitable model.</p>
<p><strong>Example:</strong><br />
(Notice that the Poisson model in comparison with the binomial models does not make much sense, but is enough to illustrate the use of the function)</p>

<div class="wp_codebox"><table><tr id="p43525"><td class="code" id="p435code25"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;arm&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Mroz&quot;</span>, package <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;car&quot;</span><span style="color: #080;">&#41;</span>
M1</pre></td></tr></table></div>

<p>(hat tip goes to Allan Engelhardt for help improving the code, and to Achim Zeileis for extending and improving the narration for the example)</p>
<p><strong>Resulting plot </strong></p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/07/coeff-visualization-3.png"><img class="alignright size-full wp-image-471" title="coeff visualization 3" src="http://www.r-statistics.com/wp-content/uploads/2010/07/coeff-visualization-3.png" alt="" width="550" /></a></p>
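<p>The stacking of intervals mentioned above can be sketched roughly as follows. This is my own minimal illustration on the built-in mtcars data, not the example from the post; the arguments add, col.pts and offset are used as I understand them from the arm documentation, so do check ?coefplot for your installed version.</p>

<pre class="rsplus" style="font-family:monospace;">library(arm)

# Two fits with the same predictors, so that the rows of the plot line up
M1 &lt;- lm(mpg ~ wt + hp + qsec, data = mtcars)
M2 &lt;- lm(mpg ~ wt + hp + qsec, data = subset(mtcars, am == 0))

# Draw the first model, then overlay the second one, slightly offset
coefplot(M1)
coefplot(M2, add = TRUE, col.pts = "red", offset = 0.25)</pre>

<p>Keeping the same set of predictors in both models matters here, since the overlay is drawn by position.</p>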
<p>* * *<br />
Another method worth mentioning is the nomogram, implemented in Frank Harrell&#8217;s {<a href="http://biostat.mc.vanderbilt.edu/wiki/Main/Rrms">rms</a>} package.</p>
<p>* * *</p>
<p><strong>Update (04.01.12)</strong>:</p>
<p>The package {<a href="http://cran.r-project.org/web/packages/coefplot/" target="_self">coefplot</a>}, by Jared Lander, plots coefficients from lm and glm models as well as from models generated by RevoScaleR&#8217;s rxLinMod and rxLogit functions.  The package is built on top of ggplot2 graphics; you can see an example of its use <a href="http://blog.revolutionanalytics.com/2012/01/new-package-for-plotting-model-coefficients.html">here</a>.</p>
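<p>A minimal sketch of how this might look, assuming the coefplot package is installed (the model and data are my own illustrative choice, not taken from the linked example). The function is called with an explicit namespace because {arm} exports a function of the same name.</p>

<pre class="rsplus" style="font-family:monospace;"># Fit a simple linear model on a built-in data set
fit &lt;- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Build the coefficient plot; the explicit namespace avoids
# a clash with arm::coefplot if both packages are loaded
p &lt;- coefplot::coefplot(fit)
print(p)</pre>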
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/07/visualization-of-regression-coefficients-in-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

