<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>R-statistics blog &#187; R</title>
	<atom:link href="http://www.r-statistics.com/tag/r/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.r-statistics.com</link>
	<description>Writing about statistics with R, and open source stuff (software, data, community)</description>
	<lastBuildDate>Thu, 29 Jul 2010 01:51:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Richard Stallman talk+Q&amp;A at the useR! 2010 conference (audio files attached)</title>
		<link>http://www.r-statistics.com/2010/07/richard-stallman-talkqa-at-the-user-2010-conference-audio-files-attached/</link>
		<comments>http://www.r-statistics.com/2010/07/richard-stallman-talkqa-at-the-user-2010-conference-audio-files-attached/#comments</comments>
		<pubDate>Mon, 26 Jul 2010 19:39:15 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R community]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[copyleft]]></category>
		<category><![CDATA[free doftware]]></category>
		<category><![CDATA[GPL]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[software as service]]></category>
		<category><![CDATA[useR]]></category>
		<category><![CDATA[useR 2010]]></category>
		<category><![CDATA[useR2010]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=483</guid>
		<description><![CDATA[The current hosting provider of the files couldn&#8217;t handle the work load. I am now moving the file to a different (hopefully more robust) hosting solution. Please come back in an hour or so to download the files. The files are online again! (The audio files of the full talk by Richard Stallman are attached to the end of this post.) &#8212;&#8212;&#8212;&#8212;&#8212;&#8211; Last week I had the honor of attending the talk given by Richard Stallman, the last keynote speaker [...]]]></description>
			<content:encoded><![CDATA[<p><del datetime="2010-07-27T10:32:41+00:00">The current hosting provider of the files couldn&#8217;t handle the work load.<br />
I am now moving the file to a different (hopefully more robust) hosting solution.<br />
Please come back in an hour or so to download the files.</del><br />
The files are online again!<br />
(<strong>The audio files of the full talk by Richard Stallman are attached to <u><a href="http://www.r-statistics.com/2010/07/richard-stallman-talkqa-at-the-user-2010-conference-audio-files-attached/#more-483">the end of this post.</a></u></strong>)</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;</p>
<p>Last week I had the honor of attending the talk given by <a href="http://en.wikipedia.org/wiki/Richard_Stallman">Richard Stallman</a>, the last keynote speaker at the <a href="http://user2010.org/">useR 2010</a> conference.  In this post I will give some brief context for the talk, and then provide the audio files along with a short description of what was said.</p>
<h3>Context for the talk</h3>
<p><span style="text-decoration: underline;"><strong>Richard Stallman </strong></span>can be viewed as (one of) the fathers of free software (free as in speech, not as in beer).</p>
<p>He is the man who led the <a href="http://www.gnu.org/">GNU project</a> for the creation of a free (as in speech, not as in beer) operating system, on the basis of which GNU-Linux, with its numerous distributions, was created.<br />
Richard also developed a number of pieces of widely used software, including the original Emacs, the GNU Compiler Collection, the GNU Debugger, and many tools in the GNU Coreutils.</p>
<p>Richard also initiated the free software movement; in October 1985 he founded its formal organization, the Free Software Foundation, and in 1989 he co-founded the League for Programming Freedom.</p>
<p>Stallman pioneered the concept of &#8220;copyleft&#8221; and he is the main author of several copyleft licenses including the GNU General Public License, the most widely used free software license.</p>
<p>You can read more about him in the Wikipedia article titled &#8220;<a href="http://en.wikipedia.org/wiki/Richard_Stallman">Richard Stallman</a>&#8221;.</p>
<p><span style="text-decoration: underline;"><strong>The useR 2010 conference</strong><strong> </strong></span>is this year&#8217;s edition of the annual four-day conference of the community of people using R.  <a href="http://www.r-project.org/">R</a> is free, open source software for data analysis and statistical computing (here is a bit more about <a href="http://www.r-statistics.com/2009/03/what-is-r/">what R is</a>).</p>
<p>The conference this year was truly a wonderful experience for me.  I  had the pleasure of giving two talks (about which I will blog later this month), listened to numerous talks on the use of R, and had a chance to meet many (<strong>many</strong>) kind and interesting people.</p>
<h3>Richard Stallman&#8217;s talk</h3>
<p>The talk took place on July 23rd 2010 at NIST in the U.S. and was the concluding talk of the useR2010 conference.  The talk consisted of a two-hour lecture followed by a half-hour question and answer session.</p>
<p>On a personal note, I was very impressed by Richard&#8217;s talk.  Richard is not a shy computer geek, but rather a serious leader and thinker trying to stir people to action.  His speech was a sermon on free software, the history of GNU-Linux, the various versions of the GPL, and his own history involving them.</p>
<p>I believe this talk would be of interest to anyone who cares about social solidarity, free software, programming and the hope of a better world for all of us.</p>
<p>I am eager for your thoughts in the comments (but please keep a kind tone).</p>
<p><strong><span style="text-decoration: underline;">Here is Richard Stallman&#8217;s (2-hour) talk:</span></strong></p>
<p><span id="more-483"></span><br />
<a href="http://www.r-statistics.com/wp-content/uploads/podcasts/Richard%20Stallman%20speach%20at%20useR2010%20-%20Talk.ogg"><strong>Audio file to download &#8211; Richard Stallman talk at the useR! 2010 conference</strong> (~2 hours)</a><br />
<audio src="http://www.r-statistics.com/wp-content/uploads/podcasts/Richard%20Stallman%20speach%20at%20useR2010%20-%20Talk.ogg"></audio></p>
<p><strong><span style="text-decoration: underline;">The second part of the talk</span></strong> consisted of Richard Stallman answering the following questions:</p>
<ul>
<li>What are your thoughts about <strong>data portability</strong>?</li>
<li>What are your thoughts about <strong>Facebook</strong>?</li>
<li>Isn&#8217;t it a problem that free software doesn&#8217;t create <strong>wealth</strong>?</li>
<li>What are your thoughts about <strong>innovation</strong>?</li>
<li>What are your thoughts about software as a service (a.k.a. <strong>cloud computing</strong>)?</li>
<li>How can we defend open source software from &#8220;<strong>hackers</strong>&#8221;?</li>
<li>What are your thoughts about <strong>Google</strong>&#8217;s products and services?</li>
<li>What are your thoughts about the legality/ethics of people switching from <strong>GPL to closed source</strong>?</li>
<li>How can a programmer be &#8220;<strong>compensated</strong>&#8221; for contributing to free &#8220;open source&#8221; software?</li>
<li>What are your thoughts about &#8220;free <strong>games</strong>&#8221;?</li>
<li>What are your thoughts about <strong>search</strong> results?</li>
<li>What are your thoughts about taxes and <strong>government</strong> responsibility for the use of free software?</li>
</ul>
<p><a href="http://www.r-statistics.com/wp-content/uploads/podcasts/Richard%20Stallman%20speach%20at%20useR2010%20-%20QA.ogg"><strong>Audio file to download &#8211; Richard Stallman talk at the useR! 2010 conference &#8211; Q&#038;A session</strong> (~25 minutes)</a></p>
<p><audio src="http://www.r-statistics.com/wp-content/uploads/podcasts/Richard%20Stallman%20speach%20at%20useR2010%20-%20QA.ogg"></audio></p>
<p>A final note: more talks from the useR2010 conference are expected to be put online <a href="http://www.vcasmo.com/user/drewconway">here</a>, thanks to <a href="http://www.drewconway.com/zia/?p=2221">Drew Conway</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/07/richard-stallman-talkqa-at-the-user-2010-conference-audio-files-attached/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
<enclosure url="http://dl.dropbox.com/u/5371432/WebSites/R-statistics.com/audio/Richard%20Stallman%20speach%20at%20useR2010%20-%20Talk.mp3" length="118416194" type="audio/mpeg" />
<enclosure url="http://dl.dropbox.com/u/5371432/WebSites/R-statistics.com/audio/Richard%20Stallman%20speach%20at%20useR2010%20-%20QA.mp3" length="24545906" type="audio/mpeg" />
		</item>
		<item>
		<title>Want to join the closed BETA of a new Statistical Analysis Q&amp;A site &#8211; NOW is the time!</title>
		<link>http://www.r-statistics.com/2010/07/want-to-join-the-closed-beta-of-a-new-statistical-analysis-qa-site-now-is-the-time/</link>
		<comments>http://www.r-statistics.com/2010/07/want-to-join-the-closed-beta-of-a-new-statistical-analysis-qa-site-now-is-the-time/#comments</comments>
		<pubDate>Fri, 16 Jul 2010 07:06:56 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R community]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[communites]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[online]]></category>
		<category><![CDATA[Q&A]]></category>
		<category><![CDATA[statistical analysis]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=474</guid>
		<description><![CDATA[The bottom line of this post is for you to go to: Stack Exchange Q&#038;A site proposal: Statistical Analysis And commit yourself to using the website for asking and answering questions. (And also consider giving the contender, MetaOptimize a visit) * * * * Statistical analysis Q&#038;A website is about to go into BETA A month ago I invited readers of this blog to commit to using a new Q&#038;A website for Data-Analysis (based on StackOverFlow engine), once it will [...]]]></description>
			<content:encoded><![CDATA[<p><strong>The bottom line of this post is for you to go to:<br />
<a href="http://area51.stackexchange.com/proposals/33/statistical-analysis?referrer=3OUOcMUJcOo1">Stack Exchange Q&#038;A site proposal: Statistical Analysis </a><br />
And commit yourself to using the website for asking and answering questions.</strong></p>
<p>(And also consider giving the contender, <a href="http://metaoptimize.com/qa">MetaOptimize</a> a visit)</p>
<p>* * * * </p>
<h3>Statistical analysis Q&#038;A website is about to go into BETA</h3>
<p>A month ago I <a href="http://www.r-statistics.com/2010/06/a-new-qa-website-for-data-analysis-based-on-stackoverflow-engine-is-waiting-for-you/">invited readers of this blog to commit to using a new Q&#038;A website for Data-Analysis</a> (based on the StackOverFlow engine) once it opens (the site was originally proposed by <a href="http://robjhyndman.com/researchtips/">Rob Hyndman</a>).<br />
And now, a month later, I am happy to write that <strong>over 500 people</strong> have shown interest in the website and chosen to commit themselves.  This means we have reached 100% completion of the website proposal process, and in the next few days we will move to the next step.</p>
<p>The next step is that the website will go into closed BETA for about a week.  If you want to be part of this &#8211; now is <a href="http://area51.stackexchange.com/proposals/33/statistical-analysis?referrer=3OUOcMUJcOo1">the time to join</a> (&lt;--- a call to action, people).<br />
Having taken part in closed BETAs of similar projects, I can attest that the enthusiasm of the people trying to answer questions in the BETA is very impressive, so I strongly recommend the experience.</p>
<p>If you miss the closed BETA by the time you see this post, then no worries - about a week or so after the website goes online, it will be open to the general public.</p>
<p>(p.s: thanks Romunov for pointing out to me that the BETA is about to open)</p>
<h3>p.s: MetaOptimize</h3>
<p>I would like to finish this post by mentioning <a href="http://metaoptimize.com/qa/">MetaOptimize</a>.   This is a Q&#038;A website whose community is more &#8220;machine learning&#8221; than &#8220;statistical&#8221;.  It started out a short while ago, and it already has <a href="http://metaoptimize.com/qa/users/">around 700 users</a> who have submitted ~160 questions with ~520 answers given.  From my experience on the site so far, I have enjoyed the high quality of the questions and answers.<br />
When I first came by the website, I feared that supporting it would split the R community of users between this website and the <a href="http://area51.stackexchange.com/proposals/33/statistical-analysis?referrer=3OUOcMUJcOo1">area 51 StackExchange website</a>.<br />
But after a lengthy discussion (<a href="http://www.r-statistics.com/2010/07/statistical-analysis-qa-website-did-stackoverflow-just-lose-it-to-metaoptimize-and-is-it-good-or-bad/">published recently as a post</a>) with the MetaOptimize founder, Joseph Turian, I came to have a more optimistic view of the competition between the two websites.  Where at first I was afraid, I am now <strong>hopeful</strong> that each of the two websites will manage to draw a slightly different community of people (who otherwise wouldn&#8217;t be present on the other website) &#8211; thus offering all of us a wider variety of knowledge to tap into.</p>
<p>See you there&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/07/want-to-join-the-closed-beta-of-a-new-statistical-analysis-qa-site-now-is-the-time/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Visualization of regression coefficients (in R)</title>
		<link>http://www.r-statistics.com/2010/07/visualization-of-regression-coefficients-in-r/</link>
		<comments>http://www.r-statistics.com/2010/07/visualization-of-regression-coefficients-in-r/#comments</comments>
		<pubDate>Fri, 02 Jul 2010 19:46:56 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[coefficients]]></category>
		<category><![CDATA[Coefficients Visualization]]></category>
		<category><![CDATA[graph]]></category>
		<category><![CDATA[plot]]></category>
		<category><![CDATA[regression]]></category>
		<category><![CDATA[regression plot]]></category>
		<category><![CDATA[regression Visualization]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=435</guid>
		<description><![CDATA[Update (07.07.10): The function in this post has a more mature version in the &#8220;arm&#8221; package. See at the end of this post for more details. * * * * Imagine you want to give a presentation or report of your latest findings running some sort of regression analysis. How would you do it? This was exactly the question Wincent Rong-gui HUANG has recently asked on the R mailing list. One person, Bernd Weiss, responded by linking to the chapter [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Update (07.07.10)</strong>: The function in this post has a more mature version in the &#8220;arm&#8221; package.  See the end of this post for more details.<br />
* * * *</p>
<p>Imagine you want to give a presentation or report of your latest findings running some sort of regression analysis.  How would you do it?</p>
<p>This was exactly the question Wincent Rong-gui HUANG recently asked <a href="http://r.789695.n4.nabble.com/Visualization-of-coefficients-tt2276010.html#none">on the R mailing list</a>.</p>
<p>One person, Bernd Weiss, responded by linking to the chapter &#8220;<a href="http://tables2graphs.com/doku.php?id=04_regression_coefficients">Plotting Regression Coefficients</a>&#8221; in an interesting online book (which I had never heard of before) called &#8220;<a href="http://tables2graphs.com/doku.php">Using Graphs Instead of Tables</a>&#8221; (I should add this link to the <a href="http://www.r-statistics.com/2009/10/free-statistics-e-books-for-download/">free statistics e-books list</a>&#8230;)</p>
<p>Later in the conversation, <a href="http://statmath.wu.ac.at/~zeileis/">Achim Zeileis</a> surprised us (well, me) by saying the following:</p>
<blockquote><p>I&#8217;ve thought about adding a plot() method for the coeftest() function in the <a href="http://cran.r-project.org/web/packages/lmtest/index.html">&#8220;lmtest&#8221; package</a>. Essentially, it relies on a coef() and a vcov() method being available &#8211; <strong>and that a central limit theorem holds</strong>. For releasing it as a general function in the package the code is still too raw, but maybe it&#8217;s useful for someone on the list. Hence,<strong> I&#8217;ve included it below</strong>.</p></blockquote>
<p>(I took the liberty of adding some <strong>bold</strong> emphasis to the text)</p>
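<p>To make the idea in the quote concrete, here is a minimal sketch (it is not Achim&#8217;s code). It uses only base R; the built-in <code>mtcars</code> data and the <code>lm()</code> fit are stand-ins chosen purely for illustration, and the intervals are normal-approximation (Wald) intervals built from <code>coef()</code> and <code>vcov()</code>, which is where the central limit theorem assumption comes in.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># Illustrative sketch only: a bare-bones coefficient plot from coef() and vcov()
fm  &lt;- lm(mpg ~ wt + hp, data = mtcars)   # stand-in model; any fit with coef() and vcov() methods
est &lt;- coef(fm)[-1]                        # point estimates, intercept dropped
se  &lt;- sqrt(diag(vcov(fm)))[-1]            # standard errors from the covariance matrix
z   &lt;- qnorm(0.975)                        # normal quantile for a 95% interval
lower &lt;- est - z * se
upper &lt;- est + z * se
idx &lt;- seq_along(est)
plot(est, idx, xlim = range(lower, upper, 0), pch = 19, yaxt = "n", xlab = "Estimate", ylab = "")
axis(2, at = idx, labels = names(est), las = 1)   # coefficient names on the y axis
segments(lower, idx, upper, idx)                  # one interval per coefficient
abline(v = 0, lty = 2)                            # reference line at zero</pre></div></div>

<p>Coefficients whose interval crosses the dashed line at zero are the ones that are not clearly distinguishable from zero at this level.</p>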
<p>So for the convenience of all of us, I uploaded Achim&#8217;s code in a file for easy access.  Here is an example of how to use it:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/07/coefplot.r.txt&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Mroz&quot;</span>, package <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;car&quot;</span><span style="color: #080;">&#41;</span>
fm <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">glm</span><span style="color: #080;">&#40;</span>lfp ~ ., <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">=</span> Mroz, <span style="color: #0000FF; font-weight: bold;">family</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">binomial</span><span style="color: #080;">&#41;</span>
coefplot<span style="color: #080;">&#40;</span>fm, parm <span style="color: #080;">=</span> <span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span></pre></div></div>

<p>Here is the resulting graph:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/07/regression-coefficient-plot.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/07/regression-coefficient-plot.png" alt="" title="regression coefficient plot" width="550" class="alignright size-full wp-image-437" /></a></p>
<p>I hope Achim will get around to improving the function so that he might think it worthy of joining his <a href="http://cran.r-project.org/web/packages/lmtest/index.html">&#8220;lmtest&#8221; package</a>.  I am glad he shared his code for the rest of us to have something to work with in the meantime <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>* * *</p>
<p><strong>Update (07.07.10)</strong>:<br />
Thanks to a comment by David Atkins, I found out there is a more mature version of this function (called <strong>coefplot</strong>) inside the {arm} package.  This version offers many features, one of which is the ability to easily stack several confidence intervals one on top of the other.</p>
<p>It works for bayesglm, glm, lm, and polr objects, and a default method is available which takes pre-computed coefficients and associated standard errors from any suitable model.</p>
<p><strong>Example:</strong><br />
(Notice that overlaying the probit model on the two logit models does not make much sense, because probit and logit coefficients are on different scales, but it is enough to illustrate the use of the function)</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;arm&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Mroz&quot;</span>, package <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;car&quot;</span><span style="color: #080;">&#41;</span>
M1<span style="color: #080;">&lt;-</span>      <span style="color: #0000FF; font-weight: bold;">glm</span><span style="color: #080;">&#40;</span>lfp ~ ., <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">=</span> Mroz, <span style="color: #0000FF; font-weight: bold;">family</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">binomial</span><span style="color: #080;">&#41;</span>
M2<span style="color: #080;">&lt;-</span> bayesglm<span style="color: #080;">&#40;</span>lfp ~ ., <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">=</span> Mroz, <span style="color: #0000FF; font-weight: bold;">family</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">binomial</span><span style="color: #080;">&#41;</span>
M3<span style="color: #080;">&lt;-</span>      <span style="color: #0000FF; font-weight: bold;">glm</span><span style="color: #080;">&#40;</span>lfp ~ ., <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">=</span> Mroz, <span style="color: #0000FF; font-weight: bold;">family</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">binomial</span><span style="color: #080;">&#40;</span>probit<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
coefplot<span style="color: #080;">&#40;</span>M2, xlim<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #080;">-</span><span style="color: #ff0000;">2</span>, <span style="color: #ff0000;">6</span><span style="color: #080;">&#41;</span>,            intercept<span style="color: #080;">=</span>TRUE<span style="color: #080;">&#41;</span>
coefplot<span style="color: #080;">&#40;</span>M1, add<span style="color: #080;">=</span>TRUE, col.<span style="">pts</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;red&quot;</span>,  intercept<span style="color: #080;">=</span>TRUE<span style="color: #080;">&#41;</span>
coefplot<span style="color: #080;">&#40;</span>M3, add<span style="color: #080;">=</span>TRUE, col.<span style="">pts</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;blue&quot;</span>, intercept<span style="color: #080;">=</span>TRUE, <span style="color: #0000FF; font-weight: bold;">offset</span><span style="color: #080;">=</span><span style="color: #ff0000;">0.2</span><span style="color: #080;">&#41;</span></pre></div></div>

<p>(hat tip goes to Allan Engelhardt for help improving the code, and to Achim Zeileis for extending and improving the narration for the example)</p>
<p><strong>Resulting plot </strong></p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/07/coeff-visualization-3.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/07/coeff-visualization-3.png" alt="" title="coeff visualization 3" width="550" class="alignright size-full wp-image-471" /></a></p>
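<p>The default method mentioned above can also be tried with coefficients you have computed yourself. The following is a rough sketch, assuming the <code>coefs</code>, <code>sds</code> and <code>varnames</code> arguments of <code>coefplot()</code> behave as described in the {arm} documentation; the <code>lm()</code> fit on the built-in <code>mtcars</code> data is only a stand-in for &#8220;any suitable model&#8221;.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># Illustrative sketch only: feeding pre-computed estimates to arm's coefplot()
library("arm")
fit &lt;- lm(mpg ~ wt + hp, data = mtcars)   # stand-in model, used just to get numbers
est &lt;- coef(fit)[-1]                       # pre-computed coefficients (intercept dropped)
se  &lt;- sqrt(diag(vcov(fit)))[-1]           # their standard errors
coefplot(est, se, varnames = names(est), main = "Pre-computed coefficients")</pre></div></div>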
<p>* * *<br />
Lastly,  another method worth mentioning is the nomogram, implemented in Frank Harrell&#8217;s <a href="http://biostat.mc.vanderbilt.edu/wiki/Main/Rrms">rms package</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/07/visualization-of-regression-coefficients-in-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A new Q&amp;A website for Data-Analysis (based on StackOverFlow engine) &#8211; is waiting for you</title>
		<link>http://www.r-statistics.com/2010/06/a-new-qa-website-for-data-analysis-based-on-stackoverflow-engine-is-waiting-for-you/</link>
		<comments>http://www.r-statistics.com/2010/06/a-new-qa-website-for-data-analysis-based-on-stackoverflow-engine-is-waiting-for-you/#comments</comments>
		<pubDate>Thu, 17 Jun 2010 13:29:55 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R community]]></category>
		<category><![CDATA[Q&A]]></category>
		<category><![CDATA[R comunity]]></category>
		<category><![CDATA[R Q&A]]></category>
		<category><![CDATA[stackoverflow]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=415</guid>
		<description><![CDATA[The bottom line of this post is for you to go to: Stack Exchange Q&#38;A site proposal: Statistical Analysis And commit yourself to using the website for asking and answering questions. 144 people have already committed to using the website; we need 356 more&#8230; If you are looking for the reasons to do so &#8211; read on&#8230; What is the StackOverFlow Q&#38;A website about? StackOverFlow.com (&#8220;SO&#8221; for short) is a programming Q &#38; A site that&#8217;s free. Free to ask questions, [...]]]></description>
			<content:encoded><![CDATA[<p><strong>The bottom line of this post is for you to go to:<br />
<a href="http://bit.ly/aDuRKV">Stack Exchange Q&amp;A site proposal: Statistical Analysis </a><br />
And commit yourself to using the website for asking and answering questions. </strong>144 people have already committed to using the website; we need 356 more&#8230; <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /><br />
If you are looking for the reasons to do so &#8211; read on&#8230;</p>
<h3>What is the StackOverFlow Q&amp;A website about?</h3>
<p><a href="http://StackOverFlow.com">StackOverFlow.com</a> (&#8220;SO&#8221; for short) is a programming Q &amp; A site that&#8217;s free. Free to ask questions, free to answer questions, free to read. Free, and fast.</p>
<p>For the R community, SO offers a growing database of <a href="http://stackoverflow.com/questions/tagged/R">R-related questions and answers</a> (click the link to check them out).</p>
<p>You might be asking yourself what&#8217;s so special about SO compared with other available resources such as <a href="http://www.r-project.org/mail.html">R mailing lists</a>, <a href="http://www.r-bloggers.com/">R blogs</a>, the <a href="http://rwiki.sciviews.org/doku.php">R wiki</a> and so on?<br />
That is a great question.<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/venn-diagram.png"><img class="alignnone size-full wp-image-416" title="venn-diagram" src="http://www.r-statistics.com/wp-content/uploads/2010/06/venn-diagram.png" alt="" width="440" height="431" /></a><br />
The answer is that SO succeeds in doing a great job synthesizing aspects of Wikis, Blogs, Forums, and Digg/Reddit to offer a very powerful Q&amp;A website.</p>
<p>In SO, the new questions are like forum/blog posts (A main text with comments/answers).  After someone answers a question, other users can give a thumb-up or a thumb-down to the answer (like digg/reddit).  And all content can be edited, like a wiki page, by the users (provided the user has enough &#8220;karma points&#8221;).<br />
You also get badges (&#8220;awards&#8221;) for a bunch of actions (like coming to the website every day for a month, giving an answer that got X thumbs-up, and so on).  The awards allow someone who is asking a question to see how good a reputation the person answering has (in terms of acceptance/appreciation of their answers by other SO members).<br />
It also offers a small (but effective) ego-boost for the person who gives answers.</p>
<h3>So if StackOverFlow is so great &#8211; what is this new website you wrote about in the title?</h3>
<p>Well, StackOverFlow has one limitation.  It deals ONLY with programming questions.  Other questions like:</p>
<ul>
<li>Which of the following three graphics best displays this data set? Why?</li>
<li>Can you give an example of where I might prefer to use a z-test vs a t-test?</li>
<li>What is the relationship between Bayesian and neural networks?</li>
</ul>
<p>will not be answered, and the threads will get closed as being &#8220;off topic&#8221;.  Why? Because such questions deal with statistics, data analysis, data mining, and data visualization &#8211; but by no means with programming.</p>
<p>So there is no StackOverFlow-like Q&amp;A website for data analysis&#8230; Until now!</p>
<p>In the past few weeks, <a href="http://area51.stackexchange.com/users/14/rob-hyndman">Rob Hyndman</a> and other users have put a lot of effort into pushing for the creation of a new website, based on the StackOverFlow engine, for statistics-related Q&amp;A.<br />
His proposal for a new website is almost complete.  All it needs is for you (yes, you) to go to the following link:<br />
<a href="http://bit.ly/aDuRKV">Stack Exchange Q&amp;A site proposal: Statistical Analysis </a><br />
And commit yourself to the website (that is, click the button called &#8220;commit&#8221; &#8211; to declare that you are interested in reading, asking and answering questions on such a website).</p>
<p>Once <del datetime="2010-06-18T04:54:51+00:00">few more tens</del> 379 more people commit &#8211; the website will go online!</p>
<p>Hope to see you there.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/06/a-new-qa-website-for-data-analysis-based-on-stackoverflow-engine-is-waiting-for-you/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>Clustergram: visualization and diagnostics for cluster analysis (R code)</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/</link>
		<comments>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/#comments</comments>
		<pubDate>Tue, 15 Jun 2010 08:22:34 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[base graphics]]></category>
		<category><![CDATA[cluster analysis]]></category>
		<category><![CDATA[clustergram]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[Dendrogram]]></category>
		<category><![CDATA[diagnose]]></category>
		<category><![CDATA[diagnosing]]></category>
		<category><![CDATA[diagnostics]]></category>
		<category><![CDATA[functions]]></category>
		<category><![CDATA[ggplot]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[hierarchical clustering]]></category>
		<category><![CDATA[iris]]></category>
		<category><![CDATA[iris data set]]></category>
		<category><![CDATA[large data]]></category>
		<category><![CDATA[matlines]]></category>
		<category><![CDATA[non-hierarchical]]></category>
		<category><![CDATA[parallel coordinates]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[R functions]]></category>
		<category><![CDATA[tree]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=391</guid>
		<description><![CDATA[About Clustergrams In 2002, Matthias Schonlau published in &#8220;The Stata Journal&#8221; an article named &#8220;The Clustergram: A graph for visualizing hierarchical and nonhierarchical cluster analyses&#8221;. As explained in the abstract: In hierarchical cluster analysis dendrogram graphs are used to visualize how clusters are formed. I propose an alternative graph named “clustergram” to examine how cluster members are assigned to clusters as the number of clusters increases. This graph is useful in exploratory analysis for non-hierarchical clustering algorithms like k-means and for hierarchical [...]]]></description>
			<content:encoded><![CDATA[<h3>About Clustergrams</h3>
<p>In 2002, <a href="http://www.schonlau.net/clustergram.html">Matthias Schonlau</a> published in &#8220;The Stata Journal&#8221; an article named &#8220;<a href="https://docs.google.com/viewer?url=http://www.schonlau.net/publication/02stata_clustergram.pdf">The Clustergram: A graph for visualizing hierarchical and nonhierarchical cluster analyses</a>&#8221;.  As explained in the abstract:</p>
<blockquote><p>In hierarchical cluster analysis dendrogram graphs are used to visualize how clusters are formed. I propose an alternative graph named “clustergram” to examine how cluster members are assigned to clusters as the number of clusters increases.<br />
This graph is useful in exploratory analysis for non-hierarchical clustering algorithms like k-means and for hierarchical cluster algorithms when the number of observations is large enough to make dendrograms impractical.</p></blockquote>
<p>A <a href="https://docs.google.com/viewer?url=http://www.schonlau.net/publication/04compstat_clustergram.pdf">similar article</a> was later written and was (maybe) published in &#8220;Computational Statistics&#8221;.</p>
<p>Both articles give some nice background on known methods like k-means and methods for hierarchical clustering, and then go on to present examples of using these methods (with the clustergram) to analyse some datasets.</p>
<p>Personally, I understand the clustergram to be a type of parallel coordinates plot where each observation is given a vector.  The vector contains the observation&#8217;s location according to how many clusters the dataset was split into.  The scale of the vector is the scale of the first principal component of the data. </p>
<h3>Clustergram in R (a basic function)</h3>
<p>After finding out about this method of visualization, I was haunted by the curiosity to play with it a bit.  Therefore, and since I didn&#8217;t find any implementation of the graph in R, I went about writing the code to implement it.</p>
<p>The code only works for kmeans, but it shows how such a plot can be produced, and it could later be modified to offer methods that connect with different clustering algorithms.</p>
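<p>As a rough sketch of what such an extension might look like, here is a hypothetical clustering function built on hierarchical clustering.  The name clustergram.hclust, the Euclidean distances and hclust&#8217;s default linkage are all assumptions made for illustration; none of them is part of the original code.  It returns the same two elements (cluster and centers) that the kmeans version in this post returns:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># Hypothetical alternative to clustergram.kmeans (illustration only)
clustergram.hclust &lt;- function(Data, k, ...)
{
	# cut a hierarchical tree (built on Euclidean distances) into k groups
	cluster &lt;- cutree(hclust(dist(Data), ...), k = k)

	# mean of every column within each cluster: a k x p matrix of centers (for k of 2 or more)
	centers.matrix &lt;- apply(Data, 2, function(column) tapply(column, cluster, mean))

	# project each center on the first PCA loading: 1 number per center
	centers &lt;- centers.matrix %*% princomp(Data)$loadings[,1]

	return(list(cluster = cluster, centers = centers))
}</pre></div></div>

<p>Assuming the functions from this post are loaded, it could then be passed along with something like clustergram(Data, k.range = 2:8, clustering.function = clustergram.hclust).</p>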
<p>The function I present here takes a numeric matrix with a row for each observation and a column for each variable (a data.frame should first be converted, for example with scale() or as.matrix(), since the code uses matrix multiplication).<br />
The function assumes the data is scaled.<br />
It then calculates the cluster centers for our data, for a varying number of clusters.<br />
For each number of clusters, the cluster centers are multiplied by the first loading of the principal components of the original data, giving a weighted mean of each cluster center&#8217;s dimensions that might give a decent representation of that cluster (this method has the known limitations of using the first component of a PCA for dimensionality reduction, but I won&#8217;t go into that in this post).<br />
Finally, all of our data points are placed according to the first-component value of their respective cluster, and plotted against the number of clusters (thus creating the clustergram).</p>
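<p>To make these steps concrete, here is a minimal sketch of them for a single number of clusters (k = 3) on the scaled iris data.  The variable names first.loading, centers.1d and y are made up for this illustration, and the seed is arbitrary:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">set.seed(250)
Data &lt;- scale(iris[,-5])                        # scaled data, one row per observation
first.loading &lt;- princomp(Data)$loadings[,1]    # weights of the first principal component

cl &lt;- kmeans(Data, 3)                           # cluster centers for k = 3
centers.1d &lt;- cl$centers %*% first.loading      # one number per cluster center
y &lt;- centers.1d[cl$cluster]                     # height of each observation's line at k = 3</pre></div></div>

<p>Repeating this for every k and joining each observation&#8217;s values with a line gives the clustergram.</p>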
<p>My thanks go to <a href="http://had.co.nz/">Hadley Wickham</a> for offering some good tips on how to prepare the graph.</p>
<p>Here is the code (an example follows):</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">&nbsp;
&nbsp;
clustergram.<span style="">kmeans</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>Data, k, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;"># this is the type of function that the clustergram</span>
	<span style="color: #228B22;"># 	function takes for the clustering.</span>
	<span style="color: #228B22;"># 	using similar structure will allow implementation of different clustering algorithms</span>
&nbsp;
	<span style="color: #228B22;">#	It returns a list with two elements:</span>
	<span style="color: #228B22;">#	cluster = a vector of length of n (the number of subjects/items)</span>
	<span style="color: #228B22;">#				indicating to which cluster each item belongs.</span>
	<span style="color: #228B22;">#	centers = a k dimensional vector.  Each element is 1 number that represents that cluster</span>
	<span style="color: #228B22;">#				In our case, we are using the weighted mean of the cluster dimensions by </span>
	<span style="color: #228B22;">#				Using the first component (loading) of the PCA of the Data.</span>
&nbsp;
	cl <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">kmeans</span><span style="color: #080;">&#40;</span>Data, k,...<span style="color: #080;">&#41;</span>
&nbsp;
	cluster <span style="color: #080;">&lt;-</span> cl$cluster
	centers <span style="color: #080;">&lt;-</span> cl$centers <span style="color: #080;">%*%</span> <span style="color: #0000FF; font-weight: bold;">princomp</span><span style="color: #080;">&#40;</span>Data<span style="color: #080;">&#41;</span>$loadings<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># 1 number per center</span>
												<span style="color: #228B22;"># here we are using the weighted mean for each</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span>
				cluster <span style="color: #080;">=</span> cluster,
				centers <span style="color: #080;">=</span> centers
			<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>		
&nbsp;
clustergram.<span style="">plot</span>.<span style="">matlines</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>X,Y, k.<span style="">range</span>, 
											x.<span style="">range</span>, y.<span style="">range</span> , COL, 
											add.<span style="">center</span>.<span style="">points</span> , centers.<span style="">points</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;white&quot;</span>, xlim <span style="color: #080;">=</span> x.<span style="">range</span>, ylim <span style="color: #080;">=</span> y.<span style="">range</span>,
			axes <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">F</span>,
			xlab <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;Number of clusters (k)&quot;</span>, ylab <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;PCA weighted Mean of the clusters&quot;</span>, main <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;Clustergram of the PCA-weighted Mean of the clusters k-mean clusters vs number of clusters (k)&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">axis</span><span style="color: #080;">&#40;</span>side <span style="color: #080;">=</span><span style="color: #ff0000;">1</span>, at <span style="color: #080;">=</span> k.<span style="">range</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">axis</span><span style="color: #080;">&#40;</span>side <span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">abline</span><span style="color: #080;">&#40;</span>v <span style="color: #080;">=</span> k.<span style="">range</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;grey&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
		<span style="color: #0000FF; font-weight: bold;">matlines</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">t</span><span style="color: #080;">&#40;</span>X<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">t</span><span style="color: #080;">&#40;</span>Y<span style="color: #080;">&#41;</span>, pch <span style="color: #080;">=</span> <span style="color: #ff0000;">19</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> COL, lty <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span>, lwd <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span><span style="color: #080;">&#41;</span>
&nbsp;
		<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>add.<span style="">center</span>.<span style="">points</span><span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#123;</span>
			<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>plyr<span style="color: #080;">&#41;</span>
&nbsp;
			xx <span style="color: #080;">&lt;-</span> ldply<span style="color: #080;">&#40;</span>centers.<span style="">points</span>, <span style="color: #0000FF; font-weight: bold;">rbind</span><span style="color: #080;">&#41;</span>
			<span style="color: #0000FF; font-weight: bold;">points</span><span style="color: #080;">&#40;</span>xx$y~xx$x, pch <span style="color: #080;">=</span> <span style="color: #ff0000;">19</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;red&quot;</span>, cex <span style="color: #080;">=</span> <span style="color: #ff0000;">1.3</span><span style="color: #080;">&#41;</span>
&nbsp;
			<span style="color: #228B22;"># add points	</span>
			<span style="color: #228B22;"># temp &lt;- l_ply(centers.points, function(xx) {</span>
									<span style="color: #228B22;"># with(xx,points(y~x, pch = 19, col = &quot;red&quot;, cex = 1.3))</span>
									<span style="color: #228B22;"># points(xx$y~xx$x, pch = 19, col = &quot;red&quot;, cex = 1.3)</span>
									<span style="color: #228B22;"># return(1)</span>
									<span style="color: #228B22;"># })</span>
						<span style="color: #228B22;"># We assign the lapply to a variable (temp) only to suppress the lapply &quot;NULL&quot; output</span>
		<span style="color: #080;">&#125;</span>	
	<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
clustergram <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">10</span> , 
							clustering.<span style="">function</span> <span style="color: #080;">=</span> clustergram.<span style="">kmeans</span>,
							clustergram.<span style="">plot</span> <span style="color: #080;">=</span> clustergram.<span style="">plot</span>.<span style="">matlines</span>, 
							line.<span style="">width</span> <span style="color: #080;">=</span> .004, add.<span style="">center</span>.<span style="">points</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;"># Data - should be a scaled matrix, where each column belongs to a different dimension of the observations</span>
	<span style="color: #228B22;"># k.range - is a vector with the number of clusters to plot the clustergram for</span>
	<span style="color: #228B22;"># clustering.function - the function that does the clustering (clustergram.kmeans by default); it offers a basis to later extend the function to other algorithms </span>
	<span style="color: #228B22;">#			although that would require more work on the code</span>
	<span style="color: #228B22;"># line.width - is the amount to lift each line in the plot so they won't superimpose each other</span>
	<span style="color: #228B22;"># add.center.points - whether to plot points at the cluster means</span>
&nbsp;
	n <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">dim</span><span style="color: #080;">&#40;</span>Data<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
&nbsp;
	PCA.1 <span style="color: #080;">&lt;-</span> Data <span style="color: #080;">%*%</span> <span style="color: #0000FF; font-weight: bold;">princomp</span><span style="color: #080;">&#40;</span>Data<span style="color: #080;">&#41;</span>$loadings<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># first principal component of our data</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>colorspace<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
			COL <span style="color: #080;">&lt;-</span> heat_hcl<span style="color: #080;">&#40;</span>n<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span>PCA.1<span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># line colors</span>
		<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
			COL <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rainbow</span><span style="color: #080;">&#40;</span>n<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span>PCA.1<span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># line colors</span>
			<span style="color: #0000FF; font-weight: bold;">warning</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'Please consider installing the package &quot;colorspace&quot; for prettier colors'</span><span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#125;</span>
&nbsp;
	line.<span style="">width</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span>line.<span style="">width</span>, n<span style="color: #080;">&#41;</span>
&nbsp;
	Y <span style="color: #080;">&lt;-</span> NULL	<span style="color: #228B22;"># Y matrix</span>
	X <span style="color: #080;">&lt;-</span> NULL	<span style="color: #228B22;"># X matrix</span>
&nbsp;
	centers.<span style="">points</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>k <span style="color: #0000FF; font-weight: bold;">in</span> k.<span style="">range</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		k.<span style="">clusters</span> <span style="color: #080;">&lt;-</span> clustering.<span style="">function</span><span style="color: #080;">&#40;</span>Data, k<span style="color: #080;">&#41;</span>
&nbsp;
		clusters.<span style="">vec</span> <span style="color: #080;">&lt;-</span> k.<span style="">clusters</span>$cluster
			<span style="color: #228B22;"># the.centers &lt;- apply(cl$centers,1, mean)</span>
		the.<span style="">centers</span> <span style="color: #080;">&lt;-</span> k.<span style="">clusters</span>$centers 
&nbsp;
		noise <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">unlist</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">tapply</span><span style="color: #080;">&#40;</span>line.<span style="">width</span>, clusters.<span style="">vec</span>, <span style="color: #0000FF; font-weight: bold;">cumsum</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">seq_along</span><span style="color: #080;">&#40;</span>clusters.<span style="">vec</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span>clusters.<span style="">vec</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>	
		<span style="color: #228B22;"># noise &lt;- noise - mean(range(noise))</span>
		y <span style="color: #080;">&lt;-</span> the.<span style="">centers</span><span style="color: #080;">&#91;</span>clusters.<span style="">vec</span><span style="color: #080;">&#93;</span> <span style="color: #080;">+</span> noise
		Y <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span>Y, y<span style="color: #080;">&#41;</span>
		x <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span>k, <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
		X <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span>X, x<span style="color: #080;">&#41;</span>
&nbsp;
		centers.<span style="">points</span><span style="color: #080;">&#91;</span><span style="color: #080;">&#91;</span>k<span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>y <span style="color: #080;">=</span> the.<span style="">centers</span> , x <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span>k , k<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>	
	<span style="color: #228B22;">#	points(the.centers ~ rep(k , k), pch = 19, col = &quot;red&quot;, cex = 1.5)</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
	x.<span style="">range</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">range</span><span style="color: #080;">&#40;</span>k.<span style="">range</span><span style="color: #080;">&#41;</span>
	y.<span style="">range</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">range</span><span style="color: #080;">&#40;</span>PCA.1<span style="color: #080;">&#41;</span>
&nbsp;
	clustergram.<span style="">plot</span><span style="color: #080;">&#40;</span>X,Y, k.<span style="">range</span>, 
											x.<span style="">range</span>, y.<span style="">range</span> , COL, 
											add.<span style="">center</span>.<span style="">points</span> , centers.<span style="">points</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
<span style="color: #080;">&#125;</span></pre></div></div>
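<p>The line that computes noise in the code above is rather dense, so here is a small stand-alone sketch of the same idea on 6 made-up observations (the cluster labels below are hypothetical, and the intermediate variable names are mine).  Each observation gets a cumulative offset within its own cluster, so that lines passing through the same cluster center are stacked instead of drawn on top of each other:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">line.width &lt;- rep(0.004, 6)
clusters.vec &lt;- c(2, 1, 2, 1, 1, 2)	# hypothetical cluster labels for 6 observations

# cumulative offsets within each cluster, listed cluster by cluster
offsets.by.cluster &lt;- unlist(tapply(line.width, clusters.vec, cumsum))

# positions of the observations after sorting them by cluster
sorted.positions &lt;- seq_along(clusters.vec)[order(clusters.vec)]

# put the offsets back in the original order of the observations
noise &lt;- offsets.by.cluster[order(sorted.positions)]
unname(noise)
# 0.004 0.004 0.008 0.008 0.012 0.012</pre></div></div>

<p>The first member of each cluster is lifted by one line.width, the second by two, and so on.</p>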

<h3>Example on the iris dataset</h3>
<p>The <a href="http://en.wikipedia.org/wiki/Iris_flower_data_set">iris data set</a> is a favorite example of many <a href="http://www.r-bloggers.com/?s=iris">R bloggers</a> when writing about <a href="http://opendatagroup.com/2009/10/21/r-accessors-explained/">R accessors</a>, <a href="http://learnr.wordpress.com/2009/10/06/export-data-frames-to-multi-worksheet-excel-file/">data exporting</a>, <a href="http://yihui.name/en/2009/09/how-to-import-ms-excel-data-into-r/">data importing</a>, and for <a href="http://weitaiyun.blogspot.com/2009/03/unison-graph-and-parallel-coordinate.html">different</a> <a href="http://weitaiyun.blogspot.com/2009/03/scatterplots.html">visualization</a> techniques.<br />
So it seemed only natural to experiment on it here.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">iris</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">250</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>cex.<span style="">lab</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span>, cex.<span style="">main</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.2</span><span style="color: #080;">&#41;</span>
Data <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">scale</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">iris</span><span style="color: #080;">&#91;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice I am scaling the vectors</span>
clustergram<span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">8</span>, line.<span style="">width</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.004</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice how I am using line.width.  Play with it on your problem, according to the scale of Y.</span></pre></div></div>

<p>Here is the output:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-1.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-1.png" alt="" title="clustergram 1" width="500"></a></p>
<p>Looking at the image we can notice a few interesting things.  We notice that one of the clusters formed (the lower one) stays as is no matter how many clusters we allow (except for one observation that goes away and then back).<br />
We can also see that the second split is a solid one (in the sense that it splits the first cluster into two clusters which are not &#8220;close&#8221; to each other, and that about half the observations go to each of the new clusters).<br />
And then notice how moving to 5 clusters makes almost no difference.<br />
Lastly, notice how when going for 8 clusters, we are practically left with 4 clusters (remember &#8211; this is according to the mean of the cluster centers weighted by the loading of the first component of the PCA on the data).</p>
<p>If I were to take something from this graph, I would say I have a strong tendency to use 3-4 clusters on this data.</p>
<p>But wait, did our clustering algorithm do a stable job?<br />
Let&#8217;s try running the algorithm 6 more times (each run will have a different starting point for the clusters)</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">500</span><span style="color: #080;">&#41;</span>
Data <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">scale</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">iris</span><span style="color: #080;">&#91;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice I am scaling the vectors</span>
<span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>cex.<span style="">lab</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.2</span>, cex.<span style="">main</span> <span style="color: #080;">=</span> .7<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>mfrow <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">3</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">6</span><span style="color: #080;">&#41;</span> clustergram<span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">8</span> , line.<span style="">width</span> <span style="color: #080;">=</span> .004, add.<span style="">center</span>.<span style="">points</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span></pre></div></div>

<p>The result (click the image to enlarge it):<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-6.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-6.png" alt="" title="clustergram 6" width="500"></a><br />
Repeating the analysis offers even more insights.<br />
First, it would appear that up to 3 clusters, the algorithm gives rather stable results.<br />
From 4 onwards we get a different outcome at each run.<br />
In some cases, we got 3 clusters when we asked for 4 or even 5 clusters.</p>
<p>Reviewing the new plots, I would prefer to go with the 3 clusters option, noting how the two &#8220;upper&#8221; clusters might have similar properties while the lower cluster is quite distinct from the other two.</p>
<p>By the way, the iris data set is composed of three species of flowers.  I imagine kmeans did a decent job in distinguishing the three.</p>
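<p>One quick way to check that impression is to cross-tabulate the clusters against the known species.  This is only a minimal sketch: the seed is arbitrary, the result can change from run to run, and the cluster numbers returned by kmeans carry no meaning of their own:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">set.seed(250)
Data &lt;- scale(iris[,-5])
cl &lt;- kmeans(Data, 3)

# rows are the kmeans clusters, columns are the three species
species.table &lt;- table(cluster = cl$cluster, species = iris$Species)
species.table</pre></div></div>

<p>A cluster whose row is concentrated in a single column corresponds closely to one species.</p>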
<h3>Limitation of the method (and a possible way to overcome it?!)</h3>
<p>It is worth noting that the current way the algorithm is built has a fundamental limitation:  The plot is good for detecting a situation where there are several clusters but each of them is clearly &#8220;bigger&#8221; than the one before it (on the first principal component of the data).</p>
<p>For example, let&#8217;s create a dataset with 3 clusters, each taken from a normal distribution with a higher mean than the one before it:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">250</span><span style="color: #080;">&#41;</span>
Data <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rbind</span><span style="color: #080;">&#40;</span>
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">2</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">2</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">2</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
				<span style="color: #080;">&#41;</span>				
clustergram<span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span> , line.<span style="">width</span> <span style="color: #080;">=</span> .004, add.<span style="">center</span>.<span style="">points</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span></pre></div></div>

<p>The resulting plot for this is the following:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/Clustergram-3-ordered-clusters.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/Clustergram-3-ordered-clusters.png" alt="" title="Clustergram-3-ordered-clusters" width="500" class="alignnone size-full wp-image-402" /></a><br />
The image shows a clear distinction between three ranks of clusters.  Looking at this image, I have no doubt that three is the correct number of clusters.</p>
<p>But what if the clusters were different but didn&#8217;t have an ordering to them?<br />
For example, look at the following 4-dimensional data:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">250</span><span style="color: #080;">&#41;</span>
Data <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rbind</span><span style="color: #080;">&#40;</span>
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
				<span style="color: #080;">&#41;</span>				
clustergram<span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">8</span> , line.<span style="">width</span> <span style="color: #080;">=</span> .004, add.<span style="">center</span>.<span style="">points</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span></pre></div></div>

<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/06/Clustergram-4-UNordered-clusters.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/Clustergram-4-UNordered-clusters.png" alt="" title="Clustergram-4-UNordered-clusters" width="500" class="alignnone size-full wp-image-403" /></a></p>
<p>In this situation, it is not clear from the location of the clusters on the Y axis that we are dealing with 4 clusters.<br />
What is interesting, though, is that as the number of clusters grows, we can notice 4 &#8220;strands&#8221; of data points moving more or less together (until we reach 4 clusters, at which point the clusters start breaking up).<br />
Another hope for handling this might be using the color of the lines in some way, but I haven&#8217;t yet figured out how.</p>
<h3>Clustergram with ggplot2</h3>
<p><a href="http://had.co.nz/">Hadley Wickham</a> has kindly played with recreating the clustergram using the ggplot2 engine.  You can see the result here:<br />
<a href="http://gist.github.com/439761">http://gist.github.com/439761</a><br />
And this is what he wrote about it in the comments:</p>
<blockquote><p>I’ve broken it down into three components:<br />
* run the clustering algorithm and get predictions (many_kmeans and all_hclust)<br />
* produce the data for the clustergram (clustergram)<br />
* plot it (plot.clustergram)<br />
I don’t think I have the logic behind the y-position adjustment quite right though.</p></blockquote>
<p>Here is an example of how it looks:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-ggplot2-1.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-ggplot2-1.png" alt="" title="clustergram-ggplot2-1" width="500" class="alignnone size-full wp-image-407" /></a></p>
<h3>Conclusions (some rules of thumb and questions for the future)</h3>
<p>At first glance, it would appear that the clustergram can be of use.  I can imagine using this graph to quickly run various clustering algorithms and then compare them to each other and review their stability (in the way I just demonstrated in the example above).</p>
<p>The three rules of thumb I have noticed by now are:</p>
<ol>
<li>Look at the location of the cluster points on the Y axis. See when they remain stable, when they start flying around, and what happens to them at higher numbers of clusters (do they regroup?)</li>
<li>Observe the strands of the data points.  Even if the cluster centers are not ordered, the lines for each item might (this needs more research and thinking) tend to move together &#8211; hinting at the real number of clusters</li>
<li>Run the plot multiple times to observe the stability of the cluster formation (and location)</li>
</ol>
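<p>The third rule can also be checked numerically rather than only by eye. The following is a minimal sketch, not part of the clustergram code itself: it assumes plain <code>kmeans</code> and <code>prcomp</code> from base R, uses illustrative data (three clusters in three dimensions), and the name <code>center.positions</code> is made up for this example. It repeats k-means several times and records where the cluster centers fall along the first principal component:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">set.seed(250)
# Illustrative data (an assumption for this sketch): three clusters in 3 dimensions
Data &lt;- rbind(
	matrix(rnorm(300, mean = 0, sd = 0.3), ncol = 3),
	matrix(rnorm(300, mean = 1, sd = 0.3), ncol = 3),
	matrix(rnorm(300, mean = 2, sd = 0.3), ncol = 3)
	)

# Loadings of the first principal component, computed once so the sign is the same for every run
pc1 &lt;- prcomp(Data)$rotation[, 1]

# Run k-means five times with k = 3; each run starts from different random centers
center.positions &lt;- sapply(1:5, function(i) {
	fit &lt;- kmeans(Data, centers = 3)
	# Project the three centers onto the first component and sort them for easy comparison
	sort(as.vector(fit$centers %*% pc1))
})

# One column per run, one row per (sorted) cluster center
print(round(center.positions, 2))</pre></div></div>

<p>If the columns look alike, the cluster locations are stable across runs; columns that differ a lot suggest the kind of instability the rule is meant to catch.</p>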
<p>Yet there is more work to be done and questions to seek answers to:</p>
<ul>
<li>The code needs to be extended to offer methods for various clustering algorithms.
</li>
<li>How can the colors of the lines be used better?
</li>
<li>How can this be done using other graphical engines (ggplot2/lattice?) &#8211; (<strong>Update</strong>: look at Hadley&#8217;s reply in the comments)
</li>
<li>What to do in case the first principal component doesn&#8217;t capture enough of the data? (Maybe plot this graph for all the relevant components, but then &#8211; how do you draw conclusions from it?)
</li>
<li>What other uses/conclusions can be made based on this graph?
</li>
</ul>
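<p>On the question about the first principal component: one quick check is to look at how much of the total variance it carries. This is only a sketch under stated assumptions: it uses <code>prcomp</code> from base R with its defaults (columns centered, not rescaled), the helper <code>make.cluster</code> is hypothetical and not part of the clustergram code, and the data is built in the same way as the 4-dimensional example above.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># Hypothetical helper (an assumption for this sketch): n points per cluster,
# one column per dimension, each column centered on the matching entry of mu
make.cluster &lt;- function(mu, n = 100, sd = 0.3) {
	sapply(mu, function(m) rnorm(n, mean = m, sd = sd))
}

set.seed(250)
Data &lt;- rbind(
	make.cluster(c(1, 0, 0, 0)),
	make.cluster(c(0, 1, 0, 0)),
	make.cluster(c(0, 1, 1, 0)),
	make.cluster(c(0, 0, 0, 1))
	)

# prcomp() centers the columns by default and does not rescale them
pca &lt;- prcomp(Data)

# Proportion of the total variance carried by each principal component
var.explained &lt;- pca$sdev^2 / sum(pca$sdev^2)
print(round(var.explained, 3))</pre></div></div>

<p>When the first value is small, the Y axis of the clustergram summarizes only a small part of the data, which is a hint to read the plot with more caution.</p>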
<p>I am looking forward to reading your input/ideas in the comments (or in reply posts).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>Could we run a statistical analysis on iPhone/iPad using R?</title>
		<link>http://www.r-statistics.com/2010/06/could-we-run-a-statistical-analysis-on-iphoneipad-using-r/</link>
		<comments>http://www.r-statistics.com/2010/06/could-we-run-a-statistical-analysis-on-iphoneipad-using-r/#comments</comments>
		<pubDate>Tue, 08 Jun 2010 10:42:57 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[iPad]]></category>
		<category><![CDATA[iPhone]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=381</guid>
		<description><![CDATA[Important update to the post (17.07.10) I have now come across David Smith&#8217;s post on the REvolution blog, pointing to instructions on the R wiki for how to install R on the iPhone! I didn&#8217;t try it myself, both because it requires jailbreaking the iPhone and because I don&#8217;t have an iPhone. But it is still interesting to know of. Preface &#8211; I don&#8217;t use Mac I don&#8217;t use Mac! Not that there is anything wrong with that, but I don&#8217;t use [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/06/iPhone-R.jpg"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/iPhone-R.jpg" alt="" title="iPhone-R" width="416" height="489" class="alignnone size-full wp-image-383" /></a></p>
<h3>Important update to the post (17.07.10)</h3>
<p>I have now come across David Smith&#8217;s <a href="http://blog.revolutionanalytics.com/2010/04/r-on-the-iphone.html">post on the REvolution blog</a>, pointing to instructions on the R wiki for <a href="http://rwiki.sciviews.org/doku.php?id=getting-started:installation:iphone">how to install R on the iPhone</a>!<br />
I didn&#8217;t try it myself, both because it requires jailbreaking the iPhone and because I don&#8217;t have an iPhone.  But it is still interesting to know of.</p>
<h3>Preface &#8211; I don&#8217;t use Mac</h3>
<p>I don&#8217;t use Mac! <a href="http://www.youtube.com/watch?v=9ild8w0rHQU">Not that there is anything wrong with that</a>, but I don&#8217;t use Mac&#8230;</p>
<p>Yet at the same time, wonderful people like <a href="http://www.danceinisrael.com/about/">my wife</a>, my brother, my thesis advisor and even my mother-in-law &#8211; all use Mac.  So one can&#8217;t help but wonder if I might be missing out on something.</p>
<p>Still, for a Windows user like me it is a bit difficult to understand the hype around the iPhone 4 release:<br />
<object width="640" height="385"><param name="movie" value="http://www.youtube.com/v/RVIxXBKesvg&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/RVIxXBKesvg&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="640" height="385"></embed></object><br />
Such releases tend to look to me more like this spoof video about <a href="http://www.youtube.com/watch?v=Xgls9IwWUyU">the release of the apple &#8220;i&#8221;</a>.</p>
<p>So while I don&#8217;t use Apple&#8217;s products, I have deep respect for the impact they have made on people&#8217;s lives.  Which raises the question: could you use R on an iPhone (or an iPad)?</p>
<h3>Can R be run on iPhone/iPad ?</h3>
<p>This question (and the motivation for this post) was <a href="http://www.mail-archive.com/r-help@r-project.org/msg97811.html">raised in an R help mailing list thread a week ago</a>.</p>
<p>After receiving permission from the thread&#8217;s author, I am republishing the content that was presented there in the hope that it might be of interest to other R community members.</p>
<p><strong>And here is what &#8220;Marc Schwartz&#8221; wrote:</strong><br />
<span id="more-381"></span></p>
<h3>Marc Schwartz on R and iPhone/iPad </h3>
<p>Hi all,</p>
<p>There have been posts in the past about R being made available for the iPhone and perhaps more logically now, on the iPad. My recollection is that the hurdle discussed in the past was primarily a lack of access to a CLI on the iPhone&#8217;s variant of OSX, compelling the development of a GUI interface for R specifically for these devices. R itself, can be successfully compiled with the iPhone development tools.</p>
<p>Well, now there is another, clearly more profound reason.</p>
<p>The FSF has recently communicated with Apple on the presence of a GPL application (GNU Go) in the iTunes store because the iTunes TOS infringes upon the GPL. Apple, given a choice, elected to remove the application, rather than amending their TOS.</p>
<p>The FSF also informed the developers of the iPhone port of GNU Go that their distribution is in violation of the GPL. R Core and any others considering an iPhone/iPad port of R, if you are not already aware, take note&#8230;</p>
<p>More information is here:</p>
<p>http://www.fsf.org/news/2010-05-app-store-compliance/</p>
<p>with an update here:</p>
<p>http://www.fsf.org/news/blogs/licensing/more-about-the-app-store-gpl-enforcement</p>
<p>So, until Apple amends their TOS agreement, it looks like there will be no GPL apps available for the iPhone/iPad, since the only way to make applications available for these platforms is via the iTunes store (unless you unlock the device). Hence, no R for these devices in the foreseeable future.</p>
<h3>Marc Schwartz&#8217;s second message</h3>
<p>Hi all,</p>
<p>Thanks to an offlist e-mail from Thomas (Lumley), I have spent the past few days getting a bit more edumacated on additional restrictions on the opportunity for R to appear on the iPhone/iPad. These in fact go well beyond the GPL issue that I raised in my initial post, which in and of itself, is not trivial by any means. I now know, or at least think I know, more about these issues than I probably wanted, but I also want to present a better picture of the situation.</p>
<p>Note that I am not a lawyer and am not intending to represent my findings from a legal perspective. I am reporting them here using a common sense approach from my own reading of the relevant Apple iPhone SDK language, as well as being based upon specific examples and discussions that I located on the web.</p>
<p>So, in summary, here are the key issues. I will follow each with some additional details below, but note that all such restrictions would need to be removed or otherwise overcome, before R could in fact appear on these two platforms, at least through Apple approved means in the App Store.</p>
<p>1. Distribution of GPL covered applications is not permissible via the App Store due to the Apple Terms of Service language, which infringes upon rights granted under the GPL.</p>
<p>&#8216;Nuff said.</p>
<p>2. The use of FORTRAN is precluded (explicitly from SDK version 4.x forward)</p>
<p>Version 3.x of the iPhone SDK has the following language:</p>
<p>3.3.1   Applications may only use Documented APIs in the manner prescribed by Apple and must not use or call any private APIs.</p>
<p>Apple of course does not offer a FORTRAN compiler, as those who build R from source on OSX, as I do, are keenly aware. Thanks to Simon, we have one to use for OSX, but one is not officially available for the iPhone/iPad in the SDK.</p>
<p>Note that there is an important distinction here. The ability to build R versus the ability to have the resultant application pass Apple&#8217;s review to be able to appear in the App Store.</p>
<p>The above language has also been interpreted to apply to Java/Flash, precluding those environments from appearing on the iPhone/iPad.</p>
<p>However, the beta release of the 4.x version of the iPhone SDK has the following language in the same section:</p>
<p>3.3.1 — Applications may only use Documented APIs in the manner prescribed by Apple and must not use or call any private APIs. Applications must be originally written in Objective-C, C, C++, or JavaScript as executed by the iPhone OS WebKit engine, and only code written in C, C++, and Objective-C may compile and directly link against the Documented APIs (e.g., Applications that link to Documented APIs through an intermediary translation or compatibility layer or tool are prohibited).</p>
<p>Thus, in fact, one can only use Objective-C, C, C++ or JavaScript to develop applications for the iPhone/iPad. No FORTRAN, upon which of course, R is dependent. The above language has also been interpreted to further reinforce restrictions specifically on Java/Flash.</p>
<p>3. The implementation of programming language interpreters, of which R is one, is precluded.</p>
<p>The following language appears in version 3.x of the iPhone SDK:</p>
<p>3.3.2   An Application may not itself install or launch other executable code by any means, including without limitation through the use of a plug-in architecture, calling other frameworks, other APIs or otherwise. No interpreted code may be downloaded or used in an Application except for code that is interpreted and run by Apple&#8217;s Documented APIs and built-in interpreter(s).</p>
<p>The above language has been (pardon the pun) interpreted to restrict the implementation of programming language interpreters on these devices. Some sites I found have narrowly focused on the &#8220;No interpreted code may be downloaded&#8221; part of the language as an &#8220;out&#8221; of sorts. However, upon review, they seem to ignore the &#8220;or used&#8221; wording, which would seem to be independent of the action of downloading. If the language was &#8220;and used&#8221;, then one could envision the use of locally stored or entered code, but this is not the case. In either case, of course, R would not be one of Apple&#8217;s &#8220;built-in&#8221; interpreters.</p>
<p>I found two interesting examples of how Apple has either approved or rejected applications that can be considered interpreters. It is not clear where Apple drew the line between these two, such that it might enable one to better differentiate the reasoning and therefore design an app that would pass muster with them. Thus, one complication may be Apple&#8217;s lack of internal consistency in making decisions on allowing or disallowing apps in the App Store.</p>
<p>The first is an application which was intended to be a BASIC interpreter on the iPhone and which was apparently rejected by Apple under the name BasicMatrix:</p>
<p>http://smartcalc.coollittlethings.com/?p=3</p>
<p>After the author substantially restricted the functionality of the application to being an &#8220;enhanced calculator&#8221; under the name &#8220;SmartCalc&#8221;, Apple approved the application.</p>
<p>However, a contrarian example is &#8220;Frotz&#8221; which is a game type of an application currently available at no cost in the App Store for both the iPhone and the iPad. It is in fact an interpreter of so-called &#8220;Z code&#8221; (http://en.wikipedia.org/wiki/Z-machine) to create interactive fiction adventures. The web page for the app is:</p>
<p>http://code.google.com/p/iphonefrotz/wiki/FrotzMain</p>
<p>The app was initially accepted by Apple. A later version however was rejected, because the updated version could download new game files (Z code files) via the internet. Once the author removed the download functionality, Apple accepted the updated version of the app.</p>
<p>Frotz is also a GPL&#8217;d application, so its longevity in the App Store is logically in question and I sent an e-mail to the author to be sure that he was aware of the FSF&#8217;s recent actions.</p>
<p>The additional implications for R here, vis-a-vis Frotz, would be the ability to download, install and use CRAN packages. I will touch on this issue again below.</p>
<p>4. The implementation of anything resembling the CRAN network, to facilitate add-on packages for R, would be highly problematic for multiple reasons.</p>
<p>The following language appears in version 3.x of the iPhone SDK</p>
<p>3.3.3   Without Apple’s prior written approval, an Application may not provide, unlock or enable additional features or functionality through distribution mechanisms other than the App Store.</p>
<p>For those who have an iPhone/iPad and are familiar with so-called &#8220;In App&#8221; purchases, this refers to the mechanism approved by Apple to provide add-on functionality to already installed applications. The notion of In App purchases is somewhat misleading, as in fact the activity may involve no actual direct monetary cost to download the additional component(s).</p>
<p>Thus, arguably between the SDK language, which would preclude a CRAN type network unless approved by Apple, combined with the relevant language I reference from 3.3.2 above, there would be substantive restrictions on the ability of a &#8220;default&#8221; R installation from being able to download and utilize add-on R packages.</p>
<p>Further, since one cannot compile programs on the iPhone/iPad, there would have to be some means to pre-compile any add-on packages for R on these two platforms, similar to what is done for Windows and OSX presently on CRAN to create package binaries. Further, packages that use FORTRAN, tcl/tk, Java, Perl or other such libraries would of course, also be problematic, both from a functional and where appropriate, a GPL perspective.</p>
<p>Even if one got by some of those issues and pre-compiled add-on packages were to be made available via some distribution process, the entire process of R package installation, updating and management would have to be re-written specifically for these platforms. Thus, there would be a meaningful amount of development work that some group of folks would have to undertake.</p>
<p>That all being said, it is clear that a remote client/server implementation of R would be possible, as long as the client-side R GUI application on the iPhone/iPad meets Apple&#8217;s requirements. Anyone using the WolframAlpha app ($1.99 U.S.) on the iPhone or iPad (I do on the former) will understand this model. See http://products.wolframalpha.com/iphone/ and http://products.wolframalpha.com/ipad/index.html for more information. Somebody just has to be willing to fund the backend server farm to provide the service.  <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Needless to say, a fully browser based client/server implementation would also work and be cross-platform, provided that the input/output web pages are appropriately scaled for the mobile displays. WA also provides a free alternative to their dedicated apps via a mobile site at http://m.wolframalpha.com/ as an example. Similar considerations here, as above, for the backend server platform would be applicable. The key advantage of the dedicated app is a more efficient keyboard layout providing easier access to special character sets, rather than scrolling through them using the default keyboard.</p>
<p>I hope that the above is helpful to folks. Needless to say, I do not present the above as being the definitive reference, but it seems to be at least a logical interpretation of the current situation.</p>
<h3>Marc Schwartz&#8217;s reply to Ken Williams&#8217;s reply to Marc&#8217;s original post</h3>
<p>Ken,</p>
<p>See comments inline.</p>
<p>On Jun 1, 2010, at 2:25 PM, Ken Williams wrote:</p>
<p>> Hi Marc,<br />
><br />
> I want to debate a couple points from your post:<br />
><br />
>> 1. Distribution of GPL covered applications is not permissible via the App<br />
><br />
>> Store due to the Apple Terms of Service language, which infringes upon<br />
> rights<br />
>> granted under the GPL.<br />
>><br />
>> &#8216;Nuff said.<br />
><br />
> I&#8217;m not sure I agree with this, but there&#8217;s so much wiggle room in<br />
> interpreting what the GPL means that there would probably be no way to<br />
> decide without a courtroom &#038; judge, so I&#8217;ll leave this part alone. =)  I<br />
> also haven&#8217;t yet read your other post where you discuss this.</p>
<p>Please do, including the links therein. It&#8217;s not my interpretation, it is the FSF&#8217;s action and Apple&#8217;s response to that action, which sets at least an operational precedent, if not one that could also affect any future litigation pertaining to GPL&#8217;d apps in the App Store. That all just took place within the past week or so, which is what prompted my initial post on the matter, since it would be relevant to any R offering via that channel.</p>
<p>>> 3.3.1    Applications may only use Documented APIs in the manner<br />
> prescribed by<br />
>> Apple and must not use or call any private APIs.<br />
><br />
> I believe that language only refers to *Apple&#8217;s* APIs.  In other words, they<br />
> don&#8217;t want you to call hidden functions that aren&#8217;t supposed to be exposed<br />
> to developers.  If it meant no use of any APIs private to the developer, it<br />
> would rule out pretty much every application in existence, considering that<br />
> a call from one function to any other is an API call.</p>
<p>I am not in disagreement on that point. The key issue to date has been the lack of a compatible FORTRAN compiler, at least off the shelf, based upon what I can tell. Arguably, that is at least a notable deterrent to use FORTRAN on the iPhone for now.</p>
<p>There are other programming language tools for current iPhone development, but nothing for FORTRAN that I can find.</p>
<p>I can find no references to anyone building iPhone apps using FORTRAN (even in part) and the few queries that I can find that even mention an interest in doing so, reference the same issues that I have. Some have referenced f2c, but it is not clear to me that such an approach would work for R, not to mention the development overhead and the extensive testing of any conversion of critical functionality.</p>
<p>That all changes with the new language in 4.x.<br />
The key wording change relevant to R (and of course for other iPhone developers and tool providers) in 4.x is:</p>
<p>&#8220;Applications must be originally written in Objective-C, C, C++, or JavaScript as executed by the iPhone OS WebKit engine&#8221;</p>
<p>Ignore the other wording pertaining to API&#8217;s and other layers for the time being.</p>
<p>The app must be written natively in one of those four languages. There appears to be no interpretation that I can find that differentiates a scenario where a library of low level functions, written in a language such as FORTRAN, may be called from a higher level language such as those listed above.</p>
<p>Unless there is some subtlety in differentiating the abstraction layers within which the application is executed on the iPhone, I see no recourse here.</p>
<p>Note that the entities that provide iPhone cross-compilation/framework tools (eg. MonoTouch, Titanium, unity3D, Rhodes, etc.) which convert other code directly to native iPhone apps are also trying to figure out where they stand. Similarly, folks who develop natively in other languages are also having headaches over the new SDK wording.</p>
<p>There is even a question relevant to cross-compilation tools that take another language and convert it to, for example, Obj-C, as an intermediate step, before subsequent compilation to an iPhone native binary.</p>
<p>So the message that everyone is coming away with is, if you want to develop for the iPhone, write your code using one of these four languages, period. No doubt, some folks will test the boundaries and we will get more definitive answers in time.</p>
<p>Is it possible that the SDK language will change before 4.x is released as a stable OS/SDK? Sure, but that does not seem likely.</p>
<p>>> 3.3.2    An Application may not itself install or launch other executable<br />
> code<br />
>> by any means, including without limitation through the use of a plug-in<br />
>> architecture, calling other frameworks, other APIs or otherwise. No<br />
>> interpreted code may be downloaded or used in an Application except for<br />
> code<br />
>> that is interpreted and run by Apple&#8217;s Documented APIs and built-in<br />
>> interpreter(s).<br />
><br />
> I think this indeed pretty effectively rules out installation of packages<br />
> from CRAN, which is a bummer &#8211; unless those modules are downloaded &#038;<br />
> installed through the app store.  Not sure if that would even work though,<br />
> since they&#8217;re not apps.</p>
<p>As I note, between 3.3.2 and 3.3.3, any add-on functionality, such as CRAN packages, would be problematic any way you read it.</p>
<p>> As for the &#8220;interpreted code&#8221; stuff, there&#8217;s so much murkiness about what<br />
> constitutes interpreted code that I don&#8217;t know if this is a deal-breaker or<br />
> not.  At one extreme, it could prohibit pressing buttons in an app and then<br />
> &#8220;interpreting&#8221; those presses as commands for the app to &#8220;do something.&#8221;  At<br />
> the other extreme,  Somewhere in the middle, it would seem to cover language<br />
> translation apps.  The notion of &#8220;interpreted&#8221; is just not very<br />
> well-defined.  For instance, most people think of Perl as an interpreted<br />
> language, but it compiles to bytecode before executing just like Java (it<br />
> just doesn&#8217;t typically save it to a bytecode file).</p>
<p>I would say that, beyond the SDK language parsing issues relevant to interpreters, given that Apple rejected BasicMatrix and that there are no other programming language interpreters in the App Store, these are pretty good signs that R would not pass Apple&#8217;s review under these parameters. I take a fairly pragmatic approach there.</p>
<p>> Finally, I do agree with the general tone implied in your post &#8211; it is a<br />
> major major hassle that Apple&#8217;s overlords control the distribution channel<br />
> for software on non-jailbroken iDevices.  I don&#8217;t like it at all, for the<br />
> exact reason that people like you &#038; me &#038; the rest of the world now have to<br />
> sit around speculating whether our helpful apps will pass muster with the<br />
> cabal.</p>
<p>As I noted in my closing comments in my second post, if one has a desire to make R&#8217;s functionality available on smartphones (iPhone, Android, etc.) or iPad-class devices, then a client/server approach may be the most efficient means to do so. That approach also avails you of more powerful computing platforms than the client side mobile devices have, at least at present, which will also limit aspects of portable functionality.</p>
<p>Regards,</p>
<p>Marc</p>
<h3>And lastly, Gustaf Rydevik&#8217;s reply</h3>
<p>Indeed, the client/server approach is what is used in MatLab Mobile,<br />
which is now on sale in the app store.<br />
See</p>
<p>http://blogs.mathworks.com/desktop/2010/05/24/introducing-matlab-mobile-%E2%80%93-an-iphone-app-to-connect-remotely-to-your-matlab/</p>
<p>If matlab can do it, then surely the R community can as well.</p>
<p>Regards,<br />
Gustaf</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;</p>
<p>I hope the above will be interesting/useful to some of you in the future.<br />
Best,<br />
Tal</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/06/could-we-run-a-statistical-analysis-on-iphoneipad-using-r/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Helping the blind use R &#8211; by exporting R console to Word</title>
		<link>http://www.r-statistics.com/2010/05/helping-the-blind-use-r-by-exporting-r-console-to-word/</link>
		<comments>http://www.r-statistics.com/2010/05/helping-the-blind-use-r-by-exporting-r-console-to-word/#comments</comments>
		<pubDate>Sat, 22 May 2010 11:00:57 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[blind]]></category>
		<category><![CDATA[blind R]]></category>
		<category><![CDATA[JAWS]]></category>
		<category><![CDATA[R blind]]></category>
		<category><![CDATA[R2wd]]></category>
		<category><![CDATA[rcom]]></category>
		<category><![CDATA[sight-impaired]]></category>
		<category><![CDATA[sink()]]></category>
		<category><![CDATA[TeachingDemos]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=359</guid>
		<description><![CDATA[Preface &#8211; R seems a natural fit for the blind statistician For blind people who wish to do statistics, R can be ideal. R&#8217;s command-line interface offers straightforward statistical scripting in the form of a question (what is the mean of x) followed by an answer (0.2), instead of the point-and-click dialog boxes with jumping windows of results that GUI statistical systems offer. But there are still hurdles to clear before R can offer a perfect solution [...]]]></description>
			<content:encoded><![CDATA[<h3>Preface &#8211; R seems a natural fit for the blind statistician</h3>
<p>For blind people who wish to do statistics, R can be ideal.  R&#8217;s command-line interface offers straightforward statistical scripting in the form of a question (what is the mean of x) followed by an answer (0.2), instead of the point-and-click dialog boxes with jumping windows of results that GUI statistical systems offer.</p>
<p>But there are still hurdles to clear before R can offer a perfect solution for the blind.<br />
In this post I would like to address just one such problem &#8211; reading R console output.</p>
<h3>Directing R console output to Word &#8211; to allow blind people to easily navigate it</h3>
<p>Recently, a question was posed on the R-help mailing list by a guy named Faiz, a blind new user of R.  Faiz wants to direct R output into Word so that he can read it.  Here is what he wrote:</p>
<blockquote><p>I would like to read the results of the commands type in the terminal window in Microsoft Word. As a blind user my options are somewhat limited and are time consuming if I want to see the results of the commands that I have type earlier. for example if my first two commands were<br />
 x<-c(1,2,3,4,5)<br />
mean(x)<br />
and I have typed ten more commands after the first two commands it is not easy for me to see that what was the result of mean(x)<br />
but if I can somehow divert the results of the commands to Microsoft Word it is comparatively easy for me to see what was the result of mean(x) and what were the results of other commands. One another advantage of diverting R's output to Microsoft Word for me is that from there they can be easily copied into assignments as well.
</p></blockquote>
<p>Faiz later elaborated more on his issue:</p>
<blockquote><p>I am using Windows XP, and using a screen reader called JAWS. When I type something at the console, I hear once what I have typed, and then the focus is on the next line. Then if I press the up arrow key I get to hear the function I just typed, not its output. For example if I type mean(x) and then I press enter I will hear &#8220;[5]&#8221; if it is the mean of x. Then I will hear &#8220;>&#8221;. Now if I want to find out what was the mean of x by pressing the<br />
up arrow key, I will only hear mean(x) and I will not hear [5].<br />
My screen reader does provide options to use different cursors to read command lines.<br />
but if I have typed median(x) sd(x) var(x) length(x) after typing mean(x), it takes a long time before I can move my cursor to the location where I can hear the mean of x. If the results of the commands can be diverted to MS Word it becomes comparatively easy for me to quickly move forward and backward in the document.</p>
<p>Any ideas and suggestions are appreciated.
</p></blockquote>
<p>Since I recently reviewed how one could <a href="http://www.r-statistics.com/2010/05/exporting-r-output-to-ms-word-with-r2wd-an-example-session/">export R output to MS-Word with R2wd</a>, it was only fitting to try R2wd on this problem.<br />
I went looking for a way to direct the R console into a txt file, so I could later dump it into Word.  I found two commands that each gave me half of what I wanted: sink() directs R output to a txt file, and savehistory() saves the command history to a txt file.  But I needed something that combines the two and captures everything in the R console, commands and output alike, into one file.<br />
Failing to locate one, I turned to the R mailing list.  Among the kind people trying to help (thank you David Winsemius, Bert Gunter and Duncan Murdoch), Greg Snow came through with the answer (not surprisingly&#8230;).<br />
Greg directed me to a function he wrote called txtStart() (from the TeachingDemos package), which operates in a similar way to sink(), only it also captures the R commands that were used &#8211; exactly what I was looking for!</p>
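<p>To see why sink() alone is only half of the solution, here is a minimal sketch (my own illustration using base R only; the temporary file is just a stand-in for whatever txt file you would use).  It diverts the printed output to a file, but the command that produced it is not recorded:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># A temporary file stands in for the txt file (an assumption for this sketch)
out_file &lt;- tempfile(fileext = ".txt")
x &lt;- c(1, 2, 3, 4, 5)
sink(out_file)    # from here on, printed output goes to the file
print(mean(x))
sink()            # send output back to the console
captured &lt;- readLines(out_file)
captured          # "[1] 3" - the result is there, but not the mean(x) command itself</pre></div></div>

<p>This gap is what txtStart() fills, by recording the commands along with their output.</p>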
<p>Based on this, I devised two functions that can be used to redirect R output into Word.</p>
<p>Here is how to use them:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;"># Step 1: reading the functions needed for this task, from the file I uploaded to www.r-statistics.com</span>
<span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/05/R-console-to-word.r.txt&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># Example:</span>
<span style="color: #228B22;"># Step 2 - start capturing</span>
txtStart.2wd<span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># start capturing text.  If you are missing any packages - this function will prompt you to install them</span>
				<span style="color: #228B22;"># IF the installation fails - consider changing your mirror location</span>
<span style="color: #228B22;"># Step 3 - run R code</span>
	<span style="color: #0000FF; font-weight: bold;">date</span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>
	x <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">25</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">mean</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">var</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">Sys.<span style="">Date</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">Sys.<span style="">time</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># Step 4 - close connection - print output to Word</span>
txtStop.2wd<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># This stops capturing the output and writes it into a new Word file.</span>
				<span style="color: #228B22;"># If this is the first time in your session you are using this function, pass</span>
				<span style="color: #228B22;"># TRUE so that it opens a new document and connects to it.</span>
				<span style="color: #228B22;"># If the doc is already open, pass FALSE, as demonstrated below.</span>
&nbsp;
<span style="color: #228B22;"># Step 5, adding some more text to that doc file</span>
txtStart.2wd<span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># start capturing text.  </span>
<span style="color: #228B22;"># Code to run:</span>
<span style="color: #0000FF; font-weight: bold;">stem</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># stem offers a text alternative to a histogram </span>
txtStop.2wd<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">FALSE</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># This stops capturing the output.  Notice we pass FALSE to the function - since we already have an open doc file</span></pre></div></div>

<p>For me, this worked&#8230;</p>
<p>If you would like R to load the two functions (txtStart.2wd and txtStop.2wd) automatically at startup, run this once in your R console:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;"># Start of code</span>
Rprofile.<span style="">site</span>.<span style="">loc</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">R.<span style="">home</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\\</span>etc<span style="color: #000099; font-weight: bold;">\\</span>Rprofile.site&quot;</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'<span style="color: #000099; font-weight: bold;">\n</span>'</span>, <span style="color: #ff0000;">'source(&quot;http://www.r-statistics.com/wp-content/uploads/2010/05/R-console-to-word.r.txt&quot;)'</span>, <span style="color: #ff0000;">'<span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: #080;">&#41;</span> ,  <span style="color: #228B22;"># You could also put source() with a link to a local copy of the source code.</span>
<span style="color: #0000FF; font-weight: bold;">file</span> <span style="color: #080;">=</span> Rprofile.<span style="">site</span>.<span style="">loc</span>, <span style="color: #0000FF; font-weight: bold;">append</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># End of code</span></pre></div></div>

<h3>Bringing R to the blind: there is much more work ahead!</h3>
<p>Until this point, it hadn&#8217;t crossed my mind to ask how R can be used by the blind.  But once the question was raised, it brought with it many more questions.<br />
Can R be adjusted so that it is easily read by the known aids for sight-impaired people? (I am sure Linux users here will have much to add)<br />
Can people in the community think of writing functions that turn R output into text that is easier for the blind to read?<br />
For example &#8211; the summary() command is wonderful for me.  But I am trying to imagine what it would look like in the &#8220;eyes&#8221; of a person who can&#8217;t see.  Surely there could be some way to turn the wide summary format into a long format.<br />
Perhaps there is room for a more general approach to the question of how to help blind people use R.<br />
And is there a need?  How many blind people choose to pursue studying statistics (or disciplines for which they would need to know statistics/R)?<br />
I hope to read your thoughts on the matter.</p>
<p>On a personal note:  My father was on the verge of blindness prior to his cataract surgery.  I saw first hand what life can look like for the sight-impaired.  Giving people in that situation help is a great MITZVA (a.k.a. a &#8220;good deed&#8221;, in Hebrew).</p>
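<p>As one rough sketch of that wide-to-long idea (my own assumption about what might help; I have not tested it with a screen reader), the statistics returned by summary() can be printed one per line instead of side by side:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">x &lt;- c(1, 2, 3, 4, 5)
s &lt;- summary(x)    # wide format: all six statistics on one row
# Long format: one "name: value" pair per line (values rounded to 4 significant digits)
long_lines &lt;- paste(names(s), as.character(signif(as.numeric(s), 4)), sep = ": ")
cat(long_lines, sep = "\n")    # the first line printed is "Min.: 1"</pre></div></div>

<p>Each line can then be read on its own, without having to match a value to a column header several words away.</p>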
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/05/helping-the-blind-use-r-by-exporting-r-console-to-word/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Exporting R output to MS-Word with R2wd (an example session)</title>
		<link>http://www.r-statistics.com/2010/05/exporting-r-output-to-ms-word-with-r2wd-an-example-session/</link>
		<comments>http://www.r-statistics.com/2010/05/exporting-r-output-to-ms-word-with-r2wd-an-example-session/#comments</comments>
		<pubDate>Thu, 06 May 2010 15:20:05 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[R output]]></category>
		<category><![CDATA[R report]]></category>
		<category><![CDATA[report]]></category>
		<category><![CDATA[Word]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=336</guid>
		<description><![CDATA[Creating reports is one of the basic tasks in data analysis. R provides numerous functions and packages to export its (beautiful) output and help compile it into a report. In this post I will present one such (basic) solution for Windows OS users for exporting R output into Microsoft Word using the R2wd package. There are more ways and strategies for doing this, and if encouraged by comments, I will gladly write more on the subject. * * * R [...]]]></description>
			<content:encoded><![CDATA[<p>Creating reports is one of the basic tasks in data analysis.  R provides numerous functions and packages to export its (beautiful) output and help compile it into a report.</p>
<p>In this post I will present one such (basic) solution for Windows OS users for <strong>exporting R output into Microsoft Word</strong> using the <a href="http://cran.r-project.org/web/packages/R2wd/index.html">R2wd</a> package.  There are more ways and strategies for doing this, and if encouraged by comments, I will gladly write more on the subject.<br />
*  *  *</p>
<h3>R to Word using {R2wd}</h3>
<p>The package R2wd (available through <a href="http://cran.r-project.org/web/packages/R2wd/index.html">CRAN</a>) relies on <a href="http://cran.r-project.org/web/packages/rcom/index.html">rcom</a>.  It is a wrapper that uses the <a href="http://rcom.univie.ac.at/download.html">statconnDCOM </a>server to communicate with MS-Word via the COM interface.</p>
<p>R2wd can perform the basic tasks you would expect to need when creating a report from R. It allows you to:</p>
<ul>
<li>Create a new Word file</li>
<li>Create headers and sub-headers</li>
<li>Move to a new page in the document</li>
<li>Write text</li>
<li>Insert tables (that is, &#8220;data.frame&#8221; and &#8220;matrix&#8221; objects)</li>
<li>Insert plots</li>
<li>Save and close the Word document</li>
<li>&#8230;(and more)</li>
</ul>
<p>The current R2wd is still in its beta stage.  Some features are not yet available, such as:</p>
<ul>
<li>Choosing the text font (which means most of us will need to manually change the font in the document to a fixed-width one such as &#8220;Courier New&#8221; for the formatting to look good)</li>
<li>Inserting complex object outputs (such as summary.lm, although in the example below I show how that can be achieved using a simple function)</li>
<li>Speed &#8211; inserting a table is somewhat slow, and I am not sure how it would scale to large documents</li>
</ul>
<p>But from a (pleasant) correspondence with the package developer, I was assured the next release will supply us with more options and features.</p>
<p>The R2wd package developer, Christian Ritter, invites feedback from users.  So if there are features you are missing in this package, I believe he would like to know about it (you can e-mail Christian at:     christian.ritter &lt;-at-&gt; ridaco &lt;-dot-&gt; be  )</p>
<h3>Getting R2wd 1.3</h3>
<p>The current version of R2wd is 1.1, and Christian Ritter (the package developer) says it is a &#8220;first idea&#8221; and that a more elaborate version will soon (around July) be available on CRAN.   In the meantime, Christian was kind enough to send me a more recent version of the package, which (until it gets uploaded to CRAN) you are welcome to download from here:<br />
<strong><a href="http://www.r-statistics.com/wp-content/uploads/2010/05/R2wd_1.3.zip">R2wd 1.3 download link</a></strong></p>
<h3>How to use R2wd to create a report &#8211; a <strong>sample session</strong></h3>
<p><span style="font-size: 13.3333px;">Being young doesn&#8217;t prevent R2wd from doing some nice things.</span></p>
<p>Here is the text from library(help=R2wd):</p>
<blockquote><p>If Word is not already running, wdGet() opens a new Word document, otherwise, it establishes a COM handle to the instance which is already running. The functions wdTitle, wdHeader, wdBody, and wdParagraph can be used to inject text elements into Word. Moreover, bookmarks can be added via wdInsertBookmarks and wdGoToBookmark allows to navigate among the bookmarks which also exist. There is another set of convenience functions, wdSection, wdSubsection, and wdSubsubsection which insert headers of level 1, 2, or 3, start new ’Sections’ in Word, and add bookmarks.<br />
Graphs and dataframes can be inserted into Word, by the wdPlot, wdTable commands. The wdTable command takes a dataframe or an array as arguments, creates a Word table of the appropriate dimensions and injects the content of the dataframe or array into it. It then formats the table in Word using elementary formating elements.<br />
The functions wdApplyTheme and wdApplyTemplate allow to work with themes and templates.</p></blockquote>
<p>Here is an example session to demonstrate some of what is said:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">&nbsp;
<span style="color: #228B22;"># install.packages(&quot;R2wd&quot;)</span>
<span style="color: #228B22;"># library(help=R2wd)</span>
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>R2wd<span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
wdGet<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># If no Word file is open, it will start a new one - you can set whether the file is visible or not</span>
wdNewDoc<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;c:<span style="color: #000099; font-weight: bold;">\\</span>This.doc&quot;</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># this creates a new file with &quot;this.doc&quot; name</span>
&nbsp;
wdApplyTemplate<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;c:<span style="color: #000099; font-weight: bold;">\\</span>This.dot&quot;</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># this applies a template</span>
&nbsp;
&nbsp;
wdTitle<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Examples of R2wd (a package to write Word documents from R)&quot;</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># adds a title to the file</span>
&nbsp;
wdSection<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Example 1 - adding text&quot;</span>, newpage <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># This can also create a header</span>
&nbsp;
wdHeading<span style="color: #080;">&#40;</span>level <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span>, <span style="color: #ff0000;">&quot;Header 2&quot;</span><span style="color: #080;">&#41;</span>
wdBody<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;This is the first example we will show&quot;</span><span style="color: #080;">&#41;</span>
wdBody<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;(Notice how, by using two different lines in wdBody, we got two different paragraphs)&quot;</span><span style="color: #080;">&#41;</span>
wdBody<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;(Notice how I can use this: '<span style="color: #000099; font-weight: bold;">\ </span>n' (without the space), to  <span style="color: #000099; font-weight: bold;">\n</span>  go to the next 
		line)&quot;</span><span style="color: #080;">&#41;</span>
wdBody<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;האם זה עובד בעברית ?&quot;</span><span style="color: #080;">&#41;</span>
wdBody<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;It doesn't work with Hebrew...&quot;</span><span style="color: #080;">&#41;</span>
wdBody<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;O.k, let's move to the next page (and the next example)&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
wdSection<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Example 2 - adding tables&quot;</span>, newpage <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
wdBody<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Table using 'format'&quot;</span><span style="color: #080;">&#41;</span>
wdTable<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">format</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">head</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
wdBody<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Table without using 'format'&quot;</span><span style="color: #080;">&#41;</span>
wdTable<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">head</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">mtcars</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
wdSection<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Example 3 - adding lm summary&quot;</span>, newpage <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;">## Example from  ?lm </span>
ctl <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">4.17</span>,<span style="color: #ff0000;">5.58</span>,<span style="color: #ff0000;">5.18</span>,<span style="color: #ff0000;">6.11</span>,<span style="color: #ff0000;">4.50</span>,<span style="color: #ff0000;">4.61</span>,<span style="color: #ff0000;">5.17</span>,<span style="color: #ff0000;">4.53</span>,<span style="color: #ff0000;">5.33</span>,<span style="color: #ff0000;">5.14</span><span style="color: #080;">&#41;</span>
trt <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">4.81</span>,<span style="color: #ff0000;">4.17</span>,<span style="color: #ff0000;">4.41</span>,<span style="color: #ff0000;">3.59</span>,<span style="color: #ff0000;">5.87</span>,<span style="color: #ff0000;">3.83</span>,<span style="color: #ff0000;">6.03</span>,<span style="color: #ff0000;">4.89</span>,<span style="color: #ff0000;">4.32</span>,<span style="color: #ff0000;">4.69</span><span style="color: #080;">&#41;</span>
group <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">gl</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">10</span>,<span style="color: #ff0000;">20</span>, <span style="color: #0000FF; font-weight: bold;">labels</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Ctl&quot;</span>,<span style="color: #ff0000;">&quot;Trt&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
weight <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>ctl, trt<span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># This wouldn't work!</span>
<span style="color: #228B22;"># temp &lt;- summary(lm(weight ~ group))</span>
<span style="color: #228B22;"># wdBody(temp)</span>
&nbsp;
<span style="color: #228B22;"># Here is a way to write the summary.lm output to Word</span>
wdBody.<span style="">anything</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>output<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;"># This function takes the output of an object and prints it line by line into the word document</span>
	<span style="color: #228B22;"># Notice that in many cases you will need to change the text font to a fixed-width one such as Courier New...</span>
	a <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">capture.<span style="">output</span></span><span style="color: #080;">&#40;</span>output<span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #0000FF; font-weight: bold;">seq_along</span><span style="color: #080;">&#40;</span>a<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		wdBody<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">format</span><span style="color: #080;">&#40;</span>a<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
temp <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">summary</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">lm</span><span style="color: #080;">&#40;</span>weight ~ group<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
wdBody.<span style="">anything</span><span style="color: #080;">&#40;</span>temp<span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
&nbsp;
wdSection<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Example 4 - Inserting some plots&quot;</span>, newpage <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
&nbsp;
wdPlot<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span><span style="color: #080;">&#41;</span>, plotfun <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">plot</span>, height <span style="color: #080;">=</span> <span style="color: #ff0000;">10</span>, width <span style="color: #080;">=</span><span style="color: #ff0000;">20</span>, pointsize <span style="color: #080;">=</span> <span style="color: #ff0000;">20</span><span style="color: #080;">&#41;</span>
wdPlot<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span><span style="color: #080;">&#41;</span>, plotfun <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">plot</span>, height <span style="color: #080;">=</span> <span style="color: #ff0000;">10</span>, width <span style="color: #080;">=</span><span style="color: #ff0000;">20</span>, pointsize <span style="color: #080;">=</span> <span style="color: #ff0000;">20</span><span style="color: #080;">&#41;</span>
wdPlot<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span><span style="color: #080;">&#41;</span>, plotfun <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">plot</span>, height <span style="color: #080;">=</span> <span style="color: #ff0000;">10</span>, width <span style="color: #080;">=</span><span style="color: #ff0000;">20</span>, pointsize <span style="color: #080;">=</span> <span style="color: #ff0000;">50</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># wdPageBreak()</span>
&nbsp;
&nbsp;
wdSave<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;c:<span style="color: #000099; font-weight: bold;">\\</span>This.doc&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># save the current document (the argument is the file name to use)</span>
wdQuit<span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># close the Word file</span></pre></div></div>

<p><strong>Update:</strong><br />
After reading my post, Chris suggested that I also add a note here about <a href="http://cran.r-project.org/web/packages/SWordInstaller/index.html">SWORD</a>, a tool written by Thomas Baier (the creator of the StatconnDCOM server) that lets you include R code in Word documents in a Sweave-like fashion. Here is a link to the project: <a href="http://rcom.univie.ac.at">http://rcom.univie.ac.at</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/05/exporting-r-output-to-ms-word-with-r2wd-an-example-session/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>The new GUI for ggplot2 (using Deducer) &#8211; the designer wants your opinion</title>
		<link>http://www.r-statistics.com/2010/05/the-new-gui-for-ggplot2-using-deducer-the-designer-wants-your-opinion/</link>
		<comments>http://www.r-statistics.com/2010/05/the-new-gui-for-ggplot2-using-deducer-the-designer-wants-your-opinion/#comments</comments>
		<pubDate>Sat, 01 May 2010 14:29:22 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[deducer]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[GUI]]></category>
		<category><![CDATA[interfaces]]></category>
		<category><![CDATA[R GUI]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=331</guid>
		<description><![CDATA[After discovering that R is expected (this summer) to have a GUI for ggplot2 (through Deducer), I later found Ian&#8217;s GSoC proposal for this GUI.  Since the system is in its early stages of development, Ian has invited people to give comments, input and critique on his plans for the project. For your convenience (and with Ian&#8217;s permission), I am reposting his proposal here. You are welcome to send him feedback by e-mailing him (at: ifellows@gmail.com), or by leaving a [...]]]></description>
			<content:encoded><![CDATA[<p>After <a href="http://www.r-statistics.com/2010/04/r-and-the-google-summer-of-code-2010-accepted-students-and-projects/">discovering that R is expected (this summer) to have a GUI for ggplot2</a> (through <a href="http://cran.r-project.org/web/packages/Deducer/index.html">Deducer</a>), I later found <a href="http://neolab.stat.ucla.edu/cranstats/gsoc.pdf">Ian&#8217;s GSoC proposal</a> for this GUI.  Since the system is in its early stages of development, Ian has invited people to give comments, input and critique on his plans for the project.</p>
<p>For your convenience (and with Ian&#8217;s permission), I am reposting his proposal here.  You are welcome to send him feedback by e-mailing him (at: ifellows@gmail.com), or by leaving a comment here (and I will direct him to your comment).</p>
<p><span id="more-331"></span></p>
<p class="gde-text"><a href="http://neolab.stat.ucla.edu/cranstats/gsoc.pdf" target="_blank" class="gde-link">Download (PDF, 2.9MB)</a></p>
<iframe src="http://www.r-statistics.com/wp-content/plugins/google-document-embedder/proxy.php?url=http%3A%2F%2Fneolab.stat.ucla.edu%2Fcranstats%2Fgsoc.pdf&hl=cs&gdet=&embedded=true" width="500" height="700" frameborder="0" style="min-width:305px;" class="gde-frame"></iframe>


]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/05/the-new-gui-for-ggplot2-using-deducer-the-designer-wants-your-opinion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>R is going to have a GUI for ggplot2! (by the end of this year&#8217;s Google Summer of Code)</title>
		<link>http://www.r-statistics.com/2010/04/r-and-the-google-summer-of-code-2010-accepted-students-and-projects/</link>
		<comments>http://www.r-statistics.com/2010/04/r-and-the-google-summer-of-code-2010-accepted-students-and-projects/#comments</comments>
		<pubDate>Mon, 26 Apr 2010 20:46:40 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[deducer]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[google summer of code]]></category>
		<category><![CDATA[gsoc]]></category>
		<category><![CDATA[GUI]]></category>
		<category><![CDATA[news]]></category>
		<category><![CDATA[R GUI]]></category>
		<category><![CDATA[R news]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=320</guid>
		<description><![CDATA[I was delighted to see the following e-mail post from Dirk Eddelbuettel regarding the google-summer-of-code R google group: * * * Earlier today Google finalised student / mentor pairings and allocations for the Google Summer of Code 2010 (GSoC 2010). The R Project is happy to announce that the following students have been accepted: Colin Rundel, &#8220;rgeos &#8211; an R wrapper for GEOS&#8221;, mentored by Roger Bivand of the Norges Handelshoyskole, Norway Ian Fellows, &#8220;A GUI for Graphics using ggplot2 [...]]]></description>
			<content:encoded><![CDATA[<p>I was delighted to see the following<del datetime="2010-04-27T05:29:29+00:00"> e-mail </del><a href="http://dirk.eddelbuettel.com/blog/2010/04/26/#gsoc2010_r_students">post from Dirk Eddelbuettel</a> regarding the google-summer-of-code R google group:<br />
*  *  *</p>
<p>Earlier today Google finalised student / mentor pairings and allocations for<br />
the Google Summer of Code 2010 (GSoC 2010).  The R Project is happy to<br />
announce that the following students have been accepted:</p>
<p>  Colin Rundel, &#8220;rgeos &#8211; an R wrapper for GEOS&#8221;, mentored by Roger Bivand of<br />
     the Norges Handelshoyskole, Norway</p>
<p>  Ian Fellows, &#8220;A GUI for Graphics using ggplot2 and Deducer&#8221;, mentored by<br />
     Hadley Wickham of Rice University, USA</p>
<p>  Chidambaram Annamalai, &#8220;rdx &#8211; Automatic Differentiation in R&#8221;, mentored by<br />
     John Nash of University of Ottawa, Canada</p>
<p>  Yasuhisa Yoshida, &#8220;NoSQL interface for R&#8221;, mentored by Dirk Eddelbuettel,<br />
     Chicago, USA</p>
<p>  Felix Schoenbrodt, &#8220;Social Relations Analyses in R&#8221;, mentored by Stefan<br />
     Schmukle, Universitaet Muenster, Germany</p>
<p>  Details about all proposals are on the R Wiki page for the GSoC 2010 at<br />
  <a href="http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010">http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010</a></p>
<p>The R Project is honoured to have received its highest number of student<br />
allocations yet, and looks forward to an exciting Summer of Code.  Please<br />
join me in welcoming our new students.</p>
<p>At this time, I would also like to thank all the other students who have<br />
applied for working with R in this Summer of Code. With a limited number of<br />
available slots, not all proposals can be accepted &#8212; but I hope that those<br />
not lucky enough to have been granted a slot will continue to work with R and<br />
towards making contributions within the R world.</p>
<p>I would also like to express my thanks to all other mentors who provided for<br />
a record number of proposals.  Without mentors and their project ideas we<br />
would not have a Summer of Code &#8212; so hopefully we will see you again next<br />
year.</p>
<p>  Regards,</p>
<p>  Dirk (acting as R/GSoC 2010 admin)</p>
<p>*  *  *</p>
<p>Of all the projects, the one I am most excited about is:<br />
Ian Fellows, &#8220;A GUI for Graphics using ggplot2 and Deducer&#8221;, mentored by Hadley Wickham of Rice University, USA</p>
<p><a href="http://ifellows.ucsd.edu/pmwiki/pmwiki.php?n=Main.DeducerManual">Deducer</a> (text from the website) attempts to be a free, easy-to-use alternative to proprietary data analysis software such as SPSS, JMP, and Minitab. It has a menu system for common data manipulation and analysis tasks, and an Excel-like spreadsheet in which to view and edit data frames. The goal of the project is two-fold:</p>
<ul>
<li>Provide an intuitive interface so that non-technical users can learn and perform analyses without programming getting in their way.</li>
<li>Increase the efficiency of expert R users when performing common tasks by replacing hundreds of keystrokes with a few mouse clicks. Also, as much as possible the GUI should not get in their way if they just want to do some programming.
</li>
</ul>
<p>Deducer is designed to be used with the Java-based R console JGR, though it supports a number of other R environments (e.g. Windows RGUI and RTerm).</p>
<p>This combination (of Deducer and ggplot2) might finally provide the bridge to the layman statistician, the lack of which some people <a href="http://www.thejuliagroup.com/blog/?p=433">recently described</a> as one of R&#8217;s weak spots. (While <a href="http://www.r-statistics.com/2010/04/an-article-attacking-r-gets-responses-from-the-r-blogosphere-some-reflections/">other bloggers wrote back</a> that this is OK, no one disputed that R doesn&#8217;t compete with the point-and-click interfaces of software like SPSS or JMP.)<br />
I came across Ian in the discussion forums, where he kindly provided help with his package &#8220;Deducer&#8221;.  With Hadley as his mentor, I am very optimistic that this project will reach very high standards.<br />
A very exciting development indeed!</p>
<p><strong>Update</strong>: Ian&#8217;s proposal is available to view <a href="http://neolab.stat.ucla.edu/cranstats/gsoc.pdf">here</a>.</p>
<p>P.S.: for some intuition about what a GUI for ggplot2 might look like, have a look at <a href="http://www.r-statistics.com/2010/04/jeroen-oomss-ggplot2-web-interface-a-new-version-released-v0-2/">this video of Jeroen Ooms’s ggplot2 web interface</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/04/r-and-the-google-summer-of-code-2010-accepted-students-and-projects/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
