<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>R-statistics blog &#187; R code</title>
	<atom:link href="http://www.r-statistics.com/tag/r-code/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.r-statistics.com</link>
	<description>Writing about statistics with R, and open source stuff (software, data, community)</description>
	<lastBuildDate>Thu, 29 Jul 2010 01:51:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Clustergram: visualization and diagnostics for cluster analysis (R code)</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/</link>
		<comments>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/#comments</comments>
		<pubDate>Tue, 15 Jun 2010 08:22:34 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[base graphics]]></category>
		<category><![CDATA[cluster analysis]]></category>
		<category><![CDATA[clustergram]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[Dendrogram]]></category>
		<category><![CDATA[diagnose]]></category>
		<category><![CDATA[diagnosing]]></category>
		<category><![CDATA[diagnostics]]></category>
		<category><![CDATA[functions]]></category>
		<category><![CDATA[ggplot]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[hierarchical clustering]]></category>
		<category><![CDATA[iris]]></category>
		<category><![CDATA[iris data set]]></category>
		<category><![CDATA[large data]]></category>
		<category><![CDATA[matlines]]></category>
		<category><![CDATA[non-hierarchical]]></category>
		<category><![CDATA[parallel coordinates]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[R functions]]></category>
		<category><![CDATA[tree]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=391</guid>
		<description><![CDATA[About Clustergrams In 2002, Matthias Schonlau published in &#8220;The Stata Journal&#8221; an article named &#8220;The Clustergram: A graph for visualizing hierarchical and . As explained in the abstract: In hierarchical cluster analysis dendrogram graphs are used to visualize how clusters are formed. I propose an alternative graph named “clustergram” to examine how cluster members are assigned to clusters as the number of clusters increases. This graph is useful in exploratory analysis for non-hierarchical clustering algorithms like k-means and for hierarchical [...]]]></description>
			<content:encoded><![CDATA[<h3>About Clustergrams</h3>
<p>In 2002, <a href="http://www.schonlau.net/clustergram.html">Matthias Schonlau</a> published in &#8220;The Stata Journal&#8221; an article named &#8220;<a href="https://docs.google.com/viewer?url=http://www.schonlau.net/publication/02stata_clustergram.pdf">The Clustergram: A graph for visualizing hierarchical and nonhierarchical cluster analyses</a>&#8221;.  As explained in the abstract:</p>
<blockquote><p>In hierarchical cluster analysis dendrogram graphs are used to visualize how clusters are formed. I propose an alternative graph named “clustergram” to examine how cluster members are assigned to clusters as the number of clusters increases.<br />
This graph is useful in exploratory analysis for non-hierarchical clustering algorithms like k-means and for hierarchical cluster algorithms when the number of observations is large enough to make dendrograms impractical.</p></blockquote>
<p>A <a href="https://docs.google.com/viewer?url=http://www.schonlau.net/publication/04compstat_clustergram.pdf">similar article</a> was later written and was (maybe) published in &#8220;Computational Statistics&#8221;.</p>
<p>Both articles give some nice background on well-known methods such as k-means and hierarchical clustering, and then go on to present examples of using these methods (with the clustergram) to analyse some datasets.</p>
<p>Personally, I understand the clustergram to be a type of parallel coordinates plot where each observation is given a vector.  The vector contains the observation&#8217;s location according to how many clusters the dataset was split into.  The scale of the vector is the scale of the first principal component of the data. </p>
<h3>Clustergram in R (a basic function)</h3>
<p>After finding out about this method of visualization, I was haunted by curiosity to play with it a bit.  Therefore, and since I didn&#8217;t find any implementation of the graph in R, I went about writing the code to implement it.</p>
<p>The code only works for kmeans, but it shows how such a plot can be produced, and it could later be modified to offer methods that connect with different clustering algorithms.</p>
<p>The function I present here takes a numeric matrix with a row for each observation and a column for each variable (a data.frame should first be converted, for example with scale() or as.matrix()).<br />
The function assumes the data is scaled.<br />
The function then calculates the cluster centers for our data, for a varying number of clusters.<br />
For each number of clusters, the cluster centers are multiplied by the first loading of the principal components of the original data.  This gives a weighted mean of each cluster center&#8217;s dimensions that might give a decent representation of that cluster (this method has the known limitations of using the first component of a PCA for dimensionality reduction, but I won&#8217;t go into that in this post).<br />
Finally, all of our data points are ordered according to the first-component value of their respective cluster, and plotted against the number of clusters (thus creating the clustergram).</p>
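<p>To make the projection step above concrete, here is a minimal sketch of it on its own, assuming the scaled iris measurements as input and k = 3.  The names <code>loading1</code>, <code>center.scores</code> and <code>obs.scores</code> are mine, for illustration only, and the seed is arbitrary:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># Illustrative sketch only; names and the choice of k = 3 are assumptions
Data &lt;- scale(iris[, -5])                    # scaled numeric matrix, one row per observation
set.seed(1)                                  # arbitrary seed, kmeans starts at random centers
cl &lt;- kmeans(Data, centers = 3)              # cluster centers for k = 3

loading1 &lt;- princomp(Data)$loadings[, 1]     # first PCA loading: one weight per column
center.scores &lt;- cl$centers %*% loading1     # one number per cluster center
obs.scores &lt;- center.scores[cl$cluster]      # each observation gets the value of its cluster

head(data.frame(cluster = cl$cluster, y = obs.scores))</pre></div></div>

<p>Repeating this for every k, and plotting <code>obs.scores</code> against k with one line per observation, is essentially what the clustergram does.</p>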
<p>My thanks go to <a href="http://had.co.nz/">Hadley Wickham</a> for offering some good tips on how to prepare the graph.</p>
<p>Here is the code (an example follows):</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">&nbsp;
&nbsp;
clustergram.<span style="">kmeans</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>Data, k, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;"># this is the type of function that the clustergram</span>
	<span style="color: #228B22;"># 	function takes for the clustering.</span>
	<span style="color: #228B22;"># 	using similar structure will allow implementation of different clustering algorithms</span>
&nbsp;
	<span style="color: #228B22;">#	It returns a list with two elements:</span>
	<span style="color: #228B22;">#	cluster = a vector of length of n (the number of subjects/items)</span>
	<span style="color: #228B22;">#				indicating to which cluster each item belongs.</span>
	<span style="color: #228B22;">#	centers = a k dimensional vector.  Each element is 1 number that represent that cluster</span>
	<span style="color: #228B22;">#				In our case, we are using the weighted mean of the cluster dimensions by </span>
	<span style="color: #228B22;">#				Using the first component (loading) of the PCA of the Data.</span>
&nbsp;
	cl <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">kmeans</span><span style="color: #080;">&#40;</span>Data, k,...<span style="color: #080;">&#41;</span>
&nbsp;
	cluster <span style="color: #080;">&lt;-</span> cl$cluster
	centers <span style="color: #080;">&lt;-</span> cl$centers <span style="color: #080;">%*%</span> <span style="color: #0000FF; font-weight: bold;">princomp</span><span style="color: #080;">&#40;</span>Data<span style="color: #080;">&#41;</span>$loadings<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># 1 number per center</span>
												<span style="color: #228B22;"># here we are using the weighted mean for each cluster center</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span>
				cluster <span style="color: #080;">=</span> cluster,
				centers <span style="color: #080;">=</span> centers
			<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>		
&nbsp;
clustergram.<span style="">plot</span>.<span style="">matlines</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>X,Y, k.<span style="">range</span>, 
											x.<span style="">range</span>, y.<span style="">range</span> , COL, 
											add.<span style="">center</span>.<span style="">points</span> , centers.<span style="">points</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;white&quot;</span>, xlim <span style="color: #080;">=</span> x.<span style="">range</span>, ylim <span style="color: #080;">=</span> y.<span style="">range</span>,
			axes <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">FALSE</span>,
			xlab <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;Number of clusters (k)&quot;</span>, ylab <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;PCA weighted Mean of the clusters&quot;</span>, main <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;Clustergram of the PCA-weighted Mean of the clusters k-mean clusters vs number of clusters (k)&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">axis</span><span style="color: #080;">&#40;</span>side <span style="color: #080;">=</span><span style="color: #ff0000;">1</span>, at <span style="color: #080;">=</span> k.<span style="">range</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">axis</span><span style="color: #080;">&#40;</span>side <span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">abline</span><span style="color: #080;">&#40;</span>v <span style="color: #080;">=</span> k.<span style="">range</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;grey&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
		<span style="color: #0000FF; font-weight: bold;">matlines</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">t</span><span style="color: #080;">&#40;</span>X<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">t</span><span style="color: #080;">&#40;</span>Y<span style="color: #080;">&#41;</span>, pch <span style="color: #080;">=</span> <span style="color: #ff0000;">19</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> COL, lty <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span>, lwd <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span><span style="color: #080;">&#41;</span>
&nbsp;
		<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>add.<span style="">center</span>.<span style="">points</span><span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#123;</span>
			<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>plyr<span style="color: #080;">&#41;</span>
&nbsp;
			xx <span style="color: #080;">&lt;-</span> ldply<span style="color: #080;">&#40;</span>centers.<span style="">points</span>, <span style="color: #0000FF; font-weight: bold;">rbind</span><span style="color: #080;">&#41;</span>
			<span style="color: #0000FF; font-weight: bold;">points</span><span style="color: #080;">&#40;</span>xx$y~xx$x, pch <span style="color: #080;">=</span> <span style="color: #ff0000;">19</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;red&quot;</span>, cex <span style="color: #080;">=</span> <span style="color: #ff0000;">1.3</span><span style="color: #080;">&#41;</span>
&nbsp;
			<span style="color: #228B22;"># add points	</span>
			<span style="color: #228B22;"># temp &lt;- l_ply(centers.points, function(xx) {</span>
									<span style="color: #228B22;"># with(xx,points(y~x, pch = 19, col = &quot;red&quot;, cex = 1.3))</span>
									<span style="color: #228B22;"># points(xx$y~xx$x, pch = 19, col = &quot;red&quot;, cex = 1.3)</span>
									<span style="color: #228B22;"># return(1)</span>
									<span style="color: #228B22;"># })</span>
						<span style="color: #228B22;"># We assign the lapply to a variable (temp) only to suppress the lapply &quot;NULL&quot; output</span>
		<span style="color: #080;">&#125;</span>	
	<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
clustergram <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">10</span> , 
							clustering.<span style="">function</span> <span style="color: #080;">=</span> clustergram.<span style="">kmeans</span>,
							clustergram.<span style="">plot</span> <span style="color: #080;">=</span> clustergram.<span style="">plot</span>.<span style="">matlines</span>, 
							line.<span style="">width</span> <span style="color: #080;">=</span> .004, add.<span style="">center</span>.<span style="">points</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;"># Data - should be a scaled matrix, where each column is a different dimension of the observations</span>
	<span style="color: #228B22;"># k.range - is a vector with the number of clusters to plot the clustergram for</span>
	<span style="color: #228B22;"># clustering.function - the function called for each k; it must return a list with "cluster" and "centers" (see clustergram.kmeans)</span>
	<span style="color: #228B22;">#			It offers a basis to later extend the function to other algorithms, although that may require more work on the code</span>
	<span style="color: #228B22;"># line.width - is the amount to lift each line in the plot so they won't superimpose each other</span>
	<span style="color: #228B22;"># add.center.points - whether to plot points at the cluster means</span>
&nbsp;
	n <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">dim</span><span style="color: #080;">&#40;</span>Data<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
&nbsp;
	PCA.1 <span style="color: #080;">&lt;-</span> Data <span style="color: #080;">%*%</span> <span style="color: #0000FF; font-weight: bold;">princomp</span><span style="color: #080;">&#40;</span>Data<span style="color: #080;">&#41;</span>$loadings<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># first principal component of our data</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>colorspace<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
			COL <span style="color: #080;">&lt;-</span> heat_hcl<span style="color: #080;">&#40;</span>n<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span>PCA.1<span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># line colors</span>
		<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
			COL <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rainbow</span><span style="color: #080;">&#40;</span>n<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span>PCA.1<span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># line colors</span>
			<span style="color: #0000FF; font-weight: bold;">warning</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'Please consider installing the package &quot;colorspace&quot; for prettier colors'</span><span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#125;</span>
&nbsp;
	line.<span style="">width</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span>line.<span style="">width</span>, n<span style="color: #080;">&#41;</span>
&nbsp;
	Y <span style="color: #080;">&lt;-</span> NULL	<span style="color: #228B22;"># Y matrix</span>
	X <span style="color: #080;">&lt;-</span> NULL	<span style="color: #228B22;"># X matrix</span>
&nbsp;
	centers.<span style="">points</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>k <span style="color: #0000FF; font-weight: bold;">in</span> k.<span style="">range</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		k.<span style="">clusters</span> <span style="color: #080;">&lt;-</span> clustering.<span style="">function</span><span style="color: #080;">&#40;</span>Data, k<span style="color: #080;">&#41;</span>
&nbsp;
		clusters.<span style="">vec</span> <span style="color: #080;">&lt;-</span> k.<span style="">clusters</span>$cluster
			<span style="color: #228B22;"># the.centers &lt;- apply(cl$centers,1, mean)</span>
		the.<span style="">centers</span> <span style="color: #080;">&lt;-</span> k.<span style="">clusters</span>$centers 
&nbsp;
		noise <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">unlist</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">tapply</span><span style="color: #080;">&#40;</span>line.<span style="">width</span>, clusters.<span style="">vec</span>, <span style="color: #0000FF; font-weight: bold;">cumsum</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">seq_along</span><span style="color: #080;">&#40;</span>clusters.<span style="">vec</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span>clusters.<span style="">vec</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>	
		<span style="color: #228B22;"># noise &lt;- noise - mean(range(noise))</span>
		y <span style="color: #080;">&lt;-</span> the.<span style="">centers</span><span style="color: #080;">&#91;</span>clusters.<span style="">vec</span><span style="color: #080;">&#93;</span> <span style="color: #080;">+</span> noise
		Y <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span>Y, y<span style="color: #080;">&#41;</span>
		x <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span>k, <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
		X <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span>X, x<span style="color: #080;">&#41;</span>
&nbsp;
		centers.<span style="">points</span><span style="color: #080;">&#91;</span><span style="color: #080;">&#91;</span>k<span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>y <span style="color: #080;">=</span> the.<span style="">centers</span> , x <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span>k , k<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>	
	<span style="color: #228B22;">#	points(the.centers ~ rep(k , k), pch = 19, col = &quot;red&quot;, cex = 1.5)</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
	x.<span style="">range</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">range</span><span style="color: #080;">&#40;</span>k.<span style="">range</span><span style="color: #080;">&#41;</span>
	y.<span style="">range</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">range</span><span style="color: #080;">&#40;</span>PCA.1<span style="color: #080;">&#41;</span>
&nbsp;
	clustergram.<span style="">plot</span><span style="color: #080;">&#40;</span>X,Y, k.<span style="">range</span>, 
											x.<span style="">range</span>, y.<span style="">range</span> , COL, 
											add.<span style="">center</span>.<span style="">points</span> , centers.<span style="">points</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
<span style="color: #080;">&#125;</span></pre></div></div>

<h3>Example on the iris dataset</h3>
<p>The <a href="http://en.wikipedia.org/wiki/Iris_flower_data_set">iris data set</a> is a favorite example of many <a href="http://www.r-bloggers.com/?s=iris">R bloggers</a> when writing about <a href="http://opendatagroup.com/2009/10/21/r-accessors-explained/">R accessors</a>, <a href="http://learnr.wordpress.com/2009/10/06/export-data-frames-to-multi-worksheet-excel-file/">data exporting</a>, <a href="http://yihui.name/en/2009/09/how-to-import-ms-excel-data-into-r/">data importing</a>, and for <a href="http://weitaiyun.blogspot.com/2009/03/unison-graph-and-parallel-coordinate.html">different</a> <a href="http://weitaiyun.blogspot.com/2009/03/scatterplots.html">visualization</a> techniques.<br />
So it seemed only natural to experiment on it here.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">iris</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">250</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>cex.<span style="">lab</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span>, cex.<span style="">main</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.2</span><span style="color: #080;">&#41;</span>
Data <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">scale</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">iris</span><span style="color: #080;">&#91;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice I am scaling the vectors</span>
clustergram<span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">8</span>, line.<span style="">width</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.004</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice how I am using line.width.  Play with it on your problem, according to the scale of Y.</span></pre></div></div>

<p>Here is the output:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-1.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-1.png" alt="" title="clustergram 1" width="500"></a></p>
<p>Looking at the image we can notice a few interesting things.  One of the clusters formed (the lower one) stays as is no matter how many clusters we allow (except for one observation that goes away and then back).<br />
We can also see that the second split is a solid one (in the sense that it splits the first cluster into two clusters which are not &#8220;close&#8221; to each other, and that about half the observations go to each of the new clusters).<br />
And then notice how moving to 5 clusters makes almost no difference.<br />
Lastly, notice how when going for 8 clusters, we are practically left with 4 clusters (remember &#8211; this is according to the mean of the cluster centers, weighted by the loading of the first component of the PCA on the data).</p>
<p>If I were to take something from this graph, I would say I have a strong tendency to use 3-4 clusters on this data.</p>
<p>But wait, did our clustering algorithm do a stable job?<br />
Let&#8217;s try running the algorithm 6 more times (each run will have a different starting point for the clusters)</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">500</span><span style="color: #080;">&#41;</span>
Data <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">scale</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">iris</span><span style="color: #080;">&#91;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice I am scaling the vectors</span>
<span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>cex.<span style="">lab</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.2</span>, cex.<span style="">main</span> <span style="color: #080;">=</span> .7<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>mfrow <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">3</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">6</span><span style="color: #080;">&#41;</span> clustergram<span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">8</span> , line.<span style="">width</span> <span style="color: #080;">=</span> .004, add.<span style="">center</span>.<span style="">points</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span></pre></div></div>

<p>The result (click the image to enlarge it):<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-6.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-6.png" alt="" title="clustergram 6" width="500"></a><br />
Repeating the analysis offers even more insights.<br />
First, it would appear that up to 3 clusters, the algorithm gives rather stable results.<br />
From 4 onwards we get various outcomes at each iteration.<br />
In some cases, we got 3 clusters when we asked for 4 or even 5 clusters.</p>
<p>Reviewing the new plots, I would prefer to go with the 3 clusters option, noting how the two &#8220;upper&#8221; clusters might have similar properties while the lower cluster is quite distinct from the other two.</p>
<p>By the way, the iris data set is composed of three types of flowers.  I imagine kmeans did a decent job in distinguishing the three.</p>
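<p>One way to check that impression, rather than imagine it, is to cross-tabulate the known species against the cluster labels.  This is only a sketch: the seed is arbitrary, and <code>nstart = 25</code> is my own choice (it asks kmeans to keep the best of 25 random starts, which should reduce the run-to-run variation seen in the plots above):</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">Data &lt;- scale(iris[, -5])                      # same scaling as before
set.seed(250)                                  # arbitrary seed
cl &lt;- kmeans(Data, centers = 3, nstart = 25)   # nstart = 25 is an assumption, not used elsewhere in this post

# rows are the true species, columns are the kmeans cluster labels
species.by.cluster &lt;- table(Species = iris$Species, Cluster = cl$cluster)
print(species.by.cluster)</pre></div></div>

<p>A species that sits almost entirely in one column was separated well; a species spread over two columns was not.  Keep in mind that the cluster numbers themselves are arbitrary labels.</p>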
<h3>Limitation of the method (and a possible way to overcome it?!)</h3>
<p>It is worth noting that the current way the algorithm is built has a fundamental limitation:  The plot is good for detecting a situation where there are several clusters but each of them is clearly &#8220;bigger&#8221; than the one before it (on the first principal component of the data).</p>
<p>For example, let&#8217;s create a dataset with 3 clusters, each taken from a normal distribution with a higher mean than the previous one:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">250</span><span style="color: #080;">&#41;</span>
Data <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rbind</span><span style="color: #080;">&#40;</span>
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">2</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">2</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">2</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
				<span style="color: #080;">&#41;</span>				
clustergram<span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span> , line.<span style="">width</span> <span style="color: #080;">=</span> .004, add.<span style="">center</span>.<span style="">points</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span></pre></div></div>

<p>The resulting plot for this is the following:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/Clustergram-3-ordered-clusters.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/Clustergram-3-ordered-clusters.png" alt="" title="Clustergram-3-ordered-clusters" width="500" class="alignnone size-full wp-image-402" /></a><br />
The image shows a clear distinction between three ranks of clusters.  There is no doubt (for me) from looking at this image, that three clusters would be the correct number of clusters.</p>
<p>But what if the clusters were different but didn&#8217;t have an ordering to them?<br />
For example, look at the following 4 dimensional data:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">250</span><span style="color: #080;">&#41;</span>
Data <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rbind</span><span style="color: #080;">&#40;</span>
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rnorm</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">100</span>,<span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">sd</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">0.3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
				<span style="color: #080;">&#41;</span>				
clustergram<span style="color: #080;">&#40;</span>Data, k.<span style="">range</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">8</span> , line.<span style="">width</span> <span style="color: #080;">=</span> .004, add.<span style="">center</span>.<span style="">points</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span></pre></div></div>

<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/06/Clustergram-4-UNordered-clusters.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/Clustergram-4-UNordered-clusters.png" alt="" title="Clustergram-4-UNordered-clusters" width="500" class="alignnone size-full wp-image-403" /></a></p>
<p>In this situation, it is not clear from the location of the clusters on the Y axis that we are dealing with 4 clusters.<br />
What is interesting, though, is that as the number of clusters grows, we can notice 4 &#8220;strands&#8221; of data points moving more or less together (until we reach 4 clusters, at which point the clusters start breaking up).<br />
Another hope for handling this might be using the color of the lines in some way, but I haven&#8217;t yet figured out how.</p>
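<p>To make the limitation concrete, here is a minimal sketch (not part of the clustergram code) that compares where the groups of the two simulated datasets fall on the first principal component. It assumes the true group labels are known and uses them in place of a clustering result; <code>make_cluster</code> and <code>pc1_group_means</code> are hypothetical helper names introduced only for this illustration.</p>

```r
# Hypothetical helper: n draws per column, one column per requested mean.
make_cluster <- function(means, n = 100) {
  sapply(means, function(m) rnorm(n, mean = m, sd = 0.3))
}

set.seed(250)
# Three clusters whose means increase together (the "ordered" case).
ordered <- rbind(make_cluster(c(0, 0, 0)),
                 make_cluster(c(1, 1, 1)),
                 make_cluster(c(2, 2, 2)))
# Four clusters that differ in which coordinates are shifted (no ordering).
unordered <- rbind(make_cluster(c(1, 0, 0, 0)),
                   make_cluster(c(0, 1, 0, 0)),
                   make_cluster(c(0, 1, 1, 0)),
                   make_cluster(c(0, 0, 0, 1)))

# Hypothetical helper: mean first-principal-component score of each group.
# Assumption: the true group labels stand in for a clustering result.
pc1_group_means <- function(x, groups) {
  scores <- prcomp(x)$x[, 1]
  tapply(scores, groups, mean)
}

ordered_means   <- pc1_group_means(ordered,   rep(1:3, each = 100))
unordered_means <- pc1_group_means(unordered, rep(1:4, each = 100))

# Compare the spacing of the group means along the first component.
print(round(sort(ordered_means), 2))
print(round(sort(unordered_means), 2))
```

<p>In the ordered case the group means should be spread far apart along the first component; in the unordered case there is no reason for them to be evenly spaced, which is one way to read the less clear picture on the Y axis.</p>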
<h3>Clustergram with ggplot2</h3>
<p><a href="http://had.co.nz/">Hadley Wickham</a> has kindly played with recreating the clustergram using the ggplot2 engine.  You can see the result here:<br />
<a href="http://gist.github.com/439761">http://gist.github.com/439761</a><br />
And this is what he wrote about it in the comments:</p>
<blockquote><p>I’ve broken it down into three components:<br />
* run the clustering algorithm and get predictions (many_kmeans and all_hclust)<br />
* produce the data for the clustergram (clustergram)<br />
* plot it (plot.clustergram)<br />
I don’t think I have the logic behind the y-position adjustment quite right though.</p></blockquote>
<p>Here is an example of how it looks:<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-ggplot2-1.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/06/clustergram-ggplot2-1.png" alt="" title="clustergram-ggplot2-1" width="500" class="alignnone size-full wp-image-407" /></a></p>
<h3>Conclusions (some rules of thumb and questions for the future)</h3>
<p>At first glance, it would appear that the clustergram can be of use.  I can imagine using this graph to quickly run various clustering algorithms and then compare them to each other and review their stability (in the way I just demonstrated in the example above).</p>
<p>The three rules of thumb I have noticed by now are:</p>
<ol>
<li>Look at the location of the cluster points on the Y axis. See when they remain stable, when they start flying around, and what happens to them with a higher number of clusters (do they regroup?).</li>
<li>Observe the strands of the data points.  Even if the cluster centers are not ordered, the lines for each item might (this needs more research and thinking) tend to move together &#8211; hinting at the real number of clusters.</li>
<li>Run the plot multiple times to observe the stability of the cluster formation (and location)</li>
</ol>
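<p>The third rule of thumb can also be checked numerically. The following is a rough sketch, not the clustergram function itself: it reruns <code>kmeans</code> several times from random starts and records where the cluster centers land on the first principal component. The function name <code>center_positions</code> and the choice of 5 runs are assumptions made for this illustration.</p>

```r
set.seed(250)
# Three clusters with increasing means, similar to the first example above.
Data <- rbind(matrix(rnorm(300, 0, sd = 0.3), ncol = 3),
              matrix(rnorm(300, 1, sd = 0.3), ncol = 3),
              matrix(rnorm(300, 2, sd = 0.3), ncol = 3))

# Loadings of the first principal component.
pc1 <- prcomp(Data)$rotation[, 1]

# Hypothetical helper: run kmeans `runs` times with k clusters and return
# one row per run holding the sorted positions of the centers on PC1.
center_positions <- function(k, runs = 5) {
  t(sapply(seq_len(runs), function(i) {
    fit <- kmeans(Data, centers = k)
    centered <- sweep(fit$centers, 2, colMeans(Data))
    sort(as.vector(centered %*% pc1))
  }))
}

positions_k3 <- center_positions(3)
positions_k5 <- center_positions(5)

# Rows that look alike suggest a stable solution for that k.
print(round(positions_k3, 2))
print(round(positions_k5, 2))
```

<p>If the rows for a given number of clusters differ a lot from run to run, that is a hint that the solution for that number of clusters is not stable.</p>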
<p>Yet there is more work to be done and questions to seek answers to:</p>
<ul>
<li>The code needs to be extended to offer methods for various clustering algorithms.
</li>
<li>How can the colors of the lines be used better?
</li>
<li>How can this be done using other graphical engines (ggplot2/lattice?) &#8211; (<strong>Update</strong>: look at Hadley&#8217;s reply in the comments)
</li>
<li>What to do in case the first principal component doesn&#8217;t capture enough of the data? (Maybe plot this graph for all the relevant components, but then &#8211; how do you draw conclusions from it?)
</li>
<li>What other uses/conclusions can be made based on this graph?
</li>
</ul>
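<p>For the question about the first principal component, one simple check is to look at the share of the total variance each component explains. This is a minimal sketch on simulated 4-dimensional data like the unordered example above; the variable names are assumptions made for this illustration.</p>

```r
set.seed(250)
# Four clusters, each shifted in different coordinates (no natural ordering).
shifts <- list(c(1, 0, 0, 0), c(0, 1, 0, 0), c(0, 1, 1, 0), c(0, 0, 0, 1))
Data <- do.call(rbind, lapply(shifts, function(means) {
  sapply(means, function(m) rnorm(100, mean = m, sd = 0.3))
}))

pca <- prcomp(Data)
# Proportion of the total variance explained by each principal component.
explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(explained, 2))
```

<p>If the first value is not much larger than the others, a plot based on the first component alone is probably hiding a good part of the structure.</p>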
<p>I am looking forward to reading your input/ideas in the comments (or in reply posts).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>How to upgrade R on windows &#8211; another strategy (and the R code to do it)</title>
		<link>http://www.r-statistics.com/2010/04/changing-your-r-upgrading-strategy-and-the-r-code-to-do-it-on-windows/</link>
		<comments>http://www.r-statistics.com/2010/04/changing-your-r-upgrading-strategy-and-the-r-code-to-do-it-on-windows/#comments</comments>
		<pubDate>Fri, 23 Apr 2010 22:45:14 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[R 2.11.0]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[R windows]]></category>
		<category><![CDATA[upgrade]]></category>
		<category><![CDATA[upgrading R]]></category>
		<category><![CDATA[windows]]></category>
		<category><![CDATA[windows 7]]></category>
		<category><![CDATA[windows vista]]></category>
		<category><![CDATA[windows xp]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=300</guid>
		<description><![CDATA[Update: At the end of the post I added simple step-by-step instructions on how to move to the new system. I STRONGLY suggest using the code only after you read the entire post. Background If you haven&#8217;t heard by now &#8211; R 2.11.0 is out with a bunch of new features. After Andrew Gelman recently lamented the lack of an easy upgrade process for R, a Stackoverflow thread (by JD Long) invited R users to share their [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Update</strong>: At the end of the post I added simple step-by-step instructions on how to move to the new system.  I STRONGLY suggest using the code only after you read the entire post.</p>
<h3>Background</h3>
<p>If you haven&#8217;t <a href="http://www.statsravingmad.com/blog/infos/r-2-11-0-just-landed/">heard</a> <a href="http://blog.revolution-computing.com/2010/04/r-2110-released.html">it</a> <a href="http://onertipaday.blogspot.com/2010/04/r-2110-is-released.html">by </a>now &#8211; <a href="https://mailman.stat.ethz.ch/pipermail/r-announce/2010/000519.html">R 2.11.0 is out</a> with a bunch of new features.</p>
<p>After <a href="http://www.stat.columbia.edu/~cook/movabletype/archives/2009/08/upgrading_r.html">Andrew Gelman</a> recently lamented the lack of an easy upgrade process for R, a <a href="http://stackoverflow.com/questions/1401904/painless-way-to-install-a-new-version-of-r">Stackoverflow thread</a> (by JD Long) invited R users to share their strategies for easily upgrading R.</p>
<h3>Strategy</h3>
<p>In that thread, <a href="http://dirk.eddelbuettel.com/blog/">Dirk Eddelbuettel</a> suggested another idea for upgrading R.  His idea is to use a folder for R&#8217;s packages that is <strong>outside </strong>the standard directory tree of the installation (a different strategy than the one <a href="http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What_0027s-the-best-way-to-upgrade_003f">offered on the R FAQ</a>).</p>
<p>The idea of this upgrading strategy is to save us steps in upgrading.  So when you wish to upgrade R, instead of doing the following three steps:<br />
1) download new R and install<br />
2) copy the &#8220;library&#8221; content from the old R to the new R<br />
3) upgrade all of the packages (in the library folder) to the new version of R.<br />
You could instead just have steps 1 and 3, and skip step 2.</p>
<p>For example, under Windows, you might have R installed in:<br />
<code>C:\Program Files\R\R-2.11.0\</code><br />
But (in this alternative model for upgrading) you will keep your package library in a &#8220;global library folder&#8221; (global in the sense of being independent of a specific R version):<br />
<code>C:\Program Files\R\library</code></p>
<p>So in order to use this strategy, you will need to take the following steps:</p>
<ol>
<li><span style="font-size: 13.3333px;">In the OLD R installation (only the first time you move to the new system of managing the upgrade):</span>
<ol>
<li><span style="font-size: 13.3333px;">Create a new global library folder (if it doesn&#8217;t exist)</span></li>
<li><span style="font-size: 13.3333px;">Copy to the new &#8220;global library folder&#8221; all of your packages from the old R installation</span></li>
<li><span style="font-size: 13.3333px;">After you move to this system &#8211; steps 1 and 2 will <span style="text-decoration: underline;"><strong>not</strong></span> need to be repeated (hence the advantage).</span></li>
</ol>
</li>
<li><span style="font-size: 13.3333px;">In the NEW R installation:</span>
<ol>
<li><span style="font-size: 13.3333px;">Create a new global library folder (if it doesn&#8217;t exist &#8211; in case this is your first R installation)</span></li>
<li><span style="font-size: 13.3333px;">Permanently point to the Global library folder whenever R starts</span></li>
<li><span style="font-size: 13.3333px;">Delete from the &#8220;Global library folder&#8221; all the packages that already exist in the local library folder of the new R install (no need to have duplicates)</span></li>
<li><span style="font-size: 13.3333px;">Update all packages.</span> (Make sure you pick a mirror where the packages are up to date; you sometimes need to choose another mirror.)</li>
</ol>
</li>
</ol>
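<p>Before going to the full code, here is a minimal sketch of the core idea of pointing R at a library folder that lives outside the installation. It is not the code from this post: to stay harmless it uses a temporary directory (the name <code>global_lib</code> is an assumption for this illustration) and changes the library paths for the current session only. Making the change permanent would require putting a similar <code>.libPaths()</code> call in a startup file such as <code>Rprofile.site</code>.</p>

```r
# Assumption: a throwaway folder stands in for the real global library folder.
global_lib <- file.path(tempdir(), "library")

# Create the folder if it does not exist yet.
if (!file.exists(global_lib)) {
  dir.create(global_lib, recursive = TRUE)
}

# Put the global folder first in the search path for this session only.
old_paths <- .libPaths()
.libPaths(c(global_lib, old_paths))

# The global folder should now be listed before the installation's own library.
paths_with_global <- .libPaths()
print(paths_with_global)
```

<p>Packages installed after this call go to the first folder in the list by default, which is what lets a new R version reuse them.</p>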
<p>Thanks to <a href="http://stackoverflow.com/questions/2698269/how-do-you-change-library-location-in-r-under-windows-xp">help from Dirk</a>, David Winsemius and Uwe Ligges, I was able to write the following R code to perform all the tasks I described <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>So first you will need to run the following code:<br />
<span id="more-300"></span></p>
<h3>Code for upgrading R</h3>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">Old.<span style="">R</span>.<span style="">RunMe</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span> <span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;C:/Program Files/R/library&quot;</span>, quit.<span style="">R</span> <span style="color: #080;">=</span> NULL<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
<span style="color: #228B22;"># It will:</span>
<span style="color: #228B22;"># 1. Create a new global library folder (if it doesn't exist)</span>
<span style="color: #228B22;"># 2. Copy to the new &quot;global library folder&quot; all of your packages from the old R installation</span>
&nbsp;
&nbsp;
	<span style="color: #228B22;"># checking that the global lib folder exists - and if not -&gt; create it.</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">file.<span style="">exists</span></span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>	<span style="color: #228B22;"># If global lib folder doesn't exist - create it.</span>
		<span style="color: #0000FF; font-weight: bold;">dir.<span style="">create</span></span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The path:&quot;</span> , global.<span style="">library</span>.<span style="">folder</span>, <span style="color: #ff0000;">&quot;Didn't exist - and was now created.&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The path:&quot;</span> , global.<span style="">library</span>.<span style="">folder</span>, <span style="color: #ff0000;">&quot;already exists. (no need to create it)&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;-----------------------&quot;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;I am now copying packages from old library folder to:&quot;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;-----------------------&quot;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">flush.<span style="">console</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>  <span style="color: #228B22;"># refresh the console so that the user will see the message</span>
&nbsp;
	<span style="color: #228B22;"># Copy packages from current lib folder to the global lib folder</span>
	list.<span style="">of</span>.<span style="">dirs</span>.<span style="">in</span>.<span style="">lib</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">R.<span style="">home</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\\</span>library<span style="color: #000099; font-weight: bold;">\\</span>&quot;</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span>,
							<span style="color: #0000FF; font-weight: bold;">list.<span style="">files</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">R.<span style="">home</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\\</span>library<span style="color: #000099; font-weight: bold;">\\</span>&quot;</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
							sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span>
	folders.<span style="">copied</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">file.<span style="">copy</span></span><span style="color: #080;">&#40;</span>from <span style="color: #080;">=</span> list.<span style="">of</span>.<span style="">dirs</span>.<span style="">in</span>.<span style="">lib</span>, 	<span style="color: #228B22;"># copy folders</span>
								to <span style="color: #080;">=</span> global.<span style="">library</span>.<span style="">folder</span>,
								overwrite <span style="color: #080;">=</span> TRUE,
								recursive <span style="color: #080;">=</span>TRUE<span style="color: #080;">&#41;</span>		
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Success.&quot;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;We finished copying all of your packages (&quot;</span> , <span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span>folders.<span style="">copied</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">&quot;packages ) to the new library folder at:&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;-----------------------&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
	<span style="color: #228B22;"># Quit R?</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">null</span></span><span style="color: #080;">&#40;</span>quit.<span style="">R</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Can I close R?  y(es)/n(o)  (WARNING: your environment will *NOT* be saved)&quot;</span><span style="color: #080;">&#41;</span>
		answer <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">readLines</span><span style="color: #080;">&#40;</span>n<span style="color: #080;">=</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
		answer <span style="color: #080;">&lt;-</span> quit.<span style="">R</span>
	<span style="color: #080;">&#125;</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">tolower</span><span style="color: #080;">&#40;</span>answer<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span> <span style="color: #080;">==</span> <span style="color: #ff0000;">&quot;y&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #0000FF; font-weight: bold;">quit</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">save</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;no&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
&nbsp;
New.<span style="">R</span>.<span style="">RunMe</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span> <span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;C:/Program Files/R/library&quot;</span>, 
							quit.<span style="">R</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">F</span>,
							del.<span style="">packages</span>.<span style="">that</span>.<span style="">exist</span>.<span style="">in</span>.<span style="">home</span>.<span style="">lib</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span>,
							update.<span style="">all</span>.<span style="">packages</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
<span style="color: #228B22;"># It will:</span>
<span style="color: #228B22;"># 1. Create a new global library folder (if it doesn't exist)</span>
<span style="color: #228B22;"># 2. Permanently point to the Global library folder</span>
<span style="color: #228B22;"># 3. Make sure that in the current session - R points to the &quot;Global library folder&quot;</span>
<span style="color: #228B22;"># 4. Delete from the &quot;Global library folder&quot; all the packages that already exist in the local library folder of the new R install</span>
<span style="color: #228B22;"># 5. Update all packages.</span>
&nbsp;
&nbsp;
	<span style="color: #228B22;"># checking that the global lib folder exists - and if not -&gt; create it.</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">file.<span style="">exists</span></span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>	<span style="color: #228B22;"># If global lib folder doesn't exist - create it.</span>
		<span style="color: #0000FF; font-weight: bold;">dir.<span style="">create</span></span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The path to the Global library (&quot;</span> , global.<span style="">library</span>.<span style="">folder</span>, <span style="color: #ff0000;">&quot;) Didn't exist - and was now created.&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The path to the Global library (&quot;</span> , global.<span style="">library</span>.<span style="">folder</span>, <span style="color: #ff0000;">&quot;) already exists. (NO need to create it)&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
	<span style="color: #0000FF; font-weight: bold;">flush.<span style="">console</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>  <span style="color: #228B22;"># refresh the console so that the user will see the message</span>
&nbsp;
&nbsp;
	<span style="color: #228B22;"># Based on:</span>
	<span style="color: #228B22;"># help(Startup)</span>
	<span style="color: #228B22;"># checking if &quot;Renviron.site&quot; exists - and if not -&gt; create it.</span>
	Renviron.<span style="">site</span>.<span style="">loc</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">R.<span style="">home</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\\</span>etc<span style="color: #000099; font-weight: bold;">\\</span>Renviron.site&quot;</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">file.<span style="">exists</span></span><span style="color: #080;">&#40;</span>Renviron.<span style="">site</span>.<span style="">loc</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>	<span style="color: #228B22;"># If &quot;Renviron.site&quot; doesn't exist (which it shouldn't) - create it and add the global lib line to it.</span>
		<span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;R_LIBS=&quot;</span>,global.<span style="">library</span>.<span style="">folder</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span> ,
				<span style="color: #0000FF; font-weight: bold;">file</span> <span style="color: #080;">=</span> Renviron.<span style="">site</span>.<span style="">loc</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The file:&quot;</span> , Renviron.<span style="">site</span>.<span style="">loc</span>, <span style="color: #ff0000;">&quot;Didn't exist - we created it and added your 'Global library link' (&quot;</span>,global.<span style="">library</span>.<span style="">folder</span>,<span style="color: #ff0000;">&quot;) to it.&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The file:&quot;</span> , Renviron.<span style="">site</span>.<span style="">loc</span>, <span style="color: #ff0000;">&quot;existed!  make sure you add the following line by yourself:&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;R_LIBS=&quot;</span>,global.<span style="">library</span>.<span style="">folder</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;To the file:&quot;</span>,Renviron.<span style="">site</span>.<span style="">loc</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #228B22;"># Setting the global lib for this session also</span>
	.<span style="">libPaths</span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># This makes sure you don't need to restart R so that the new Global lib settings will take effect in this session also</span>
	<span style="color: #228B22;"># This line could have also been added to:</span>
	<span style="color: #228B22;"># /etc/Rprofile.site</span>
	<span style="color: #228B22;"># and it would do the same thing as adding &quot;Renviron.site&quot; did</span>
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Your library paths are: &quot;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>.<span style="">libPaths</span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>	
	<span style="color: #0000FF; font-weight: bold;">flush.<span style="">console</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>  <span style="color: #228B22;"># refresh the console so that the user will see the message</span>
&nbsp;
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>del.<span style="">packages</span>.<span style="">that</span>.<span style="">exist</span>.<span style="">in</span>.<span style="">home</span>.<span style="">lib</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;We will now delete packages from your Global library folder that already exist in the local-install library folder&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">flush.<span style="">console</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>  <span style="color: #228B22;"># refresh the console so that the user will see the message</span>
		package.<span style="">to</span>.<span style="">del</span>.<span style="">from</span>.<span style="">global</span>.<span style="">lib</span> <span style="color: #080;">&lt;-</span> 		<span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span>, <span style="color: #ff0000;">&quot;/&quot;</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span>,
													<span style="color: #0000FF; font-weight: bold;">list.<span style="">files</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">R.<span style="">home</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\\</span>library<span style="color: #000099; font-weight: bold;">\\</span>&quot;</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
													sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span>			
		number.<span style="">of</span>.<span style="">packages</span>.<span style="">we</span>.<span style="">will</span>.<span style="">delete</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">list.<span style="">files</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span>, <span style="color: #ff0000;">&quot;/&quot;</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">%</span>in<span style="color: #080;">%</span> <span style="color: #0000FF; font-weight: bold;">list.<span style="">files</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">R.<span style="">home</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\\</span>library<span style="color: #000099; font-weight: bold;">\\</span>&quot;</span>, sep <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
		deleted.<span style="">packages</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">unlink</span><span style="color: #080;">&#40;</span>package.<span style="">to</span>.<span style="">del</span>.<span style="">from</span>.<span style="">global</span>.<span style="">lib</span> , recursive <span style="color: #080;">=</span> TRUE<span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># delete all the packages from the &quot;original&quot; library folder (no need for double folders)</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span>number.<span style="">of</span>.<span style="">packages</span>.<span style="">we</span>.<span style="">will</span>.<span style="">delete</span>,<span style="color: #ff0000;">&quot;Packages were deleted.&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>update.<span style="">all</span>.<span style="">packages</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #228B22;"># Based on:</span>
		<span style="color: #228B22;"># http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What_0027s-the-best-way-to-upgrade_003f</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;We will now update all your packages&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">flush.<span style="">console</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>  <span style="color: #228B22;"># refresh the console so that the user will see the message</span>
&nbsp;
		<span style="color: #0000FF; font-weight: bold;">update.<span style="">packages</span></span><span style="color: #080;">&#40;</span>checkBuilt<span style="color: #080;">=</span>TRUE, ask<span style="color: #080;">=</span>FALSE<span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #228B22;"># Quit R?</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>quit.<span style="">R</span><span style="color: #080;">&#41;</span> <span style="color: #0000FF; font-weight: bold;">quit</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">save</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;no&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span></pre></div></div>

<p>Then you will want to run, on your old R installation, this:</p>
<pre>
Old.R.RunMe()
</pre>
<p>And on your new R installation, this:</p>
<pre>
New.R.RunMe()
</pre>
<h3>Update &#8211; simple two line code to run when upgrading R</h3>
<p>(Please do not try the following code before reading this post and understanding what it does)</p>
<p>In order to move your R upgrade to the new (simpler) system, do the following:<br />
1) Download and install the new version of R<br />
2) Open your old R and run &#8211; </p>
<pre>
source("http://www.r-statistics.com/wp-content/uploads/2010/04/upgrading-R-on-windows.r.txt")
Old.R.RunMe()
</pre>
<p>(wait until it finishes)<br />
3) Open your new R and run</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/04/upgrading-R-on-windows.r.txt&quot;</span><span style="color: #080;">&#41;</span>
New.<span style="">R</span>.<span style="">RunMe</span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span></pre></div></div>

<p>(wait until it finishes) </p>
<p>Once you have done this, whenever you upgrade to a new version of R you will only need to do the following TWO (instead of three) steps:<br />
1) Download and install the new version of R<br />
2) Open your new R and run</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/04/upgrading-R-on-windows.r.txt&quot;</span><span style="color: #080;">&#41;</span>
New.<span style="">R</span>.<span style="">RunMe</span><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span></pre></div></div>

<p>(wait until it finishes) </p>
<p>And that&#8217;s it.</p>
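<p>If you want to check that the new setup took effect, here is a minimal sketch (it is not part of the script above). It only uses base R functions, and I assume here that you already ran New.R.RunMe() and then restarted R:</p>
<pre>
# Library folders this R session searches (the global library should be listed)
print(.libPaths())
# Value of R_LIBS as read at startup (an empty string means it was not set)
print(Sys.getenv("R_LIBS"))
</pre>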
<p><strong>Updates for Windows 7 users</strong>:<br />
For Windows 7 users there are two issues.  The first is permissions.  The default permissions of the user won&#8217;t let you create folders and files under the &#8220;Program Files&#8221; directory. <del datetime="2010-07-12T15:51:58+00:00">To fix this you can either login as the administrator and run the code from there, or change your own user permissions (following the steps described <a href="http://www.blogsdna.com/2159/how-to-take-ownership-grant-permissions-to-access-files-folder-in-windows-7.htm">here</a>).</del><br />
There are several fixes for this; the best one (in my view) is to run R with administrator privileges, by following these steps (<a href="http://superuser.com/questions/162680/having-a-shortcut-run-with-administrator-permissions-win-7">my thanks go to superuser</a>):</p>
<ol>
<li>Right click on the R shortcut</li>
<li>Click on Properties</li>
<li>Select the Compatibility tab</li>
<li>At the bottom click &#8220;Change settings for all users&#8221;</li>
<li>Again at the bottom select &#8220;Run this program as an administrator&#8221;</li>
</ol>
<p>The second issue is that the folder in which R is installed might be different (if you installed the 32 bit version on a 64 bit Windows 7), in which case you will need to run the following commands (notice the use of the &#8220;global.library.folder&#8221; parameter):</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/04/upgrading-R-on-windows.r.txt&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># in the old R</span>
Old.<span style="">R</span>.<span style="">RunMe</span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;C:<span style="color: #000099; font-weight: bold;">\\</span>Program Files (x86)<span style="color: #000099; font-weight: bold;">\\</span>R<span style="color: #000099; font-weight: bold;">\\</span>library&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># in the new R</span>
New.<span style="">R</span>.<span style="">RunMe</span><span style="color: #080;">&#40;</span>global.<span style="">library</span>.<span style="">folder</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;C:<span style="color: #000099; font-weight: bold;">\\</span>Program Files (x86)<span style="color: #000099; font-weight: bold;">\\</span>R<span style="color: #000099; font-weight: bold;">\\</span>library&quot;</span><span style="color: #080;">&#41;</span></pre></div></div>
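
<p>If you are not sure which folder applies to you, the following sketch may help. It assumes (as in the default Windows install) that each R version sits in its own sub-folder of a shared &#8220;R&#8221; folder; the name guessed.global.library is made up for this illustration:</p>
<pre>
# R.home() is the folder of the R installation that is currently running
print(R.home())
# One level up is the shared "R" folder; the global library would sit inside it
guessed.global.library &lt;- file.path(dirname(R.home()), "library")
print(guessed.global.library)
</pre>
<p>If the printed path looks right, you can pass it as the global.library.folder parameter.</p>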

<p><strong>* * * *</strong></p>
<p>If you have any more suggestions on how to make this code better &#8211; please <strong>do share</strong>.<br />
<del datetime="2010-07-01T16:33:50+00:00">(After some measure of review will be given to this code, I would upload it to a file for easy running through &#8220;source(&#8230;)&#8221; )</del></p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/04/changing-your-r-upgrading-strategy-and-the-r-code-to-do-it-on-windows/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
		<item>
		<title>Correlation scatter-plot matrix for ordered-categorical data</title>
		<link>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/</link>
		<comments>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 21:37:26 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[correlation]]></category>
		<category><![CDATA[correlation matrix]]></category>
		<category><![CDATA[correlation scatter plot]]></category>
		<category><![CDATA[non-parametric]]></category>
		<category><![CDATA[non-parametric test]]></category>
		<category><![CDATA[nonparametric]]></category>
		<category><![CDATA[nonparametric test]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[scatter plot]]></category>
		<category><![CDATA[scatter plot matrix]]></category>
		<category><![CDATA[spearman correlation]]></category>
		<category><![CDATA[spearman test]]></category>
		<category><![CDATA[stackoverflow]]></category>
		<category><![CDATA[survey]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=256</guid>
		<description><![CDATA[When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire items (for example: two ordered categorical vectors ranging from 1 to 5). When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation matrix (easily achieved in R by using the &#8220;cor&#8221; command with the &#8220;spearman&#8221; method on a matrix of variables). Yet, a challenge appears once we wish to plot this [...]]]></description>
			<content:encoded><![CDATA[<p>When analyzing a questionnaire, one often wants to view the correlation between two or more <a href="http://en.wikipedia.org/wiki/Likert_scale">Likert questionnaire</a> items (for example: two ordered categorical vectors ranging from 1 to 5).</p>
<p>When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation matrix (easily achieved in R by using the &#8220;cor&#8221; command with the &#8220;spearman&#8221; method on a matrix of variables).<br />
Yet, a challenge appears once we wish to plot this correlation matrix.  The challenge stems from the fact that the classic presentation for a correlation matrix is a <strong>scatter plot matrix</strong> &#8211; but scatter plots don&#8217;t (usually) work well for ordered categorical vectors since the dots on the scatter plot often overlap each other.</p>
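<p>For reference, here is a small sketch of computing such a Spearman correlation matrix. The data is made up, and the names likert, q1, q2, q3 and spearman.matrix are only for illustration:</p>
<pre>
set.seed(1)  # made-up Likert answers, for illustration only
likert &lt;- data.frame(q1 = sample(1:5, 50, replace = TRUE),
                     q2 = sample(1:5, 50, replace = TRUE),
                     q3 = sample(1:5, 50, replace = TRUE))
# cor() accepts a whole matrix/data.frame; cor.test() only tests one pair at a time
spearman.matrix &lt;- cor(likert, method = "spearman")
print(round(spearman.matrix, 2))
</pre>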
<p>There are four solutions for the point-overlap problem that I know of:</p>
<ol>
<li>Jitter the data a bit to give a sense of the &#8220;density&#8221; of the points</li>
<li>Use a color spectrum to represent when a point actually represents &#8220;many points&#8221;</li>
<li>Use different point sizes to represent when there are &#8220;many points&#8221; in the location of that point</li>
<li>Add a LOWESS (or LOESS) line to the scatter plot &#8211; to show the trend of the data</li>
</ol>
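<p>To make the first and third ideas concrete, here is a rough base-graphics sketch on made-up data (this is not the function offered later in the post; the names x, y and counts are only for illustration):</p>
<pre>
set.seed(1)  # made-up ordered categorical data
x &lt;- sample(1:5, 100, replace = TRUE)
y &lt;- sample(1:5, 100, replace = TRUE)

# Solution 1: jitter the points so overlapping observations spread out
plot(jitter(x), jitter(y))

# Solution 3: draw each (x, y) combination once, sized by how often it occurs
counts &lt;- as.data.frame(table(x, y))
counts &lt;- counts[counts$Freq &gt; 0, ]
plot(as.numeric(as.character(counts$x)), as.numeric(as.character(counts$y)),
     cex = sqrt(counts$Freq))
</pre>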
<p>In this post I will offer the code for a solution that uses solutions 3 and 4 (and possibly 2; please read this post&#8217;s comments). Here is the output (click to see a larger image):</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/04/scatter-plot-correlation-matrix.png"><img class="alignnone size-full wp-image-257" title="scatter plot correlation matrix" src="http://www.r-statistics.com/wp-content/uploads/2010/04/scatter-plot-correlation-matrix.png" alt="" width="550"/></a></p>
<p>And here is the code to produce this plot:</p>
<p><span id="more-256"></span></p>
<h3>R code for producing a Correlation scatter-plot matrix &#8211; for ordered-categorical data</h3>
<p><strong>Note</strong> that this code will also work fine for continuous data points (although I would suggest enlarging the &#8220;point.size.rescale&#8221; parameter to something bigger than 1.5 in the &#8220;panel.smooth.ordered.categorical&#8221; function)</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;"># -----------------</span>
<span style="color: #228B22;"># Functions</span>
<span style="color: #228B22;"># -----------------</span>
&nbsp;
panel.<span style="">cor</span>.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x, y, digits<span style="color: #080;">=</span><span style="color: #ff0000;">2</span>, prefix<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;&quot;</span>, cex.<span style="">cor</span><span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#123;</span>
&nbsp;
    usr <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;usr&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">;</span> <span style="color: #0000FF; font-weight: bold;">on.<span style="">exit</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
    <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1</span>, <span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
&nbsp;
    r <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">abs</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">cor</span><span style="color: #080;">&#40;</span>x, y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice we use spearman, non parametric correlation here</span>
    r.<span style="">no</span>.<span style="">abs</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cor</span><span style="color: #080;">&#40;</span>x, y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
    txt <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">format</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>r.<span style="">no</span>.<span style="">abs</span> , <span style="color: #ff0000;">0.123456789</span><span style="color: #080;">&#41;</span>, digits<span style="color: #080;">=</span>digits<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span> 
    txt <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span>prefix, txt, sep<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span> 
    <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">missing</span><span style="color: #080;">&#40;</span>cex.<span style="">cor</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> cex <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">0.8</span><span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">strwidth</span><span style="color: #080;">&#40;</span>txt<span style="color: #080;">&#41;</span> <span style="color: #0000FF; font-weight: bold;">else</span> cex <span style="color: #080;">&lt;-</span> cex.<span style="">cor</span> <span style="color: #228B22;"># use cex.cor when it is supplied, so that cex is always defined</span>
&nbsp;
    test <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cor.<span style="">test</span></span><span style="color: #080;">&#40;</span>x,y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span> 
    <span style="color: #228B22;"># borrowed from printCoefmat</span>
    Signif <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">symnum</span><span style="color: #080;">&#40;</span>test$p.<span style="">value</span>, corr <span style="color: #080;">=</span> FALSE, na <span style="color: #080;">=</span> FALSE, 
                  cutpoints <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">0.001</span>, <span style="color: #ff0000;">0.01</span>, <span style="color: #ff0000;">0.05</span>, <span style="color: #ff0000;">0.1</span>, <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>,
                  <span style="color: #0000FF; font-weight: bold;">symbols</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;***&quot;</span>, <span style="color: #ff0000;">&quot;**&quot;</span>, <span style="color: #ff0000;">&quot;*&quot;</span>, <span style="color: #ff0000;">&quot;.&quot;</span>, <span style="color: #ff0000;">&quot; &quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">text</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0.5</span>, <span style="color: #ff0000;">0.5</span>, txt, cex <span style="color: #080;">=</span> cex <span style="color: #080;">*</span> r<span style="color: #080;">&#41;</span> 
    <span style="color: #0000FF; font-weight: bold;">text</span><span style="color: #080;">&#40;</span>.8, .8, Signif, cex<span style="color: #080;">=</span>cex, <span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
panel.<span style="">smooth</span>.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span> <span style="color: #080;">&#40;</span>x, y, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;col&quot;</span><span style="color: #080;">&#41;</span>, bg <span style="color: #080;">=</span> NA, pch <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;pch&quot;</span><span style="color: #080;">&#41;</span>, 
												cex <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span>, col.<span style="">smooth</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;red&quot;</span>, span <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">/</span><span style="color: #ff0000;">3</span>, iter <span style="color: #080;">=</span> <span style="color: #ff0000;">3</span>, 
												point.<span style="">size</span>.<span style="">rescale</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span>, ...<span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;">#require(colorspace)</span>
    <span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#41;</span>
    z <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">merge</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span>, melt<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">table</span><span style="color: #080;">&#40;</span>x ,y<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">sort</span> <span style="color: #080;">=</span> FALSE<span style="color: #080;">&#41;</span>$value
    <span style="color: #228B22;">#the.col &lt;- heat_hcl(length(x))[z]</span>
    z <span style="color: #080;">&lt;-</span> point.<span style="">size</span>.<span style="">rescale</span><span style="color: #080;">*</span>z<span style="color: #080;">/</span> <span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice how we rescale the dots according to the maximum value z could have taken</span>
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">symbols</span><span style="color: #080;">&#40;</span> x, y,  circles <span style="color: #080;">=</span> z,<span style="color: #228B22;">#rep(0.1, length(x)), #sample(1:2, length(x), replace = T) ,</span>
			inches<span style="color: #080;">=</span>FALSE, bg<span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;grey&quot;</span>,<span style="color: #228B22;">#the.col ,</span>
			fg <span style="color: #080;">=</span> bg, add <span style="color: #080;">=</span> TRUE<span style="color: #080;">&#41;</span>
&nbsp;
    <span style="color: #228B22;"># points(x, y, pch = pch, col = col, bg = bg, cex = cex)</span>
    ok <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">finite</span></span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> <span style="color: #080;">&amp;</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">finite</span></span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">if</span> <span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">any</span><span style="color: #080;">&#40;</span>ok<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
        <span style="color: #0000FF; font-weight: bold;">lines</span><span style="color: #080;">&#40;</span>stats<span style="color: #080;">::</span><span style="color: #0000FF; font-weight: bold;">lowess</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span>ok<span style="color: #080;">&#93;</span>, y<span style="color: #080;">&#91;</span>ok<span style="color: #080;">&#93;</span>, f <span style="color: #080;">=</span> span, iter <span style="color: #080;">=</span> iter<span style="color: #080;">&#41;</span>, 
            <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> col.<span style="">smooth</span>, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
panel.<span style="">hist</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
    usr <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;usr&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">;</span> <span style="color: #0000FF; font-weight: bold;">on.<span style="">exit</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr <span style="color: #080;">=</span> usr<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>usr<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1.5</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
    h <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">hist</span><span style="color: #080;">&#40;</span>x, <span style="color: #0000FF; font-weight: bold;">plot</span> <span style="color: #080;">=</span> FALSE, breaks <span style="color: #080;">=</span> <span style="color: #ff0000;">20</span><span style="color: #080;">&#41;</span>
    breaks <span style="color: #080;">&lt;-</span> h$breaks<span style="color: #080;">;</span> nB <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>breaks<span style="color: #080;">&#41;</span>
    y <span style="color: #080;">&lt;-</span> h$counts<span style="color: #080;">;</span> y <span style="color: #080;">&lt;-</span> y<span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">rect</span><span style="color: #080;">&#40;</span>breaks<span style="color: #080;">&#91;</span><span style="color: #080;">-</span>nB<span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">0</span>, breaks<span style="color: #080;">&#91;</span><span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>, y, <span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;orange&quot;</span>, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
pairs.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>xx,...<span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#123;</span>
			<span style="color: #0000FF; font-weight: bold;">pairs</span><span style="color: #080;">&#40;</span>xx , 
					diag.<span style="">panel</span> <span style="color: #080;">=</span> panel.<span style="">hist</span> ,
					lower.<span style="">panel</span><span style="color: #080;">=</span>panel.<span style="">smooth</span>.<span style="">ordered</span>.<span style="">categorical</span>,
					upper.<span style="">panel</span><span style="color: #080;">=</span>panel.<span style="">cor</span>.<span style="">ordered</span>.<span style="">categorical</span>,
					cex.<span style="">labels</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span>, ...<span style="color: #080;">&#41;</span> 
		<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
<span style="color: #228B22;"># -----------------</span>
<span style="color: #228B22;"># Example</span>
<span style="color: #228B22;"># -----------------</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">666</span><span style="color: #080;">&#41;</span>
a1 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span>, <span style="color: #ff0000;">100</span>, <span style="color: #0000FF; font-weight: bold;">replace</span> <span style="color: #080;">=</span> TRUE<span style="color: #080;">&#41;</span>
a2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span>, <span style="color: #ff0000;">100</span>, <span style="color: #0000FF; font-weight: bold;">replace</span> <span style="color: #080;">=</span> TRUE<span style="color: #080;">&#41;</span>
a3 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">jitter</span><span style="color: #080;">&#40;</span>a2, <span style="color: #ff0000;">7</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
	a3<span style="color: #080;">&#91;</span>a3 <span style="color: #080;">&lt;</span> <span style="color: #ff0000;">1</span> <span style="color: #080;">|</span> a3 <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">3</span>
a4 <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">6</span><span style="color: #080;">-</span><span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">jitter</span><span style="color: #080;">&#40;</span>a1, <span style="color: #ff0000;">7</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
	a4<span style="color: #080;">&#91;</span>a4 <span style="color: #080;">&lt;</span> <span style="color: #ff0000;">1</span> <span style="color: #080;">|</span> a4 <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">3</span>
&nbsp;
aa <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>a1,a2,a3, a4<span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># plotting :)		</span>
pairs.<span style="">ordered</span>.<span style="">categorical</span><span style="color: #080;">&#40;</span>aa<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<h3> Credits: </h3>
<ul>
<li>The original R code for the correlation matrix plot was taken from the <a href="http://addictedtor.free.fr/graphiques/graphcode.php?graph=137">R Graph Gallery</a>. The differences are: 1) the use of Spearman correlation; 2) the addition of a histogram panel; and 3) point sizes that change with the number of overlapping observations.</li>
<li>The idea to use symbols for changing the point sizes was <a href="http://stackoverflow.com/questions/2593643/correlation-scatter-matrix-plot-with-different-point-size-in-r">offered</a> by <a href="http://www.linkedin.com/pub/doug-y-barbo/2/356/416">Doug Y&#8217;barbo</a>.<br />
Thanks also to <a href="http://dirk.eddelbuettel.com/">Dirk Eddelbuettel</a> for suggesting the use of cex (although I ended up not using it).</li>
</ul>
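<p>For readers who want to see the point-size idea in isolation, here is a minimal sketch that does not depend on the reshape package. The toy data and the names n.same and radius are assumptions made only for this illustration; the panel function above computes the counts with merge and melt instead.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># Hypothetical toy data: two ordered categorical variables on a 1-5 scale
set.seed(1)
x &lt;- sample(1:5, 100, replace = TRUE)
y &lt;- sample(1:5, 100, replace = TRUE)

# For every observation, count how many observations share its (x, y) combination
n.same &lt;- ave(rep(1, length(x)), x, y, FUN = sum)

# Circle radius proportional to the share of observations at that location
# (1.5 mirrors the default of point.size.rescale in the panel function)
radius &lt;- 1.5 * n.same / length(x)

plot(x, y, type = &quot;n&quot;)
symbols(x, y, circles = radius, inches = FALSE, bg = &quot;grey&quot;, add = TRUE)

# Spearman correlation, the measure used in the upper panel
cor(x, y, method = &quot;spearman&quot;)</pre></div></div>

<p>Using ave keeps the counts in the same order as x and y, so each circle is sized by the count of its own location.</p>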
<p>If you have ideas on how to improve this code (or on reproducing it with ggplot2 or lattice), please share them in the comments (or on your own blog, but be sure to let me know <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />   )</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Quantile LOESS &#8211; Combining a moving quantile window with LOESS (R function)</title>
		<link>http://www.r-statistics.com/2010/04/quantile-loess-combining-a-moving-quantile-window-with-loess-r-function/</link>
		<comments>http://www.r-statistics.com/2010/04/quantile-loess-combining-a-moving-quantile-window-with-loess-r-function/#comments</comments>
		<pubDate>Thu, 01 Apr 2010 14:27:58 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[behavioral genetics]]></category>
		<category><![CDATA[boundary estimation]]></category>
		<category><![CDATA[center estimation]]></category>
		<category><![CDATA[loess]]></category>
		<category><![CDATA[lowess]]></category>
		<category><![CDATA[moving average]]></category>
		<category><![CDATA[moving quantile]]></category>
		<category><![CDATA[outliers]]></category>
		<category><![CDATA[path data]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[r function]]></category>
		<category><![CDATA[regression quantile]]></category>
		<category><![CDATA[Robustness]]></category>
		<category><![CDATA[running average]]></category>
		<category><![CDATA[running median]]></category>
		<category><![CDATA[running quantile]]></category>
		<category><![CDATA[smoother]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=222</guid>
		<description><![CDATA[In this post I will provide R code that implements the combination of repeated running quantile with the LOESS smoother to create a type of &#8220;quantile LOESS&#8221; (e.g., &#8220;Local Quantile Regression&#8221;). This method is useful when the need arises to fit a robust and resistant smoothed line for a quantile (an example for such a case is provided at the end of this post). If you wish to use the function in your own code, simply [...]]]></description>
			<content:encoded><![CDATA[<p>In this post I will provide R code that implements the combination of repeated running quantile with the LOESS smoother to create a type of &#8220;quantile LOESS&#8221; (e.g., &#8220;Local Quantile Regression&#8221;).</p>
<p>This method is useful when the need arises to fit a robust and resistant <del datetime="2010-04-05T15:44:43+00:00">(Need to be verified)</del> smoothed line for a quantile (an example for such a case is provided at the end of this post).</p>
<p>If you wish to use the function in your own code, simply run the following line in your R console:</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/04/Quantile.loess_.r.txt&quot;</span><span style="color: #080;">&#41;</span></pre></div></div>

<h3>Background</h3>
<p>I came across this idea in an article titled &#8220;<a href="http://www.e-publications.org/ims/submission/index.php/AOAS/user/submissionFile/4295?confirm=37ca4b72">High throughput data analysis in behavioral genetics</a>&#8221; by Anat Sakov, Ilan Golani, Dina Lipkind and my advisor Yoav Benjamini.  From the abstract:</p>
<blockquote><p>In recent years, a growing need has arisen in different fields, for the development of computational systems for automated analysis of large amounts of data (high-throughput). Dealing with non-standard noise structure and outliers, that could have been detected and corrected in manual analysis, must now be built into the system with the aid of robust methods. [...]  we use a non-standard mix of robust and resistant methods: LOWESS and repeated running median.</p></blockquote>
<p>The motivation for this technique came from &#8220;Path data&#8221; (of mice) which is</p>
<blockquote><p>prone to suffer from noise and outliers. During progression a tracking system might lose track of the animal, inserting (occasionally very large) outliers into the data. During lingering, and even more so during arrests, outliers are rare, but the recording noise is large relative to the actual size of the movement. The statistical implications are that the two types of behavior require different degrees of smoothing and resistance. An additional complication is that the two interchange many times throughout a session. As a result, the statistical solution adopted needs not only to smooth the data, but also to recognize, adaptively, when there are arrests. To the best of our knowledge, no single existing smoothing technique has yet been able to fulfill this dual task. We elaborate on the sources of noise, and propose a mix of LOWESS (Cleveland, 1977) and the repeated running median (RRM; Tukey, 1977) to cope with these challenges</p></blockquote>
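<p>To make the two ingredients named in the quote concrete, here is a rough base R sketch that chains a repeated running median with LOWESS. It is only one simple way to combine them and is not the procedure from the article: the simulated path, the window widths (7, 5, 3) and the LOWESS span are all assumptions chosen for illustration.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># Hypothetical noisy path with a few large outliers (not the mice data from the article)
set.seed(1)
time.idx &lt;- 1:200
path &lt;- sin(time.idx / 20) + rnorm(200, sd = 0.1)
path[c(50, 120, 121)] &lt;- c(5, -4, 6)

# Repeated running median: apply running medians one after the other.
# The window widths are an arbitrary choice for this sketch.
rrm &lt;- path
for (k in c(7, 5, 3)) rrm &lt;- as.numeric(runmed(rrm, k = k))

# LOWESS on the median-smoothed series (the span f = 0.2 is also an assumption)
fit &lt;- lowess(time.idx, rrm, f = 0.2)

plot(time.idx, path, col = &quot;grey&quot;)
lines(time.idx, rrm, col = &quot;blue&quot;)
lines(fit, col = &quot;red&quot;)</pre></div></div>

<p>The running median is resistant to the isolated outliers, while LOWESS takes care of the remaining recording noise.</p>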
<p><strong>If all we wanted to do was to compute a <a href="http://en.wikipedia.org/wiki/Moving_average">moving average</a> (running average) of the data in R, we could simply use the rollmean function from the <a href="http://cran.r-project.org/web/packages/zoo/index.html">zoo package</a>.</strong><br />
But since we also wanted to allow quantile smoothing, we turned to the rollapply function.</p>
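<p>As a small illustration of the difference between the two zoo functions, the sketch below runs both over the same toy series. The series and the window width of 5 are assumptions made for the example.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;">library(zoo)

# Hypothetical toy series of 10 observations
z &lt;- zoo(c(1, 3, 2, 8, 4, 6, 5, 9, 7, 10))

# Running average over windows of 5 observations
roll.avg &lt;- rollmean(z, k = 5)

# Running 95% quantile over the same windows;
# the extra argument probs is passed on to quantile()
roll.q &lt;- rollapply(z, width = 5, FUN = quantile, probs = 0.95)

roll.avg
roll.q</pre></div></div>

<p>rollapply accepts any function of the window, which is what lets a quantile stand in for the mean.</p>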
<h3>R function for performing Quantile LOESS</h3>
<p>Here is the R function that implements the LOESS-smoothed repeated running quantile (with a simple option to use a running average instead of a quantile):</p>
<p><span id="more-222"></span></p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;"># This code relies on the rollapply function from the &quot;zoo&quot; package.  My thanks go to Achim Zeileis and Gabor Grothendieck for their work on the package.</span>
Quantile.<span style="">loess</span><span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>Y, X <span style="color: #080;">=</span> NULL, 
							number.<span style="">of</span>.<span style="">splits</span> <span style="color: #080;">=</span> NULL,
							window.<span style="">size</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">20</span>,
							percent.<span style="">of</span>.<span style="">overlap</span>.<span style="">between</span>.<span style="">two</span>.<span style="">windows</span> <span style="color: #080;">=</span> NULL,
							the.<span style="">distance</span>.<span style="">between</span>.<span style="">each</span>.<span style="">window</span> <span style="color: #080;">=</span> NULL,
							the.<span style="">quant</span> <span style="color: #080;">=</span> .95,
							window.<span style="">alignment</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;center&quot;</span><span style="color: #080;">&#41;</span>, 
							window.<span style="">function</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span><span style="color: #0000FF; font-weight: bold;">quantile</span><span style="color: #080;">&#40;</span>x, the.<span style="">quant</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>,
							<span style="color: #228B22;"># If you wish to use this with a running average instead of a running quantile, you could simply use:</span>
							<span style="color: #228B22;"># window.function = mean,</span>
							...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;"># input: Y and X, and smoothing parameters</span>
	<span style="color: #228B22;"># output: new y and x</span>
&nbsp;
	<span style="color: #228B22;"># Extra parameter &quot;...&quot; goes to the loess	</span>
&nbsp;
	<span style="color: #228B22;"># window.size ==  the number of observations in the window (not the window length!)</span>
&nbsp;
	<span style="color: #228B22;"># &quot;number.of.splits&quot; will override &quot;window.size&quot;</span>
	<span style="color: #228B22;"># let's compute the window.size:	</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">null</span></span><span style="color: #080;">&#40;</span>number.<span style="">of</span>.<span style="">splits</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>window.<span style="">size</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">ceiling</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>Y<span style="color: #080;">&#41;</span><span style="color: #080;">/</span>number.<span style="">of</span>.<span style="">splits</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #228B22;"># If the.distance.between.each.window is not specified, let's make the windows fully distinct (no overlap)</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">null</span></span><span style="color: #080;">&#40;</span>the.<span style="">distance</span>.<span style="">between</span>.<span style="">each</span>.<span style="">window</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>the.<span style="">distance</span>.<span style="">between</span>.<span style="">each</span>.<span style="">window</span> <span style="color: #080;">&lt;-</span> window.<span style="">size</span><span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #228B22;"># If percent.of.overlap.between.two.windows (a fraction between 0 and 1) is not null, it will override the.distance.between.each.window </span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">null</span></span><span style="color: #080;">&#40;</span>percent.<span style="">of</span>.<span style="">overlap</span>.<span style="">between</span>.<span style="">two</span>.<span style="">windows</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
		<span style="color: #080;">&#123;</span>
			<span style="color: #228B22;"># the step between windows is rounded to a whole number of observations, and is at least 1</span>
			the.<span style="">distance</span>.<span style="">between</span>.<span style="">each</span>.<span style="">window</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>, <span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span>window.<span style="">size</span> <span style="color: #080;">*</span> <span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">-</span>percent.<span style="">of</span>.<span style="">overlap</span>.<span style="">between</span>.<span style="">two</span>.<span style="">windows</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
	<span style="color: #228B22;"># loading zoo (installing it first if it is missing)</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>zoo<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 	
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;zoo is not installed - installing it now.&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">install.<span style="">packages</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;zoo&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span>zoo<span style="color: #080;">&#41;</span> <span style="color: #228B22;"># load the freshly installed package</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">null</span></span><span style="color: #080;">&#40;</span>X<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>X <span style="color: #080;">&lt;-</span> index<span style="color: #080;">&#40;</span>Y<span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span> <span style="color: #228B22;"># if we don't have any X, then Y must be ordered, in which case, we can use the indexes of Y as X.</span>
&nbsp;
	<span style="color: #228B22;"># creating our new X and Y</span>
	zoo.<span style="">Y</span> <span style="color: #080;">&lt;-</span> zoo<span style="color: #080;">&#40;</span>x <span style="color: #080;">=</span> Y, order.<span style="">by</span> <span style="color: #080;">=</span> X<span style="color: #080;">&#41;</span>
	<span style="color: #228B22;">#zoo.X &lt;- attributes(zoo.Y)$index</span>
&nbsp;
	new.<span style="">Y</span> <span style="color: #080;">&lt;-</span> rollapply<span style="color: #080;">&#40;</span>zoo.<span style="">Y</span>, width <span style="color: #080;">=</span> window.<span style="">size</span>, 
								FUN <span style="color: #080;">=</span> window.<span style="">function</span>,
								<span style="color: #0000FF; font-weight: bold;">by</span> <span style="color: #080;">=</span> the.<span style="">distance</span>.<span style="">between</span>.<span style="">each</span>.<span style="">window</span>,
								align <span style="color: #080;">=</span> window.<span style="">alignment</span><span style="color: #080;">&#41;</span>
	new.<span style="">X</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">attributes</span><span style="color: #080;">&#40;</span>new.<span style="">Y</span><span style="color: #080;">&#41;</span>$index	
	new.<span style="">Y</span>.<span style="">loess</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">loess</span><span style="color: #080;">&#40;</span>new.<span style="">Y</span>~new.<span style="">X</span>, <span style="color: #0000FF; font-weight: bold;">family</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;sym&quot;</span>,...<span style="color: #080;">&#41;</span>$fitted 
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span>y <span style="color: #080;">=</span> new.<span style="">Y</span>, x <span style="color: #080;">=</span> new.<span style="">X</span>, y.<span style="">loess</span> <span style="color: #080;">=</span> new.<span style="">Y</span>.<span style="">loess</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span></pre></td></tr></table></div>

<p>More on the math of the algorithm can be found in the <a href="http://www.e-publications.org/ims/submission/index.php/AOAS/user/submissionFile/4295?confirm=37ca4b72">original article</a>.</p>
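<p>For intuition, the two stages of the function can be sketched in base R without zoo. This is a simplified illustration, not a replacement for Quantile.loess: it handles only non-overlapping windows, it uses the mean X of each window rather than the index of the window center, and the window size of 10 is an assumption chosen so that enough window values are left for loess.</p>

<div class="wp_syntax"><div class="code"><pre class="rsplus" style="font-family:monospace;"># Keep the complete Ozone/Temp pairs from the built-in airquality data
ok &lt;- complete.cases(airquality[, c(&quot;Ozone&quot;, &quot;Temp&quot;)])
X &lt;- airquality$Temp[ok]
Y &lt;- airquality$Ozone[ok]

# Order the observations by X, so that windows are runs of neighbouring X values
ord &lt;- order(X)
X &lt;- X[ord]
Y &lt;- Y[ord]

window.size &lt;- 10  # number of observations per window (assumption for this sketch)
window.id &lt;- ceiling(seq_along(Y) / window.size)  # 1,1,...,2,2,...

# Stage 1: one 95% quantile of Y, and one representative X, per window
win.Y &lt;- as.numeric(tapply(Y, window.id, quantile, probs = 0.95))
win.X &lt;- as.numeric(tapply(X, window.id, mean))

# Stage 2: smooth the window quantiles with a robust loess fit
fit &lt;- loess(win.Y ~ win.X, family = &quot;symmetric&quot;)
smoothed &lt;- fitted(fit)

plot(X, Y)
lines(win.X, smoothed, col = &quot;green&quot;)</pre></div></div>

<p>The function above does the same thing through rollapply, which additionally allows overlapping windows.</p>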
<h3>Example: Predicting &#8220;worst case scenario&#8221; Ozone levels using temperature</h3>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/04/quantile-lowess.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/04/quantile-lowess.png" alt="" title="quantile lowess" width="550" class="alignnone size-full wp-image-227" /></a></p>
<p>The following example uses the &#8220;airquality&#8221; dataset, which gives us &#8220;Daily air quality measurements in New York, May to September 1973.&#8221; Of its several variables, we will only look at Ozone level and Temperature.<br />
Since high Ozone levels reduce the quality of the air we breathe, I would like to predict the &#8220;worst case&#8221; Ozone level (e.g., the 95% quantile of Ozone) from the temperature of the same day.</p>
<p>How would you try to do something like that?</p>
<p>The first solution would be to use the &#8220;rq&#8221; function from the <a href="http://cran.r-project.org/web/packages/quantreg/index.html">Quantile Regression R package</a>, but if we were to look at the data, we would see that fitting a straight line is not suitable (since there is a sharp change in slope around a temperature of 80 degrees).<br />
This is a situation where Quantile LOESS (of 95%) might prove useful. Here is the code to produce the above plot.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">&nbsp;
<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">airquality</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">attach</span><span style="color: #080;">&#40;</span><span style="color: #CC9900; font-weight: bold;">airquality</span><span style="color: #080;">&#41;</span>
&nbsp;
no.<span style="">na</span> <span style="color: #080;">&lt;-</span> <span style="color: #080;">!</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">na</span></span><span style="color: #080;">&#40;</span>Ozone<span style="color: #080;">&#41;</span> <span style="color: #080;">|</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">na</span></span><span style="color: #080;">&#40;</span>Temp<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
Ozone.2 <span style="color: #080;">&lt;-</span> Ozone<span style="color: #080;">&#91;</span>no.<span style="">na</span><span style="color: #080;">&#93;</span>
Temp.2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">jitter</span><span style="color: #080;">&#40;</span>Temp<span style="color: #080;">&#91;</span>no.<span style="">na</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span>Ozone ~ Temp, main <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;Predicting the 95% Ozone level according to Temperature&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># fitting the Quantile regression</span>
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>quantreg<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">abline</span><span style="color: #080;">&#40;</span>rq<span style="color: #080;">&#40;</span>Ozone ~ Temp, tau <span style="color: #080;">=</span> .95<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;red&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># fitting the Quantile LOESS</span>
<span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/04/Quantile.loess_.r.txt&quot;</span><span style="color: #080;">&#41;</span>
QL <span style="color: #080;">&lt;-</span> Quantile.<span style="">loess</span><span style="color: #080;">&#40;</span>Y <span style="color: #080;">=</span> Ozone.2, X <span style="color: #080;">=</span> Temp.2, 
							the.<span style="">quant</span> <span style="color: #080;">=</span> .95,
							window.<span style="">size</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">10</span>,
							window.<span style="">alignment</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;center&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">points</span><span style="color: #080;">&#40;</span>QL$y.<span style="">loess</span> ~ QL$x, type <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;l&quot;</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;green&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">legend</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;topleft&quot;</span>,<span style="color: #0000FF; font-weight: bold;">legend</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;95% Quantile regression&quot;</span>, <span style="color: #ff0000;">&quot;95% Quantile LOESS&quot;</span><span style="color: #080;">&#41;</span>, fill <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;red&quot;</span>,<span style="color: #ff0000;">&quot;green&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>
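
<p>To see why a single straight line struggles here, it can help to look at the empirical 95% quantile of Ozone within temperature bins. The following is only a rough sketch using base R; the split at 80 degrees and the 5-degree bins are my own illustrative choices and are not part of the Quantile LOESS function.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">
data(airquality)
# keep only the days where both Ozone and Temp were recorded
aq &lt;- airquality[!is.na(airquality$Ozone) &amp; !is.na(airquality$Temp), ]

# 95% quantile of Ozone on cooler vs. warmer days (the split at 80 is an assumption)
q95.by.side &lt;- tapply(aq$Ozone, aq$Temp &gt;= 80, quantile, probs = 0.95)
names(q95.by.side) &lt;- c("Temp &lt; 80", "Temp &gt;= 80")
print(q95.by.side)

# the same idea on a finer grid: 5-degree temperature bins
temp.bin &lt;- cut(aq$Temp, breaks = seq(55, 100, by = 5))
q95.by.bin &lt;- tapply(aq$Ozone, temp.bin, quantile, probs = 0.95)
print(round(q95.by.bin, 1))
</pre></td></tr></table></div>

<p>If the binned quantiles stay fairly flat on the cooler side and climb quickly on the warmer side, that is the kind of pattern a moving quantile window followed by LOESS can follow and a single line cannot.</p>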

<h3>Update: I changed the article&#8217;s name from LOWESS to LOESS</h3>
<p>After a considerate e-mail from <a href="http://dirk.eddelbuettel.com/blog/">Dirk Eddelbuettel</a>, I corrected myself and replaced LOWESS with LOESS throughout the article. Here&#8217;s an explanation of why I used LOWESS in the first place and why I corrected it:</p>
<p><strong>Dirk wrote to me:</strong></p>
<blockquote><p>You have a post entitled &#8216;quantile lowess&#8217; but you then (correctly) use loess.  Do you understand that there are two functions lowess() and loess()?<br />
The former is sort-of a predecessor but nobody but really old books still talks about it.  Google for (maybe) &#8216;Brian Ripley lowess loess&#8217; as he drove<br />
that point home a few times on r-help.</p></blockquote>
<p><strong>My answer was:</strong></p>
<blockquote><p>Thanks Dirk, [...]<br />
Regarding the loess != lowess, I noticed that this is indeed the case when I first wrote the post but I was in a predicament:<br />
On the one hand, LOESS is the more modern approach (and what I used in the script).  But on the other hand, LOWESS is what the original article&#8217;s authors were using.  I ended up deciding I would call it the way I did, but after reading what you wrote, I realized I made a mistake.<br />
I went through the article and corrected the lowess to loess, while also adding a paragraph to explain my reasoning.</p></blockquote>
<h3>Update: regarding the method being robust</h3>
<p>After Nicholas&#8217;s comment I went checking and came across an <a href="http://tolstoy.newcastle.edu.au/R/help/01c/0614.html">R-help thread</a> by <a href="http://stat.ethz.ch/~maechler/">Martin Maechler</a> explaining how to update my code from above so that the fit will be robust. Martin wrote (my notes are added in []):<br />
One gotcha [when comparing lowess to loess is] &#8211; particularly if you were used to the fact that lowess() by default is resistant to outliers {well, in many cases at least}: </p>
<ul>
<li>lowess() per default has &#8220;iter = 3&#8221; which means it uses 3 &#8220;robustifying&#8221; (also called &#8220;huberizing&#8221; for Huber (1960)) iterations.
</li>
<li>loess() on the other hand has an argument `family&#8217; with possible values &#8220;gaussian&#8221; and &#8220;symmetric&#8221; (can be abbreviated) where the *first* one is the default (unfortunately, in my opinion).</li>
</ul>
<p>I.e., loess() by default is not resistant/robust where as lowess() is. [...] I would however <strong>recommend using loess(&#8230;.., family = &#8220;sym&#8221;) routinely</strong>. </p>
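<p>To make Martin&#8217;s point concrete, here is a small sketch on simulated data (the data, the seed and the size of the outlier are all made up for illustration). It fits loess() twice, once with the default family and once with family = &#8220;symmetric&#8221;, after planting a single gross outlier in the middle of an otherwise linear trend.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">
set.seed(42)
x &lt;- seq(0, 10, length.out = 100)
y &lt;- 2 + 0.5 * x + rnorm(100, sd = 0.2)  # simulated linear trend plus noise
y[50] &lt;- y[50] + 20                       # one gross outlier

fit.gaussian  &lt;- loess(y ~ x)                        # default: family = "gaussian"
fit.symmetric &lt;- loess(y ~ x, family = "symmetric")  # robust re-weighting iterations

plot(x, y, main = "loess: default vs. family = \"symmetric\"")
lines(x, fitted(fit.gaussian), col = "red")
lines(x, fitted(fit.symmetric), col = "green")
legend("topleft", legend = c("gaussian (default)", "symmetric"),
       fill = c("red", "green"))
</pre></td></tr></table></div>

<p>In a setup like this the default fit gets pulled toward the outlier, while the &#8220;symmetric&#8221; fit should stay close to the underlying trend. For the Quantile LOESS code above, the same argument could presumably be passed on to the loess() call inside the function.</p>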
<p>*  *  *</p>
<p>If you find this code useful, please let me know about it in the comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/04/quantile-loess-combining-a-moving-quantile-window-with-loess-r-function/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Nutritional supplements efficacy score &#8211; Graphing plots of current studies results (using R)</title>
		<link>http://www.r-statistics.com/2010/02/nutritional-supplements-efficacy-score-graphing-plots-of-current-studies-results-using-r/</link>
		<comments>http://www.r-statistics.com/2010/02/nutritional-supplements-efficacy-score-graphing-plots-of-current-studies-results-using-r/#comments</comments>
		<pubDate>Thu, 25 Feb 2010 21:17:07 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[allergy research supplements]]></category>
		<category><![CDATA[amino acids supplements]]></category>
		<category><![CDATA[balloon]]></category>
		<category><![CDATA[balloon plot]]></category>
		<category><![CDATA[balloon plot R]]></category>
		<category><![CDATA[barplot]]></category>
		<category><![CDATA[benefits supplements]]></category>
		<category><![CDATA[capsules supplements]]></category>
		<category><![CDATA[dietary research]]></category>
		<category><![CDATA[effects supplements]]></category>
		<category><![CDATA[fibromyalgia research]]></category>
		<category><![CDATA[glucosamine research]]></category>
		<category><![CDATA[glucosamine supplements]]></category>
		<category><![CDATA[google excel]]></category>
		<category><![CDATA[google spread sheet]]></category>
		<category><![CDATA[google spreadsheet]]></category>
		<category><![CDATA[green tea research]]></category>
		<category><![CDATA[hair loss research]]></category>
		<category><![CDATA[herbal research]]></category>
		<category><![CDATA[herbs research]]></category>
		<category><![CDATA[herbs supplements]]></category>
		<category><![CDATA[immune system supplements]]></category>
		<category><![CDATA[liquid research]]></category>
		<category><![CDATA[liquid supplements]]></category>
		<category><![CDATA[magnesium research]]></category>
		<category><![CDATA[mineral research]]></category>
		<category><![CDATA[minerals research]]></category>
		<category><![CDATA[natural health supplements]]></category>
		<category><![CDATA[natural research]]></category>
		<category><![CDATA[nutritional research]]></category>
		<category><![CDATA[plot]]></category>
		<category><![CDATA[pregnancy supplements]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[side effects supplements]]></category>
		<category><![CDATA[sports nutrition supplements]]></category>
		<category><![CDATA[supplement research]]></category>
		<category><![CDATA[supplements body building]]></category>
		<category><![CDATA[supplements bodybuilding]]></category>
		<category><![CDATA[supplements dietary]]></category>
		<category><![CDATA[supplements foods]]></category>
		<category><![CDATA[supplements herbal]]></category>
		<category><![CDATA[supplements mineral]]></category>
		<category><![CDATA[supplements minerals]]></category>
		<category><![CDATA[supplements nutritional]]></category>
		<category><![CDATA[supplements products]]></category>
		<category><![CDATA[supplements protein]]></category>
		<category><![CDATA[supplements research]]></category>
		<category><![CDATA[take supplements]]></category>
		<category><![CDATA[taking supplements]]></category>
		<category><![CDATA[thyroid research]]></category>
		<category><![CDATA[vitamin b supplements]]></category>
		<category><![CDATA[vitamin c research]]></category>
		<category><![CDATA[vitamin c supplements]]></category>
		<category><![CDATA[vitamin d research]]></category>
		<category><![CDATA[vitamins discount]]></category>
		<category><![CDATA[vitamins minerals supplements]]></category>
		<category><![CDATA[vitamins research]]></category>
		<category><![CDATA[weight loss research]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=171</guid>
		<description><![CDATA[In this post I showcase a nice bar-plot and a balloon-plot listing recommended Nutritional supplements, according to how much evidence exists for their benefits; scroll down to see it (and click here for the data behind it) * * * * The gorgeous blog &#8220;Information Is Beautiful&#8221; recently published an eye candy post showing a “balloon race” image (see a static version of the image here) illustrating how much evidence exists for the benefits of various Nutritional supplements (such as: [...]]]></description>
			<content:encoded><![CDATA[<p>In this post I showcase a nice <strong>bar-plot and a balloon-plot listing recommended Nutritional supplements</strong>, according to how much evidence exists for their benefits; scroll down to see it (and <a href="http://spreadsheets.google.com/ccc?key=0Aqe2P9sYhZ2ndFRKaU1FaWVvOEJiV2NwZ0JHck12X1E&amp;hl=en_GB">click here</a> for the data behind it)<br />
*  *  *  *<br />
The gorgeous blog <a href="http://www.informationisbeautiful.net/">&#8220;Information Is Beautiful&#8221;</a> recently published an <a href="http://www.informationisbeautiful.net/play/snake-oil-supplements/">eye candy post</a> showing a “balloon race” image (see a static version of the image <a href="http://www.informationisbeautiful.net/visualizations/snake-oil-supplements/">here</a>) illustrating how much evidence exists for the benefits of various Nutritional supplements (such as: green tea, vitamins, herbs, pills and so on). The higher the bubble in the Y axis <del datetime="2010-03-06T11:34:54+00:00">score (e.g: the bubble size)</del> for the supplement, the greater the evidence there is for its effectiveness (but only for the conditions listed alongside the supplement).</p>
<p>There are two reasons this should be of interest to us:</p>
<ol>
<li>This shows a fun plot, that R currently doesn&#8217;t know how to do (at least I wasn&#8217;t able to find an implementation for it). So if anyone thinks of an easy way for making one &#8211; please let me know.</li>
<li>The data for the graph is openly (and freely) provided to all of us on <a href="http://spreadsheets.google.com/ccc?key=0Aqe2P9sYhZ2ndFRKaU1FaWVvOEJiV2NwZ0JHck12X1E&amp;hl=en_GB">this Google Doc</a>.</li>
</ol>
<p>Having the data on a Google Doc means that we can see when the data is updated. But more than that, it means we can easily extract the data into R and have our way with it (thanks to <a href="http://blog.revolution-computing.com/2009/09/how-to-use-a-google-spreadsheet-as-data-in-r.html">David Smith&#8217;s post</a> on the subject).</p>
<p>For example, I was wondering what are ALL of the top recommended Nutritional supplements, an answer that is not trivial to get from the plot that was in the <a href="http://www.informationisbeautiful.net/play/snake-oil-supplements/">original post</a>.</p>
<p>In this post I will supply two plots that present the data: A barplot (that in retrospect didn&#8217;t prove to be good enough) and a balloon-plot for a table (that seems to me to be much better).</p>
<p><strong>Barplot</strong><br />
(You can <strong>click the image to enlarge</strong> it)<br />
<a href="http://www.r-statistics.com/wp-content/uploads/2010/02/Nutritional-supplements-efficacy.png"><img class="alignnone size-full wp-image-172" title="Nutritional supplements efficacy" src="http://www.r-statistics.com/wp-content/uploads/2010/02/Nutritional-supplements-efficacy.png" alt="" width="550" /></a></p>
<p>The R code to produce the barplot of Nutritional supplements efficacy score (by evidence for its effectiveness on the listed condition).</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">&nbsp;
<span style="color: #228B22;"># loading the data</span>
supplements.<span style="">data</span>.0 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">read.<span style="">csv</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://spreadsheets.google.com/pub?key=0Aqe2P9sYhZ2ndFRKaU1FaWVvOEJiV2NwZ0JHck12X1E&amp;output=csv&quot;</span><span style="color: #080;">&#41;</span>
supplements.<span style="">data</span> <span style="color: #080;">&lt;-</span> supplements.<span style="">data</span>.0<span style="color: #080;">&#91;</span>supplements.<span style="">data</span>.0<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&gt;</span><span style="color: #ff0000;">2</span>,<span style="color: #080;">&#93;</span> <span style="color: #228B22;"># let's only look at &quot;good&quot; supplements</span>
supplements.<span style="">data</span> <span style="color: #080;">&lt;-</span> supplements.<span style="">data</span><span style="color: #080;">&#91;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">na</span></span><span style="color: #080;">&#40;</span>supplements.<span style="">data</span><span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>,<span style="color: #080;">&#93;</span> <span style="color: #228B22;"># and we don't want any missing data</span>
&nbsp;
supplement.<span style="">score</span> <span style="color: #080;">&lt;-</span> supplements.<span style="">data</span><span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>
ss <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span>supplement.<span style="">score</span>, decreasing  <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">F</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># sort our data</span>
supplement.<span style="">score</span> <span style="color: #080;">&lt;-</span> supplement.<span style="">score</span><span style="color: #080;">&#91;</span>ss<span style="color: #080;">&#93;</span>
supplement.<span style="">name</span> <span style="color: #080;">&lt;-</span> supplements.<span style="">data</span><span style="color: #080;">&#91;</span>ss, <span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
supplement.<span style="">benefits</span> <span style="color: #080;">&lt;-</span> supplements.<span style="">data</span><span style="color: #080;">&#91;</span>ss, <span style="color: #ff0000;">4</span><span style="color: #080;">&#93;</span>
supplement.<span style="">score</span>.<span style="">col</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">factor</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">as.<span style="">character</span></span><span style="color: #080;">&#40;</span>supplement.<span style="">score</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">levels</span><span style="color: #080;">&#40;</span>supplement.<span style="">score</span>.<span style="">col</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&lt;-</span>  <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;red&quot;</span>, <span style="color: #ff0000;">&quot;orange&quot;</span>, <span style="color: #ff0000;">&quot;blue&quot;</span>, <span style="color: #ff0000;">&quot;dark green&quot;</span><span style="color: #080;">&#41;</span>
	supplement.<span style="">score</span>.<span style="">col</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">as.<span style="">character</span></span><span style="color: #080;">&#40;</span>supplement.<span style="">score</span>.<span style="">col</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># mar: c(bottom, left, top, right) The default is c(5, 4, 4, 2) + 0.1.</span>
<span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>mar <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">5</span>,<span style="color: #ff0000;">9</span>,<span style="color: #ff0000;">4</span>,<span style="color: #ff0000;">13</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># taking care of the plot margins</span>
bar.<span style="">y</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">barplot</span><span style="color: #080;">&#40;</span>supplement.<span style="">score</span>, names.<span style="">arg</span><span style="color: #080;">=</span> supplement.<span style="">name</span>, las <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span>, horiz <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> supplement.<span style="">score</span>.<span style="">col</span>, xlim <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">6.2</span><span style="color: #080;">&#41;</span>,
				main <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Nutritional supplements efficacy score&quot;</span>,<span style="color: #ff0000;">&quot;(by evidence for its effectiveness on the listed condition)&quot;</span>, <span style="color: #ff0000;">&quot;(2010)&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">axis</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">4</span>, <span style="color: #0000FF; font-weight: bold;">labels</span> <span style="color: #080;">=</span> supplement.<span style="">benefits</span>, at <span style="color: #080;">=</span> bar.<span style="">y</span>, las <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># Add right axis</span>
<span style="color: #0000FF; font-weight: bold;">abline</span><span style="color: #080;">&#40;</span>h <span style="color: #080;">=</span> bar.<span style="">y</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> supplement.<span style="">score</span>.<span style="">col</span> , lty <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># add some lines so to easily follow each bar</span></pre></td></tr></table></div>

<p>Also, the nice thing is that if the guys at Information Is Beautiful update their data, we can easily rerun the code and see the updated list of recommended supplements.</p>
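<p>One caveat about the color mapping in the barplot code: assigning four color names through levels() works as intended only while the data holds exactly four distinct scores. If the spreadsheet changes, a mapping that adapts to the number of distinct scores may be safer. Here is a possible sketch; toy.score is a made-up vector standing in for supplement.score, and the other names are hypothetical as well.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">
# hypothetical scores standing in for supplement.score
toy.score &lt;- c(3, 3, 4, 6, 5, 4)

# one color per distinct score, interpolated between the four original colors
score.levels  &lt;- sort(unique(toy.score))
make.palette  &lt;- colorRampPalette(c("red", "orange", "blue", "dark green"))
level.colors  &lt;- make.palette(length(score.levels))

# look up each observation's color by the position of its score
toy.score.col &lt;- level.colors[match(toy.score, score.levels)]

barplot(toy.score, horiz = TRUE, las = 1, col = toy.score.col)
</pre></td></tr></table></div>

<p>With four distinct scores this gives back the same four colors as before, and with more or fewer it interpolates between them instead of failing or shifting the colors.</p>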
<p><strong>Balloon plot</strong><br />
So after some web surfing I came across an implementation of a balloon plot in R (thanks to the <a href="http://addictedtor.free.fr/graphiques/graphcode.php?graph=60">R graph gallery</a>).<br />
There were two problems with using the command out of the box. The first was that the colors were uninformative (easily fixed); the second was that the X labels were overlapping one another. Since there is no &#8220;las&#8221; parameter in the function, I just opened the function up, found where the labels are plotted and changed it manually (a bit messy, but that&#8217;s what you have to do sometimes&#8230;)</p>
<p>Here is the result (you can click the image for a larger version):</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/02/balloonplot.png"><img src="http://www.r-statistics.com/wp-content/uploads/2010/02/balloonplot.png" alt="" title="balloonplot" width="550" class="alignnone size-full wp-image-199" /></a></p>
<p>And here is the R code to produce the balloon plot of Nutritional supplements efficacy score (by evidence for its effectiveness on the listed condition).<br />
(It&#8217;s just a copy of the function with a tiny bit of editing in line 146, and then using it.)</p>
<p><span id="more-171"></span></p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">&nbsp;
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>colorspace<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>gplots<span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># I was able to find the function by using</span>
<span style="color: #228B22;"># methods(balloonplot)[1]</span>
<span style="color: #228B22;"># This command: getAnywhere(&quot;balloonplot.default&quot;) # Wouldn't work...</span>
balloonplot2 <span style="color: #080;">&lt;-</span> gplots<span style="color: #080;">:::</span><span style="">balloonplot</span>.<span style="">default</span> <span style="color: #228B22;"># This one works :)</span>
&nbsp;
<span style="color: #228B22;"># now run:</span>
<span style="color: #0000FF; font-weight: bold;">fix</span><span style="color: #080;">&#40;</span>balloonplot2<span style="color: #080;">&#41;</span>
<span style="color: #228B22;"># search for </span>
<span style="color: #228B22;"># y &lt;- ny + 0.75 + (nlabels.x - i + 0.5) * colmar</span>
<span style="color: #228B22;"># And add beneath it the following line:</span>
<span style="color: #228B22;"># y &lt;- rep(y, dim(xlabs)[1]) - c(0,.5,1)</span>
&nbsp;
supplement.<span style="">benefits</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">tolower</span><span style="color: #080;">&#40;</span>supplement.<span style="">benefits</span> <span style="color: #080;">&#41;</span>
supplement.<span style="">name</span>		<span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">tolower</span><span style="color: #080;">&#40;</span>supplement.<span style="">name</span><span style="color: #080;">&#41;</span>
&nbsp;
balloonplot2<span style="color: #080;">&#40;</span> supplement.<span style="">name</span>,supplement.<span style="">benefits</span>, supplement.<span style="">score</span>, xlab <span style="color: #080;">=</span><span style="color: #ff0000;">&quot;supplement&quot;</span>, ylab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Benefit&quot;</span>,
			show.<span style="">margins</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">FALSE</span>, dotsize <span style="color: #080;">=</span> <span style="color: #ff0000;">15</span>,fun<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span>x,na.<span style="">rm</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">TRUE</span><span style="color: #080;">&#41;</span>,
			rowmar <span style="color: #080;">=</span> <span style="color: #ff0000;">7</span>,
			colmar <span style="color: #080;">=</span> <span style="color: #ff0000;">7</span>,
			dotcolor <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">rev</span><span style="color: #080;">&#40;</span>heat_hcl<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span> supplement.<span style="">score</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span> supplement.<span style="">score</span><span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>,
			main <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Balloon plot of&quot;</span>, <span style="color: #ff0000;">&quot;Nutritional supplements efficacy score&quot;</span>,<span style="color: #ff0000;">&quot;(by evidence for its effectiveness on the listed condition)&quot;</span>, <span style="color: #ff0000;">&quot;(2010)&quot;</span><span style="color: #080;">&#41;</span>,
			<span style="color: #0000FF; font-weight: bold;">sub</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Published on www.r-statistics.com&quot;</span><span style="color: #080;">&#41;</span>				
			<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>Got any good ideas of how else to plot the data? Let me know in the comments <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/02/nutritional-supplements-efficacy-score-graphing-plots-of-current-studies-results-using-r/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
		<item>
		<title>Siegel-Tukey: a Non-parametric test for equality in variability (R code)</title>
		<link>http://www.r-statistics.com/2010/02/siegel-tukey-a-non-parametric-test-for-equality-in-variability-r-code/</link>
		<comments>http://www.r-statistics.com/2010/02/siegel-tukey-a-non-parametric-test-for-equality-in-variability-r-code/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 21:13:51 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[non-parametric]]></category>
		<category><![CDATA[non-parametric test]]></category>
		<category><![CDATA[nonparametric]]></category>
		<category><![CDATA[nonparametric test]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[Siegel]]></category>
		<category><![CDATA[Siegel-Tukey]]></category>
		<category><![CDATA[Tukey]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=161</guid>
		<description><![CDATA[Daniel Malter just shared on the R mailing list (link to the thread) his code for performing the Siegel-Tukey (Nonparametric) test for equality in variability. Excited about the find, I contacted Daniel asking if I could republish his code here, and he kindly replied &#8220;yes&#8221;. From here on I copy his note at full. p.s: (The R function can be downloaded from here) * * * * Hi, I recently ran into the problem that I needed a Siegel-Tukey test [...]]]></description>
			<content:encoded><![CDATA[<p>Daniel Malter just shared on the R mailing list (<a href="http://n4.nabble.com/Siegel-Tukey-test-for-equal-variability-code-td1565053.html">link to the thread</a>) his code for performing the Siegel-Tukey (Nonparametric) test for equality in variability.<br />
Excited about the find, I contacted Daniel asking if I could republish his code here, and he kindly replied &#8220;yes&#8221;.<br />
From here on I copy his note in full.</p>
<p>p.s: (The R function can be <a href="http://www.r-statistics.com/wp-content/uploads/2010/02/siegel-tukey-non-parametric-test-for-equal-variance.r.txt">downloaded from here</a>)</p>
<p>*  *  *  *<br />
<span id="more-161"></span></p>
<p>Hi, I recently ran into the problem that I needed a Siegel-Tukey test for equal variability based on ranks. Maybe there is a package that has it implemented, but I could not find it. So I programmed an R function to do it. The Siegel-Tukey test requires to recode the ranks so that they express variability rather than ascending order. This is essentially what the code further below does. After the rank  transformation, a regular Mann-Whitney U test is applied. The &#8220;manual&#8221; and code are pasted below.</p>
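<p>(A side note that is not part of Daniel&#8217;s text: the rank recoding he describes can be sketched in a few lines. Rank 1 goes to the smallest value, ranks 2 and 3 to the two largest, ranks 4 and 5 to the next two smallest, and so on, so that extreme values on either side get low ranks. The functions siegel.tukey.ranks and siegel.tukey.sketch below are hypothetical names of my own, the sketch ignores ties, and it is not a replacement for the full function.)</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">
# Siegel-Tukey ranks for n sorted positions (position 1 = smallest value)
siegel.tukey.ranks &lt;- function(n) {
  ranks &lt;- numeric(n)
  lo &lt;- 1; hi &lt;- n      # next free position at each end
  from.low &lt;- TRUE      # which end receives the next rank(s)
  k &lt;- 1                # next rank to hand out
  take &lt;- 1             # the first step takes one position, later steps take two
  while (lo &lt;= hi) {
    for (j in seq_len(take)) {
      if (lo &gt; hi) break
      if (from.low) {
        ranks[lo] &lt;- k; lo &lt;- lo + 1
      } else {
        ranks[hi] &lt;- k; hi &lt;- hi - 1
      }
      k &lt;- k + 1
    }
    from.low &lt;- !from.low
    take &lt;- 2
  }
  ranks
}

# pool both groups, recode the ranks, then run a Mann-Whitney U test on them
siegel.tukey.sketch &lt;- function(x, y) {
  pooled &lt;- c(x, y)
  group  &lt;- rep(c(1, 2), c(length(x), length(y)))
  st.rank &lt;- numeric(length(pooled))
  st.rank[order(pooled)] &lt;- siegel.tukey.ranks(length(pooled))
  wilcox.test(st.rank[group == 1], st.rank[group == 2])
}

# toy data: same center, very different spread
tight &lt;- c(-1, 1, -0.5, 0.5)
wide  &lt;- c(-10, 10, -8, 8)
st.result &lt;- siegel.tukey.sketch(tight, wide)
print(st.result)
</pre></td></tr></table></div>

<p>In the toy data the wide group collects all the low ranks, which is what the Mann-Whitney test then picks up as a difference in variability.</p>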
<p><strong><span style="text-decoration: underline;">Description</span></strong>: Non-parametric Siegel-Tukey test for equality in variability. The null hypothesis is that the variability of x is equal between two groups. A rejection of the null indicates that variability differs between the two groups.</p>
<p><strong><span style="text-decoration: underline;">Usage:</span></strong></p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">siegel.<span style="">tukey</span><span style="color: #080;">&#40;</span>x,y,id.<span style="">col</span><span style="color: #080;">=</span>FALSE,adjust.<span style="">median</span><span style="color: #080;">=</span>FALSE,rnd<span style="color: #080;">=-</span><span style="color: #ff0000;">1</span>, ...<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p><strong><span style="text-decoration: underline;">Arguments:</span></strong></p>
<p>x: a vector of data</p>
<p>y: Data of the second group (if id.col=FALSE) or group indicator (if id.col=TRUE). In the latter case, y MUST take 1 or 2 to indicate observations of group 1 and 2, respectively, and x must contain the data for both groups.</p>
<p>id.col: If FALSE (default), then x and y are the data columns for group 1 and 2, respectively. If TRUE, y is the group indicator.</p>
<p>adjust.median: Should between-group differences in medians be leveled before performing the test? In certain cases, the Siegel-Tukey test is susceptible to median differences and may indicate significant differences in variability that, in reality, stem from differences in medians.</p>
<p>rnd: Should the data be rounded and, if so, to which decimal? The default (-1) uses the data as is. Otherwise, rnd must be a non-negative integer. Typically, this option is not needed. However, occasionally, differences in the precision with which certain functions return values cause the merging of two data frames to fail within the siegel.tukey function. Only then is rounding necessary. This operation should not be performed if it affects the ranks of observations.</p>
<p>&#8230; arguments passed on to the Wilcoxon test. See ?wilcox.test</p>
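<p>The merge failure mentioned under rnd comes from exact comparison of floating-point numbers. The following is a small sketch of the effect with made-up values; the data frames <code>left</code> and <code>right</code> only imitate the column names used inside the function and are not taken from it.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">a = 0.1 + 0.2
b = 0.3
a == b                        # FALSE: the two doubles differ in the last bits

# merge() matches a single numeric key exactly, so no rows are paired up
left = data.frame(x = a, y = 1)
right = data.frame(unqs = b, corr.rks = 2.5)
nrow(merge(left, right, by.x = "x", by.y = "unqs"))   # 0

# Rounding both keys to 8 decimals makes them match again
left$x = round(left$x, 8)
right$unqs = round(right$unqs, 8)
nrow(merge(left, right, by.x = "x", by.y = "unqs"))   # 1</pre></td></tr></table></div>

<p>Rounding to a fairly large number of decimals, as here, removes this kind of noise without changing the order of values that really differ.</p>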
<p><strong><span style="text-decoration: underline;">Value</span></strong>: Among other output, the function returns rank sums for the two groups, the associated Wilcoxon&#8217;s W, and the p-value for a Wilcoxon test on tie-adjusted Siegel-Tukey ranks (i.e., it performs and returns a Siegel-Tukey test). If significant, the group with the smaller rank sum has greater variability.</p>
<p><strong><span style="text-decoration: underline;">References</span></strong>: Sidney Siegel and John Wilder Tukey (1960) &#8220;A nonparametric sum of ranks procedure for relative spread in unpaired samples.&#8221; Journal of the American Statistical Association. See also, David J. Sheskin (2004) &#8220;Handbook of parametric and nonparametric statistical procedures.&#8221; 3rd edition. Chapman and Hall/CRC. Boca Raton, FL.</p>
<p><strong><span style="text-decoration: underline;">Notes</span></strong>: The Siegel-Tukey test has relatively low power and may, under certain conditions, indicate significance due to differences in medians rather than differences in variability (consider using the argument adjust.median).</p>
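<p>As a rough illustration of what leveling the medians means, the sketch below shifts two made-up groups (<code>g1</code> and <code>g2</code> are not from the post) toward each other by half the median difference. It shows the idea behind adjust.median rather than the function's exact code.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">g1 = c(10, 12, 14, 16, 18)   # median 14
g2 = c(1, 2, 3, 4, 5)        # median 3

# Shift each group by half the median difference, in opposite directions
med.diff = median(g1) - median(g2)
g1.adj = g1 - med.diff / 2
g2.adj = g2 + med.diff / 2

median(g1.adj)   # 8.5
median(g2.adj)   # 8.5

# The shift leaves the within-group spread untouched
diff(range(g1)) == diff(range(g1.adj))   # TRUE</pre></td></tr></table></div>

<p>Because only the location changes, any remaining difference in the Siegel-Tukey ranks reflects spread rather than a shift between the groups.</p>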
<p><strong><span style="text-decoration: underline;">Output</span></strong> (in this order)</p>
<p style="padding-left: 30px;">1. Group medians<br />
2. Wilcoxon test for between-group differences in median (after the median adjustment if specified)<br />
3. Unique values of x and their tie-adjusted Siegel-Tukey ranks<br />
4. Xs of group 1 and their tie-adjusted Siegel-Tukey ranks<br />
5. Xs of group 2 and their tie-adjusted Siegel-Tukey ranks<br />
6. Siegel-Tukey test (Wilcoxon test on tie-adjusted Siegel-Tukey ranks)</p>
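<p>Step 6 is an ordinary Wilcoxon rank sum test applied to the recoded ranks. A minimal sketch, assuming the tie-adjusted ranks for the example data in this post have already been computed (they are typed in by hand here):</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Tie-adjusted Siegel-Tukey ranks of the example groups x and y
rk1 = c(8.5, 8.5, 11.5, 11.5, 8.5, 8.5)
rk2 = c(2.5, 2.5, 5, 6, 2.5, 2.5)

sum(rk1)   # 57
sum(rk2)   # 21

# Same settings as the function's defaults: normal approximation with continuity correction
res = wilcox.test(rk1, rk2, exact = FALSE, correct = TRUE)
res$statistic   # W = 36, since every rank in rk1 exceeds every rank in rk2
res$p.value</pre></td></tr></table></div>

<p>Group 2 has the smaller rank sum, so a significant result here points to group 2 as the more variable group.</p>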
<p><strong>And here is the code:</strong></p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">siegel.<span style="">tukey</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x,y,id.<span style="">col</span><span style="color: #080;">=</span>FALSE,adjust.<span style="">median</span><span style="color: #080;">=</span>FALSE,rnd<span style="color: #080;">=-</span><span style="color: #ff0000;">1</span>,alternative<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;two.sided&quot;</span>,mu<span style="color: #080;">=</span><span style="color: #ff0000;">0</span>,paired<span style="color: #080;">=</span>FALSE,exact<span style="color: #080;">=</span>FALSE,correct<span style="color: #080;">=</span>TRUE,conf.<span style="">int</span><span style="color: #080;">=</span>FALSE,conf.<span style="">level</span><span style="color: #080;">=</span><span style="color: #ff0000;">0.95</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
 <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>id.<span style="">col</span><span style="color: #080;">==</span>FALSE<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
   <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
   <span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
	<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span>
   <span style="color: #080;">&#125;</span>
 <span style="color: #0000FF; font-weight: bold;">names</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#41;</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;x&quot;</span>,<span style="color: #ff0000;">&quot;y&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#41;</span>,<span style="color: #080;">&#93;</span>
 <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>rnd<span style="color: #080;">&gt;-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x,rnd<span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>
&nbsp;
 <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>adjust.<span style="">median</span><span style="color: #080;">==</span><span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
	<span style="color: #228B22;">#Compute the median difference once, then shift each group by half of it</span>
	med.<span style="">diff</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">median</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">-</span><span style="color: #0000FF; font-weight: bold;">median</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">-</span>med.<span style="">diff</span><span style="color: #080;">/</span><span style="color: #ff0000;">2</span>
	<span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">+</span>med.<span style="">diff</span><span style="color: #080;">/</span><span style="color: #ff0000;">2</span>
	<span style="color: #228B22;">#Re-sort, because the shift can change the order of the observations</span>
	<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">order</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#41;</span>,<span style="color: #080;">&#93;</span>
 <span style="color: #080;">&#125;</span>
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Median of group 1 = &quot;</span>,<span style="color: #0000FF; font-weight: bold;">median</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Median of group 2 = &quot;</span>,<span style="color: #0000FF; font-weight: bold;">median</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Test of median differences&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">wilcox.<span style="">test</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>,<span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
 a<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">seq</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">ceiling</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">4</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,each<span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>
 b<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">ceiling</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">4</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
 rk.<span style="">up</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,<span style="color: #080;">&#40;</span>a<span style="color: #080;">*</span><span style="color: #ff0000;">4</span><span style="color: #080;">+</span>b<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #0000FF; font-weight: bold;">ceiling</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>
 rk.<span style="">down</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rev</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>a<span style="color: #080;">*</span><span style="color: #ff0000;">4</span><span style="color: #080;">+</span>b<span style="color: #080;">-</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #0000FF; font-weight: bold;">floor</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
&nbsp;
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Performing Siegel-Tukey rank transformation...&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
 rks<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>rk.<span style="">up</span>,rk.<span style="">down</span><span style="color: #080;">&#41;</span>
 unqs<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">unique</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">sort</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
 corr.<span style="">rks</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">tapply</span><span style="color: #080;">&#40;</span>rks,<span style="color: #0000FF; font-weight: bold;">data</span>$x,<span style="color: #0000FF; font-weight: bold;">mean</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span>unqs,corr.<span style="">rks</span><span style="color: #080;">&#41;</span>
 rks.<span style="">data</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>unqs,corr.<span style="">rks</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">names</span><span style="color: #080;">&#40;</span>rks.<span style="">data</span><span style="color: #080;">&#41;</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;unique values of x&quot;</span>,<span style="color: #ff0000;">&quot;tie-adjusted Siegel-Tukey rank&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>rks.<span style="">data</span>,<span style="color: #0000FF; font-weight: bold;">row.<span style="">names</span></span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">F</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">names</span><span style="color: #080;">&#40;</span>rks.<span style="">data</span><span style="color: #080;">&#41;</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;unqs&quot;</span>,<span style="color: #ff0000;">&quot;corr.rks&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">merge</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>,rks.<span style="">data</span>,by.<span style="">x</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;x&quot;</span>,by.<span style="">y</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;unqs&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
 rk1<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data</span>$corr.<span style="">rks</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
 rk2<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data</span>$corr.<span style="">rks</span><span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>,<span style="color: #ff0000;">&quot;Tie-adjusted Siegel-Tukey ranks of group 1&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
 group1<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>,rk1<span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">names</span><span style="color: #080;">&#40;</span>group1<span style="color: #080;">&#41;</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;x&quot;</span>,<span style="color: #ff0000;">&quot;rank&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>group1,<span style="color: #0000FF; font-weight: bold;">row.<span style="">names</span></span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">F</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>,<span style="color: #ff0000;">&quot;Tie-adjusted Siegel-Tukey ranks of group 2&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
 group2<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>$x<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">data</span>$y<span style="color: #080;">==</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>,rk2<span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">names</span><span style="color: #080;">&#40;</span>group2<span style="color: #080;">&#41;</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;x&quot;</span>,<span style="color: #ff0000;">&quot;rank&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>group2,<span style="color: #0000FF; font-weight: bold;">row.<span style="">names</span></span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">F</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Siegel-Tukey test&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Siegel-Tukey rank transformation performed.&quot;</span>,<span style="color: #ff0000;">&quot;Tie adjusted ranks computed.&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
 <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>adjust.<span style="">median</span><span style="color: #080;">==</span><span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span><span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Medians adjusted to equality.&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span><span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Medians not adjusted.&quot;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>
 <span style="color: #0000FF; font-weight: bold;">cat</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Rank sum of group 1 =&quot;</span>, <span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span>rk1<span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">&quot;    Rank sum of group 2 =&quot;</span>,<span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span>rk2<span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
 <span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">wilcox.<span style="">test</span></span><span style="color: #080;">&#40;</span>rk1,rk2,alternative<span style="color: #080;">=</span>alternative,mu<span style="color: #080;">=</span>mu,paired<span style="color: #080;">=</span>paired,exact<span style="color: #080;">=</span>exact,correct<span style="color: #080;">=</span>correct,conf.<span style="">int</span><span style="color: #080;">=</span>conf.<span style="">int</span>,conf.<span style="">level</span><span style="color: #080;">=</span>conf.<span style="">level</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;">#Example:</span>
&nbsp;
x<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">4</span>,<span style="color: #ff0000;">4</span>,<span style="color: #ff0000;">5</span>,<span style="color: #ff0000;">5</span>,<span style="color: #ff0000;">6</span>,<span style="color: #ff0000;">6</span><span style="color: #080;">&#41;</span>
y<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">9</span>,<span style="color: #ff0000;">10</span>,<span style="color: #ff0000;">10</span><span style="color: #080;">&#41;</span>
&nbsp;
siegel.<span style="">tukey</span><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<pre>
<strong>Here is the code output:</strong>
<div id="_mcePaste" style="padding-left: 30px;">Median of group 1 =  5</div>
<div id="_mcePaste" style="padding-left: 30px;">Median of group 2 =  5</div>
<div id="_mcePaste" style="padding-left: 30px;">Test of median differences</div>
<div id="_mcePaste" style="padding-left: 30px;">Wilcoxon rank sum test with continuity correction</div>
<div id="_mcePaste" style="padding-left: 30px;">data:  data$x[data$y == 1] and data$x[data$y == 2]</div>
<div id="_mcePaste" style="padding-left: 30px;">W = 18, p-value = 1</div>
<div id="_mcePaste" style="padding-left: 30px;">alternative hypothesis: true location shift is not equal to 0</div>
<div id="_mcePaste" style="padding-left: 30px;">Performing Siegel-Tukey rank transformation...</div>
<div id="_mcePaste" style="padding-left: 30px;">unique values of x tie-adjusted Siegel-Tukey rank</div>
<div id="_mcePaste" style="padding-left: 30px;">0                            2.5</div>
<div id="_mcePaste" style="padding-left: 30px;">1                            5.0</div>
<div id="_mcePaste" style="padding-left: 30px;">4                            8.5</div>
<div id="_mcePaste" style="padding-left: 30px;">5                           11.5</div>
<div id="_mcePaste" style="padding-left: 30px;">6                            8.5</div>
<div id="_mcePaste" style="padding-left: 30px;">9                            6.0</div>
<div id="_mcePaste" style="padding-left: 30px;">10                            2.5</div>
<div id="_mcePaste" style="padding-left: 30px;">Tie-adjusted Siegel-Tukey ranks of group 1</div>
<div id="_mcePaste" style="padding-left: 30px;">x rank</div>
<div id="_mcePaste" style="padding-left: 30px;">4  8.5</div>
<div id="_mcePaste" style="padding-left: 30px;">4  8.5</div>
<div id="_mcePaste" style="padding-left: 30px;">5 11.5</div>
<div id="_mcePaste" style="padding-left: 30px;">5 11.5</div>
<div id="_mcePaste" style="padding-left: 30px;">6  8.5</div>
<div id="_mcePaste" style="padding-left: 30px;">6  8.5</div>
<div id="_mcePaste" style="padding-left: 30px;">Tie-adjusted Siegel-Tukey ranks of group 2</div>
<div id="_mcePaste" style="padding-left: 30px;">x rank</div>
<div id="_mcePaste" style="padding-left: 30px;">0  2.5</div>
<div id="_mcePaste" style="padding-left: 30px;">0  2.5</div>
<div id="_mcePaste" style="padding-left: 30px;">1  5.0</div>
<div id="_mcePaste" style="padding-left: 30px;">9  6.0</div>
<div id="_mcePaste" style="padding-left: 30px;">10  2.5</div>
<div id="_mcePaste" style="padding-left: 30px;">10  2.5</div>
<div id="_mcePaste" style="padding-left: 30px;">Siegel-Tukey test</div>
<div id="_mcePaste" style="padding-left: 30px;">Siegel-Tukey rank transformation performed. Tie adjusted ranks computed.</div>
<div id="_mcePaste" style="padding-left: 30px;">Medians not adjusted.</div>
<div id="_mcePaste" style="padding-left: 30px;">Rank sum of group 1 = 57     Rank sum of group 2 = 21</div>
<div id="_mcePaste" style="padding-left: 30px;">Wilcoxon rank sum test with continuity correction</div>
<div id="_mcePaste" style="padding-left: 30px;">data:  rk1 and rk2</div>
<div id="_mcePaste" style="padding-left: 30px;">W = 36, p-value = 0.003601</div>
<div id="_mcePaste" style="padding-left: 30px;">alternative hypothesis: true location shift is not equal to 0</div>
<div id="_mcePaste" style="padding-left: 30px;">Warning message:</div>
<div id="_mcePaste" style="padding-left: 30px;">In wilcox.test.default(data$x[data$y == 1], data$x[data$y == y]) :</div>
<div id="_mcePaste" style="padding-left: 30px;">cannot compute exact p-value with ties</div>
</pre>
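<p>The tie-adjusted ranks and the final W statistic in the output above can be checked by hand. The following is only a minimal sketch, not the siegel.tukey function itself: the helper name siegel_tukey_ranks is made up for this illustration, and it assumes the usual ranking scheme (rank 1 goes to the smallest value, ranks 2 and 3 to the two largest, ranks 4 and 5 to the next two smallest, and so on), after which tied values share their average rank.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Hypothetical helper (not part of the downloadable function):
# Siegel-Tukey ranks for a pooled sample, averaged over ties.
siegel_tukey_ranks &lt;- function(values) {
  n &lt;- length(values)
  lo &lt;- 1                 # next unvisited position from the low end
  hi &lt;- n                 # next unvisited position from the high end
  visit &lt;- integer(0)     # sorted positions, in the order they receive ranks
  from_low &lt;- TRUE
  take &lt;- 1               # the first step takes one value, later steps take two
  while (lo &lt;= hi) {
    for (i in seq_len(take)) {
      if (lo &gt; hi) break
      if (from_low) {
        visit &lt;- c(visit, lo); lo &lt;- lo + 1
      } else {
        visit &lt;- c(visit, hi); hi &lt;- hi - 1
      }
    }
    from_low &lt;- !from_low
    take &lt;- 2
  }
  raw &lt;- numeric(n)
  raw[order(values)[visit]] &lt;- seq_len(n)  # the k-th visited value gets rank k
  ave(raw, values)                          # average the ranks of tied values
}

x &lt;- c(4, 4, 5, 5, 6, 6)
y &lt;- c(0, 0, 1, 9, 10, 10)
st &lt;- siegel_tukey_ranks(c(x, y))
rk1 &lt;- st[seq_along(x)]
rk2 &lt;- st[length(x) + seq_along(y)]

sum(rk1)                                     # rank sum of group 1
sum(rk2)                                     # rank sum of group 2
sum(rk1) - length(x) * (length(x) + 1) / 2   # W of wilcox.test(rk1, rk2)</pre></td></tr></table></div>

<p>For the example data these three values are 57, 21 and 36, which agree with the rank sums and the W statistic printed above.</p>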
<p>(The R function can be <a href="http://www.r-statistics.com/wp-content/uploads/2010/02/siegel-tukey-non-parametric-test-for-equal-variance.r.txt">downloaded from here</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/02/siegel-tukey-a-non-parametric-test-for-equality-in-variability-r-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Post hoc analysis for Friedman&#8217;s Test  (R code)</title>
		<link>http://www.r-statistics.com/2010/02/post-hoc-analysis-for-friedmans-test-r-code/</link>
		<comments>http://www.r-statistics.com/2010/02/post-hoc-analysis-for-friedmans-test-r-code/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 09:08:14 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[ANOVA]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[friedman test]]></category>
		<category><![CDATA[friedman's test]]></category>
		<category><![CDATA[multiple comparisons]]></category>
		<category><![CDATA[nonparametric]]></category>
		<category><![CDATA[nonparametric test]]></category>
		<category><![CDATA[one way anova]]></category>
		<category><![CDATA[post hoc]]></category>
		<category><![CDATA[post hoc analysis]]></category>
		<category><![CDATA[posthoc]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[repeated measures]]></category>
		<category><![CDATA[repeated measures anova]]></category>
		<category><![CDATA[test]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=150</guid>
		<description><![CDATA[My goal in this post is to give an overview of Friedman&#8217;s Test and then offer R code to perform post hoc analysis on Friedman&#8217;s Test results. (The R function can be downloaded from here) Preface: What is Friedman&#8217;s Test Friedman test is a non-parametric randomized block analysis of variance. Which is to say it is a non-parametric version of a one way ANOVA with repeated measures. That means that while a simple ANOVA test requires the assumptions of a [...]]]></description>
			<content:encoded><![CDATA[<p>My goal in this post is to give an overview of Friedman&#8217;s Test and then offer R code to perform post hoc analysis on Friedman&#8217;s Test results. (The R function can be <a href="http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt">downloaded from here</a>)</p>
<h3>Preface: What is Friedman&#8217;s Test</h3>
<p>The <strong>Friedman test</strong> is a non-parametric randomized block analysis of variance. That is to say, it is a non-parametric version of a one-way ANOVA with repeated measures. While a simple ANOVA test requires the assumptions of a normal distribution and equal variances (of the residuals), the Friedman test is free from those restrictions. The price of this parametric freedom is a loss of power (of Friedman&#8217;s test compared to the parametric ANOVA versions).</p>
<p>The hypotheses for the comparison across repeated measures are:</p>
<ul>
<li>H0: The distributions (whatever they are) are the same across repeated measures</li>
<li>H1: The distributions across repeated measures are different</li>
</ul>
<p>Under H0, the test statistic of Friedman&#8217;s test approximately follows a Chi-square distribution with [(number of repeated measures)-1] degrees of freedom. A detailed explanation of the method for computing the Friedman test is available <a href="http://en.wikipedia.org/wiki/Friedman_test">on Wikipedia</a>.</p>
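<p>To make the statistic concrete, here is a minimal sketch that computes it by hand. The data are made up for illustration (6 subjects, each measured under 3 conditions), and the formula is the textbook form without a tie correction, so the sketch assumes there are no ties within a subject.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Made-up data: rows are subjects (blocks), columns are repeated measures
scores &lt;- matrix(c(51, 63, 72,
                   48, 59, 75,
                   55, 52, 69,
                   60, 64, 61,
                   49, 68, 73,
                   53, 61, 79),
                 nrow = 6, byrow = TRUE,
                 dimnames = list(paste0("s", 1:6), c("A", "B", "C")))

n &lt;- nrow(scores)                    # number of blocks
k &lt;- ncol(scores)                    # number of repeated measures
ranks &lt;- t(apply(scores, 1, rank))   # rank the k values within each subject
R &lt;- colSums(ranks)                  # rank sum of each repeated measure

# Friedman statistic (no tie correction) and its Chi-square p-value
Q &lt;- 12 / (n * k * (k + 1)) * sum(R^2) - 3 * n * (k + 1)
p.value &lt;- pchisq(Q, df = k - 1, lower.tail = FALSE)
c(statistic = Q, df = k - 1, p.value = p.value)</pre></td></tr></table></div>

<p>Because the made-up data contain no ties, the result should agree with what friedman.test(scores) reports.</p>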
<p><strong>Performing Friedman&#8217;s Test in R</strong> is very simple: use the &#8220;friedman.test&#8221; command.</p>
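<p>As a short illustration, friedman.test accepts a formula of the form response ~ group | block together with a long data frame. The data below are made up, and the column names score, treatment and subject are chosen only for this example.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Made-up long data: one row per (subject, treatment) combination
long &lt;- data.frame(
  score     = c(51, 48, 55, 60, 49, 53,    # treatment A, subjects 1 to 6
                63, 59, 52, 64, 68, 61,    # treatment B
                72, 75, 69, 61, 73, 79),   # treatment C
  treatment = factor(rep(c("A", "B", "C"), each = 6)),
  subject   = factor(rep(1:6, times = 3))
)

res &lt;- friedman.test(score ~ treatment | subject, data = long)
res$statistic   # Friedman chi-squared
res$parameter   # degrees of freedom: (number of treatments) - 1
res$p.value</pre></td></tr></table></div>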
<h3>Post hoc analysis for the Friedman&#8217;s Test</h3>
<p>Assuming you performed Friedman&#8217;s Test and found a significant P value, some of the groups in your data have different distributions from one another, but you don&#8217;t (yet) know which. Therefore, our next step is to find out which pairs of groups differ significantly from each other. But when we have N groups, checking all of their pairs means performing [N choose 2] = N(N-1)/2 comparisons, so the need to correct for multiple comparisons arises.<br />
<strong>The tasks:</strong><br />
<strong>Our first task</strong> will be to perform a post hoc analysis of our results (using R), in the hope of finding out which of our groups are responsible for the rejection of the null hypothesis. While in the simple case of ANOVA an R command is readily available (&#8220;TukeyHSD&#8221;), in the case of Friedman&#8217;s test (until now) the code to perform the post hoc test was not as easily accessible.<br />
<strong>Our second task</strong> will be to visualize our results. While in the case of simple ANOVA a boxplot of each group is sufficient, in the case of repeated measures a boxplot approach will be misleading to the viewer. Instead, we will offer two plots: one of parallel coordinates, and the other will be boxplots of the differences between all pairs of groups (in this respect, the post hoc analysis can be thought of as performing paired wilcox.test with correction for multiplicity).</p>
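<p>The multiplicity problem can be illustrated with base R alone. The sketch below is not the method used by the function in this post (which relies on the coin and multcomp packages); it simply runs a paired Wilcoxon test on every pair of groups and applies a Holm correction. The data and column names are made up, and the rows are assumed to be sorted by subject within each treatment so that the pairing is correct.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Made-up long data, sorted by subject within each treatment
long &lt;- data.frame(
  score     = c(51, 48, 55, 60, 49, 53,    # treatment A, subjects 1 to 6
                63, 59, 52, 64, 68, 61,    # treatment B
                72, 75, 69, 61, 73, 79),   # treatment C
  treatment = factor(rep(c("A", "B", "C"), each = 6)),
  subject   = factor(rep(1:6, times = 3))
)

k &lt;- nlevels(long$treatment)
n.pairs &lt;- choose(k, 2)   # k groups give k * (k - 1) / 2 pairwise comparisons

# Paired Wilcoxon signed rank test for every pair, with Holm-adjusted p-values
pw &lt;- pairwise.wilcox.test(long$score, long$treatment,
                           paired = TRUE, p.adjust.method = "holm")
pw$p.value                # lower-triangular matrix of adjusted p-values</pre></td></tr></table></div>

<p>The number of comparisons grows quickly: choose(10, 2) is already 45, which is why some correction is needed.</p>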
<h3>R code for Post hoc analysis for the Friedman&#8217;s Test</h3>
<p>The analysis will be performed using the function (I wrote) called &#8220;friedman.test.with.post.hoc&#8221;, based on the packages &#8220;coin&#8221; and &#8220;multcomp&#8221; (the function also loads &#8220;colorspace&#8221; for the plot colors). Just a few words about its arguments:</p>
<ul>
<li>formu &#8211; is a formula object of the shape: 	Y ~ X | block (where Y is the ordered (numeric) response, X is a group indicator (factor), and block is the block (or subject) indicator (factor))</li>
<li>data &#8211; is a data frame with columns of Y, X and block (the names could be different, of course, as long as the formula given in &#8220;formu&#8221; represents that)</li>
<li>All the other parameters control printing, whether the post hoc test is run (and at which significance level, &#8220;signif.P&#8221;), and whether the results are plotted.</li>
</ul>
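<p>To make the expected input concrete, the sketch below builds a long data frame from a made-up wide table and shows how the three names are read off the formula (the function does this with all.vars). The names Y, X and block simply mirror the description above; any names work as long as the formula uses them.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Made-up wide table: one row per subject, one column per group
wide &lt;- matrix(c(51, 63, 72,
                 48, 59, 75,
                 55, 52, 69,
                 60, 64, 61,
                 49, 68, 73,
                 53, 61, 79),
               nrow = 6, byrow = TRUE,
               dimnames = list(paste0("s", 1:6), c("A", "B", "C")))

# Long format: as.vector() reads the matrix column by column (all of A, then B, then C)
long &lt;- data.frame(
  Y     = as.vector(wide),
  X     = factor(rep(colnames(wide), each = nrow(wide))),
  block = factor(rep(rownames(wide), times = ncol(wide)))
)

formu &lt;- Y ~ X | block
all.vars(formu)   # response, group and block names, in that order

# Once the function from this post is defined, the call would be:
# friedman.test.with.post.hoc(formu, long)</pre></td></tr></table></div>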

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">friedman.<span style="">test</span>.<span style="">with</span>.<span style="">post</span>.<span style="">hoc</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>formu, <span style="color: #0000FF; font-weight: bold;">data</span>, to.<span style="">print</span>.<span style="">friedman</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span>, to.<span style="">post</span>.<span style="">hoc</span>.<span style="">if</span>.<span style="">signif</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span>,  to.<span style="">plot</span>.<span style="">parallel</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span>, to.<span style="">plot</span>.<span style="">boxplot</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span>, signif.<span style="">P</span> <span style="color: #080;">=</span> .05, color.<span style="">blocks</span>.<span style="">in</span>.<span style="">cor</span>.<span style="">plot</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">TRUE</span>, jitter.<span style="">Y</span>.<span style="">in</span>.<span style="">cor</span>.<span style="">plot</span> <span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">FALSE</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;"># formu is a formula of the shape: 	Y ~ X | block</span>
	<span style="color: #228B22;"># data is a long data.frame with three columns:    [[ Y (numeric), X (factor), block (factor) ]]</span>
&nbsp;
	<span style="color: #228B22;"># Note: This function doesn't handle NA's! In case of NA in Y in one of the blocks, then that entire block should be removed.</span>
&nbsp;
&nbsp;
	<span style="color: #228B22;"># Loading needed packages</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>coin<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;You are missing the package 'coin', we will now try to install it...&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">install.<span style="">packages</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;coin&quot;</span><span style="color: #080;">&#41;</span>		
		<span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span>coin<span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>multcomp<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;You are missing the package 'multcomp', we will now try to install it...&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">install.<span style="">packages</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;multcomp&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span>multcomp<span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>colorspace<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;You are missing the package 'colorspace', we will now try to install it...&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">install.<span style="">packages</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;colorspace&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">library</span><span style="color: #080;">&#40;</span>colorspace<span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
	<span style="color: #228B22;"># get the names out of the formula</span>
	formu.<span style="">names</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">all.<span style="">vars</span></span><span style="color: #080;">&#40;</span>formu<span style="color: #080;">&#41;</span>
	Y.<span style="">name</span> <span style="color: #080;">&lt;-</span> formu.<span style="">names</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
	X.<span style="">name</span> <span style="color: #080;">&lt;-</span> formu.<span style="">names</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>
	block.<span style="">name</span> <span style="color: #080;">&lt;-</span> formu.<span style="">names</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">3</span><span style="color: #080;">&#93;</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">dim</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&gt;</span><span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span> <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,<span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>Y.<span style="">name</span>,X.<span style="">name</span>,block.<span style="">name</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>	<span style="color: #228B22;"># In case the &quot;data&quot; data frame has more than the three columns we need, keep only those three.</span>
&nbsp;
	<span style="color: #228B22;"># Note: the function doesn't handle NA's. If any outcome within a block is NA, that entire block should be removed.</span>
&nbsp;
	<span style="color: #228B22;"># stopping in case there is NA in the Y vector</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">na</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,Y.<span style="">name</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span> <span style="color: #0000FF; font-weight: bold;">stop</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Function stopped: This function doesn't handle NA's. In case of NA in Y in one of the blocks, then that entire block should be removed.&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
	<span style="color: #228B22;"># drop unused factor levels, so that the number of levels matches the values actually present in the data:</span>
	<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,X.<span style="">name</span> <span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">factor</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,X.<span style="">name</span> <span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,block.<span style="">name</span> <span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">factor</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,block.<span style="">name</span> <span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
	number.<span style="">of</span>.<span style="">X</span>.<span style="">levels</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">levels</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,X.<span style="">name</span> <span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>number.<span style="">of</span>.<span style="">X</span>.<span style="">levels</span> <span style="color: #080;">==</span> <span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span> <span style="color: #0000FF; font-weight: bold;">warning</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;'&quot;</span>,X.<span style="">name</span>,<span style="color: #ff0000;">&quot;'&quot;</span>, <span style="color: #ff0000;">&quot;has only two levels. Consider using paired wilcox.test instead of friedman test&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #228B22;"># making the object that holds the friedman test together with all pairwise comparisons (used below for the post hoc P values).</span>
	the.<span style="">sym</span>.<span style="">test</span> <span style="color: #080;">&lt;-</span> symmetry_test<span style="color: #080;">&#40;</span>formu, <span style="color: #0000FF; font-weight: bold;">data</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">data</span>,	<span style="color: #228B22;">### all pairwise comparisons	</span>
						   teststat <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;max&quot;</span>,
						   xtrafo <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>Y.<span style="">data</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span> trafo<span style="color: #080;">&#40;</span> Y.<span style="">data</span>, factor_trafo <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span> <span style="color: #0000FF; font-weight: bold;">model.<span style="">matrix</span></span><span style="color: #080;">&#40;</span>~ x <span style="color: #080;">-</span> <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span> <span style="color: #080;">%*%</span> <span style="color: #0000FF; font-weight: bold;">t</span><span style="color: #080;">&#40;</span>contrMat<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">table</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">&quot;Tukey&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#125;</span> <span style="color: #080;">&#41;</span> <span style="color: #080;">&#125;</span>,
						   ytrafo <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>Y.<span style="">data</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span> trafo<span style="color: #080;">&#40;</span>Y.<span style="">data</span>, numeric_trafo <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">rank</span>, block <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,block.<span style="">name</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&#41;</span> <span style="color: #080;">&#125;</span>
						<span style="color: #080;">&#41;</span>
	<span style="color: #228B22;"># if(to.print.friedman) { print(the.sym.test) }</span>
&nbsp;
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>to.<span style="">post</span>.<span style="">hoc</span>.<span style="">if</span>.<span style="">signif</span><span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#123;</span>
			<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>pvalue<span style="color: #080;">&#40;</span>the.<span style="">sym</span>.<span style="">test</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&lt;</span> signif.<span style="">P</span><span style="color: #080;">&#41;</span>
			<span style="color: #080;">&#123;</span>
				<span style="color: #228B22;"># the post hoc test</span>
				The.<span style="">post</span>.<span style="">hoc</span>.<span style="">P</span>.<span style="">values</span> <span style="color: #080;">&lt;-</span> pvalue<span style="color: #080;">&#40;</span>the.<span style="">sym</span>.<span style="">test</span>, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;single-step&quot;</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># this is the post hoc of the friedman test</span>
&nbsp;
&nbsp;
				<span style="color: #228B22;"># plotting</span>
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>to.<span style="">plot</span>.<span style="">parallel</span> <span style="color: #080;">&amp;&amp;</span> to.<span style="">plot</span>.<span style="">boxplot</span><span style="color: #080;">&#41;</span>	<span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>mfrow <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># if we are plotting two plots, let's make sure we'll be able to see both</span>
&nbsp;
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>to.<span style="">plot</span>.<span style="">parallel</span><span style="color: #080;">&#41;</span>
				<span style="color: #080;">&#123;</span>
					X.<span style="">names</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">levels</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>, X.<span style="">name</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
					X.<span style="">for</span>.<span style="">plot</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">seq_along</span><span style="color: #080;">&#40;</span>X.<span style="">names</span><span style="color: #080;">&#41;</span>
					plot.<span style="">xlim</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>.7 , <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>X.<span style="">for</span>.<span style="">plot</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>.3<span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># adding some spacing from both sides of the plot</span>
&nbsp;
					<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>color.<span style="">blocks</span>.<span style="">in</span>.<span style="">cor</span>.<span style="">plot</span><span style="color: #080;">&#41;</span> 
					<span style="color: #080;">&#123;</span>
						blocks.<span style="">col</span> <span style="color: #080;">&lt;-</span> rainbow_hcl<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">levels</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,block.<span style="">name</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
					<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
						blocks.<span style="">col</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">1</span> <span style="color: #228B22;"># black</span>
					<span style="color: #080;">&#125;</span>					
&nbsp;
					data2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data</span>
					<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>jitter.<span style="">Y</span>.<span style="">in</span>.<span style="">cor</span>.<span style="">plot</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span>
						data2<span style="color: #080;">&#91;</span>,Y.<span style="">name</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">jitter</span><span style="color: #080;">&#40;</span>data2<span style="color: #080;">&#91;</span>,Y.<span style="">name</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
						par.<span style="">cor</span>.<span style="">plot</span>.<span style="">text</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;Parallel coordinates plot (with Jitter)&quot;</span>				
					<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
						par.<span style="">cor</span>.<span style="">plot</span>.<span style="">text</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;Parallel coordinates plot&quot;</span>
					<span style="color: #080;">&#125;</span>				
&nbsp;
					<span style="color: #228B22;"># adding a Parallel coordinates plot</span>
					<span style="color: #0000FF; font-weight: bold;">matplot</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">as.<span style="">matrix</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#40;</span>data2,  idvar<span style="color: #080;">=</span>X.<span style="">name</span>, timevar<span style="color: #080;">=</span>block.<span style="">name</span>,
									 direction<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;wide&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>  , 
							type <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;l&quot;</span>,  lty <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span>, axes <span style="color: #080;">=</span> FALSE, ylab <span style="color: #080;">=</span> Y.<span style="">name</span>, 
							xlim <span style="color: #080;">=</span> plot.<span style="">xlim</span>,
							<span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> blocks.<span style="">col</span>,
							main <span style="color: #080;">=</span> par.<span style="">cor</span>.<span style="">plot</span>.<span style="">text</span><span style="color: #080;">&#41;</span>
					<span style="color: #0000FF; font-weight: bold;">axis</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>, at <span style="color: #080;">=</span> X.<span style="">for</span>.<span style="">plot</span> , <span style="color: #0000FF; font-weight: bold;">labels</span> <span style="color: #080;">=</span> X.<span style="">names</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># plot X axis</span>
					<span style="color: #0000FF; font-weight: bold;">axis</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># plot Y axis</span>
					<span style="color: #0000FF; font-weight: bold;">points</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">tapply</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,Y.<span style="">name</span><span style="color: #080;">&#93;</span>, <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,X.<span style="">name</span><span style="color: #080;">&#93;</span>, <span style="color: #0000FF; font-weight: bold;">median</span><span style="color: #080;">&#41;</span> ~ X.<span style="">for</span>.<span style="">plot</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;red&quot;</span>,pch <span style="color: #080;">=</span> <span style="color: #ff0000;">4</span>, cex <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span>, lwd <span style="color: #080;">=</span> <span style="color: #ff0000;">5</span><span style="color: #080;">&#41;</span>
				<span style="color: #080;">&#125;</span>
&nbsp;
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>to.<span style="">plot</span>.<span style="">boxplot</span><span style="color: #080;">&#41;</span>
				<span style="color: #080;">&#123;</span>
					<span style="color: #228B22;"># first we create a function to create a new Y, by subtracting different combinations of X levels from each other.</span>
					subtract.<span style="">a</span>.<span style="">from</span>.<span style="">b</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>a.<span style="">b</span> , the.<span style="">data</span><span style="color: #080;">&#41;</span>
					<span style="color: #080;">&#123;</span>
						the.<span style="">data</span><span style="color: #080;">&#91;</span>,a.<span style="">b</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span> <span style="color: #080;">-</span> the.<span style="">data</span><span style="color: #080;">&#91;</span>,a.<span style="">b</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#93;</span>
					<span style="color: #080;">&#125;</span>
&nbsp;
					temp.<span style="">wide</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>,  idvar<span style="color: #080;">=</span>X.<span style="">name</span>, timevar<span style="color: #080;">=</span>block.<span style="">name</span>,
									 direction<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;wide&quot;</span><span style="color: #080;">&#41;</span> 	<span style="color: #228B22;">#[,-1]</span>
					wide.<span style="">data</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">as.<span style="">matrix</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">t</span><span style="color: #080;">&#40;</span>temp.<span style="">wide</span><span style="color: #080;">&#91;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
					<span style="color: #0000FF; font-weight: bold;">colnames</span><span style="color: #080;">&#40;</span>wide.<span style="">data</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&lt;-</span> temp.<span style="">wide</span><span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
&nbsp;
					Y.<span style="">b</span>.<span style="">minus</span>.<span style="">a</span>.<span style="">combos</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">apply</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">with</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>,<span style="color: #0000FF; font-weight: bold;">combn</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">levels</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,X.<span style="">name</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">2</span>, subtract.<span style="">a</span>.<span style="">from</span>.<span style="">b</span>, the.<span style="">data</span> <span style="color: #080;">=</span>wide.<span style="">data</span><span style="color: #080;">&#41;</span>
					names.<span style="">b</span>.<span style="">minus</span>.<span style="">a</span>.<span style="">combos</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">apply</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">with</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span>,<span style="color: #0000FF; font-weight: bold;">combn</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">levels</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">&#91;</span>,X.<span style="">name</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">2</span>, <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>a.<span style="">b</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span><span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span>a.<span style="">b</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>,a.<span style="">b</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>,sep<span style="color: #080;">=</span><span style="color: #ff0000;">&quot; - &quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span>
&nbsp;
					the.<span style="">ylim</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">range</span><span style="color: #080;">&#40;</span>Y.<span style="">b</span>.<span style="">minus</span>.<span style="">a</span>.<span style="">combos</span><span style="color: #080;">&#41;</span>
					the.<span style="">ylim</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> the.<span style="">ylim</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span> <span style="color: #080;">+</span> <span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">sd</span><span style="color: #080;">&#40;</span>Y.<span style="">b</span>.<span style="">minus</span>.<span style="">a</span>.<span style="">combos</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># adding some space for the labels</span>
					is.<span style="">signif</span>.<span style="">color</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">ifelse</span><span style="color: #080;">&#40;</span>The.<span style="">post</span>.<span style="">hoc</span>.<span style="">P</span>.<span style="">values</span> <span style="color: #080;">&lt;</span> .05 , <span style="color: #ff0000;">&quot;green&quot;</span>, <span style="color: #ff0000;">&quot;grey&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
					<span style="color: #0000FF; font-weight: bold;">boxplot</span><span style="color: #080;">&#40;</span>Y.<span style="">b</span>.<span style="">minus</span>.<span style="">a</span>.<span style="">combos</span>,
						<span style="color: #0000FF; font-weight: bold;">names</span> <span style="color: #080;">=</span> names.<span style="">b</span>.<span style="">minus</span>.<span style="">a</span>.<span style="">combos</span> ,
						<span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> is.<span style="">signif</span>.<span style="">color</span>,
						main <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;Boxplots (of the differences)&quot;</span>,
						ylim <span style="color: #080;">=</span> the.<span style="">ylim</span>
						<span style="color: #080;">&#41;</span>
					<span style="color: #0000FF; font-weight: bold;">legend</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;topright&quot;</span>, <span style="color: #0000FF; font-weight: bold;">legend</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span>names.<span style="">b</span>.<span style="">minus</span>.<span style="">a</span>.<span style="">combos</span>, <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot; ; PostHoc P.value:&quot;</span>, number.<span style="">of</span>.<span style="">X</span>.<span style="">levels</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span>The.<span style="">post</span>.<span style="">hoc</span>.<span style="">P</span>.<span style="">values</span>,<span style="color: #ff0000;">5</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> , fill <span style="color: #080;">=</span>  is.<span style="">signif</span>.<span style="">color</span> <span style="color: #080;">&#41;</span>
					<span style="color: #0000FF; font-weight: bold;">abline</span><span style="color: #080;">&#40;</span>h <span style="color: #080;">=</span> <span style="color: #ff0000;">0</span>, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;blue&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
				<span style="color: #080;">&#125;</span>
&nbsp;
				list.<span style="">to</span>.<span style="">return</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span>Friedman.<span style="">Test</span> <span style="color: #080;">=</span> the.<span style="">sym</span>.<span style="">test</span>, PostHoc.<span style="">Test</span> <span style="color: #080;">=</span> The.<span style="">post</span>.<span style="">hoc</span>.<span style="">P</span>.<span style="">values</span><span style="color: #080;">&#41;</span>
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>to.<span style="">print</span>.<span style="">friedman</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#123;</span><span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>list.<span style="">to</span>.<span style="">return</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>				
				<span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span>list.<span style="">to</span>.<span style="">return</span><span style="color: #080;">&#41;</span>
&nbsp;
			<span style="color: #080;">&#125;</span>	<span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
					<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The results where not significant, There is no need for a post hoc test&quot;</span><span style="color: #080;">&#41;</span>
					<span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span>the.<span style="">sym</span>.<span style="">test</span><span style="color: #080;">&#41;</span>
				<span style="color: #080;">&#125;</span>					
	<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;"># Original credit (for linking online, to the package that performs the post hoc test) goes to &quot;David Winsemius&quot;, see:</span>
<span style="color: #228B22;"># http://tolstoy.newcastle.edu.au/R/e8/help/09/10/1416.html</span>
<span style="color: #080;">&#125;</span></pre></td></tr></table></div>

<h3>Example</h3>
<p>(The code for the example is given at the end of the post)</p>
<p>Let&#8217;s make up a little story: say we have three types of wine (A, B and C), and we would like to know which one is the best (on a scale of 1 to 7). We asked 22 friends to taste each of the three wines (blindfolded) and to grade each one from 1 to 7. For the sake of the example, let&#8217;s say we asked them to rate each wine 5 times and then averaged the results, giving one number per person for each wine. Since that number is an average of several ratings, it will not necessarily be an integer.</p>
<p>After getting the results, we started by drawing a simple boxplot of the ratings each wine got. Here it is:</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/02/comparing-wines-boxplot.png"><img class="alignnone size-full wp-image-154" title="comparing wines - boxplot" src="http://www.r-statistics.com/wp-content/uploads/2010/02/comparing-wines-boxplot.png" alt="" width="500" /></a></p>
<p>The plot shows us two things: 1) the assumption of equal variances might not hold here; 2) if we ignore the &#8220;within subjects&#8221; structure of our data, we have hardly any chance of finding a difference between the wines.</p>
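<p>To make the second point concrete, here is a minimal sketch (not part of the original analysis) that contrasts a test which ignores the tasters with one that uses them as blocks. It relies on base R&#8217;s <code>kruskal.test</code> and <code>friedman.test</code> rather than on the <code>friedman.test.with.post.hoc</code> function from this post, and the object names (<code>ratings</code>, <code>wine</code>, <code>ignoring.blocks</code>, <code>using.blocks</code>) are made up for the illustration. The numbers are the same ratings listed in the example code at the end of the post.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Ratings from the example: one row per taster, one column per wine
ratings &lt;- matrix(c(5.40, 5.50, 5.55,
                    5.85, 5.70, 5.75,
                    5.20, 5.60, 5.50,
                    5.55, 5.50, 5.40,
                    5.90, 5.85, 5.70,
                    5.45, 5.55, 5.60,
                    5.40, 5.40, 5.35,
                    5.45, 5.50, 5.35,
                    5.25, 5.15, 5.00,
                    5.85, 5.80, 5.70,
                    5.25, 5.20, 5.10,
                    5.65, 5.55, 5.45,
                    5.60, 5.35, 5.45,
                    5.05, 5.00, 4.95,
                    5.50, 5.50, 5.40,
                    5.45, 5.55, 5.50,
                    5.55, 5.55, 5.35,
                    5.45, 5.50, 5.55,
                    5.50, 5.45, 5.25,
                    5.65, 5.60, 5.40,
                    5.70, 5.65, 5.55,
                    6.30, 6.30, 6.25),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(NULL, c("Wine A", "Wine B", "Wine C")))

# Ignoring the tasters: treat the 66 ratings as three independent samples
wine &lt;- factor(colnames(ratings)[col(ratings)])
ignoring.blocks &lt;- kruskal.test(as.vector(ratings), wine)

# Using the tasters as blocks: friedman.test() ranks the wines within each row
using.blocks &lt;- friedman.test(ratings)

# Compare the two p-values side by side
c(kruskal = ignoring.blocks$p.value, friedman = using.blocks$p.value)</pre></td></tr></table></div>

<p>If the blocking matters, the Friedman p-value should come out much smaller than the Kruskal-Wallis one, because each taster&#8217;s overall level (some people simply rate everything higher) is removed by ranking within rows.</p>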
<p>So we move to using the function &#8220;friedman.test.with.post.hoc&#8221; on our data, and we get the following output:</p>
<blockquote>
<div id="_mcePaste">$Friedman.Test</div>
<div id="_mcePaste">Asymptotic General Independence Test</div>
<div id="_mcePaste">data:  Taste by</div>
<div id="_mcePaste">Wine (Wine A, Wine B, Wine C)</div>
<div id="_mcePaste">stratified by Taster</div>
<div id="_mcePaste">maxT = 3.2404, <strong>p-value = 0.003421</strong></div>
<div></div>
<div id="_mcePaste">$PostHoc.Test</div>
<div id="_mcePaste">Wine B &#8211; Wine A 0.623935139</div>
<div id="_mcePaste"><strong>Wine C &#8211; Wine A 0.003325929</strong></div>
<div id="_mcePaste">Wine C &#8211; Wine B 0.053772757</div>
</blockquote>
<p><strong><span style="text-decoration: underline;">The conclusion</span></strong> is that once we take the within-subject variable into account, we discover a significant difference between our three wines (P value of about 0.0034). The post hoc analysis shows that this is due to the difference in taste between Wine C and Wine A (P value of about 0.003), and perhaps also to the difference between Wine C and Wine B (P value of about 0.054, which is borderline and just above the 0.05 threshold).</p>
<p>Plotting our analysis also shows the direction of the results, with lines connecting each friend&#8217;s ratings across the three wines:</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/02/posthoc-friedman-plots.png"><img class="alignnone size-full wp-image-153" title="posthoc friedman plots" src="http://www.r-statistics.com/wp-content/uploads/2010/02/posthoc-friedman-plots.png" alt="" width="500" /></a></p>
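<p>The boxplots of the differences are built by subtracting, for every pair of wines, each taster&#8217;s rating of one wine from their rating of the other. Here is a minimal sketch of that step, assuming a taster-by-wine matrix. The names <code>ratings</code>, <code>pairs</code> and <code>diffs</code> are illustrative, and only the first four tasters from the example data are used to keep it short:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># First four tasters from the example data (rows = tasters, columns = wines)
ratings &lt;- matrix(c(5.40, 5.50, 5.55,
                    5.85, 5.70, 5.75,
                    5.20, 5.60, 5.50,
                    5.55, 5.50, 5.40),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(NULL, c("Wine A", "Wine B", "Wine C")))

# All pairs of wines: a 2 x 3 character matrix, one pair per column
pairs &lt;- combn(colnames(ratings), 2)

# For each pair (a, b), compute b - a within every taster
diffs &lt;- apply(pairs, 2, function(a.b) ratings[, a.b[2]] - ratings[, a.b[1]])
colnames(diffs) &lt;- apply(pairs, 2, function(a.b) paste(a.b[2], a.b[1], sep = " - "))

# One box per pair; the line at 0 marks "no difference"
boxplot(diffs, main = "Boxplots (of the differences)")
abline(h = 0, col = "blue")</pre></td></tr></table></div>

<p>A box that sits clearly above or below the zero line suggests that most tasters preferred the same wine of that pair, which is the direction the p-values alone do not show.</p>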
<p>Here is the code for the example:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">&nbsp;
<span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;http://www.r-statistics.com/wp-content/uploads/2010/02/Friedman-Test-with-Post-Hoc.r.txt&quot;</span><span style="color: #080;">&#41;</span>  <span style="color: #228B22;"># loading the friedman.test.with.post.hoc function from the internet</span>
&nbsp;
	<span style="color: #228B22;">### Comparison of three wines (&quot;Wine A&quot;, &quot;Wine B&quot;, and</span>
	<span style="color: #228B22;">###  &quot;Wine C&quot;), each rated by the same 22 tasters. </span>
	WineTasting <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>
		  Taste <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">5.40</span>, <span style="color: #ff0000;">5.50</span>, <span style="color: #ff0000;">5.55</span>,
					<span style="color: #ff0000;">5.85</span>, <span style="color: #ff0000;">5.70</span>, <span style="color: #ff0000;">5.75</span>,
					<span style="color: #ff0000;">5.20</span>, <span style="color: #ff0000;">5.60</span>, <span style="color: #ff0000;">5.50</span>,
					<span style="color: #ff0000;">5.55</span>, <span style="color: #ff0000;">5.50</span>, <span style="color: #ff0000;">5.40</span>,
					<span style="color: #ff0000;">5.90</span>, <span style="color: #ff0000;">5.85</span>, <span style="color: #ff0000;">5.70</span>,
					<span style="color: #ff0000;">5.45</span>, <span style="color: #ff0000;">5.55</span>, <span style="color: #ff0000;">5.60</span>,
					<span style="color: #ff0000;">5.40</span>, <span style="color: #ff0000;">5.40</span>, <span style="color: #ff0000;">5.35</span>,
					<span style="color: #ff0000;">5.45</span>, <span style="color: #ff0000;">5.50</span>, <span style="color: #ff0000;">5.35</span>,
					<span style="color: #ff0000;">5.25</span>, <span style="color: #ff0000;">5.15</span>, <span style="color: #ff0000;">5.00</span>,
					<span style="color: #ff0000;">5.85</span>, <span style="color: #ff0000;">5.80</span>, <span style="color: #ff0000;">5.70</span>,
					<span style="color: #ff0000;">5.25</span>, <span style="color: #ff0000;">5.20</span>, <span style="color: #ff0000;">5.10</span>,
					<span style="color: #ff0000;">5.65</span>, <span style="color: #ff0000;">5.55</span>, <span style="color: #ff0000;">5.45</span>,
					<span style="color: #ff0000;">5.60</span>, <span style="color: #ff0000;">5.35</span>, <span style="color: #ff0000;">5.45</span>,
					<span style="color: #ff0000;">5.05</span>, <span style="color: #ff0000;">5.00</span>, <span style="color: #ff0000;">4.95</span>,
					<span style="color: #ff0000;">5.50</span>, <span style="color: #ff0000;">5.50</span>, <span style="color: #ff0000;">5.40</span>,
					<span style="color: #ff0000;">5.45</span>, <span style="color: #ff0000;">5.55</span>, <span style="color: #ff0000;">5.50</span>,
					<span style="color: #ff0000;">5.55</span>, <span style="color: #ff0000;">5.55</span>, <span style="color: #ff0000;">5.35</span>,
					<span style="color: #ff0000;">5.45</span>, <span style="color: #ff0000;">5.50</span>, <span style="color: #ff0000;">5.55</span>,
					<span style="color: #ff0000;">5.50</span>, <span style="color: #ff0000;">5.45</span>, <span style="color: #ff0000;">5.25</span>,
					<span style="color: #ff0000;">5.65</span>, <span style="color: #ff0000;">5.60</span>, <span style="color: #ff0000;">5.40</span>,
					<span style="color: #ff0000;">5.70</span>, <span style="color: #ff0000;">5.65</span>, <span style="color: #ff0000;">5.55</span>,
					<span style="color: #ff0000;">6.30</span>, <span style="color: #ff0000;">6.30</span>, <span style="color: #ff0000;">6.25</span><span style="color: #080;">&#41;</span>,
					Wine <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">factor</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Wine A&quot;</span>, <span style="color: #ff0000;">&quot;Wine B&quot;</span>, <span style="color: #ff0000;">&quot;Wine C&quot;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">22</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
					Taster <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">factor</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">22</span>, <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">3</span>, <span style="color: #ff0000;">22</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">with</span><span style="color: #080;">&#40;</span>WineTasting , <span style="color: #0000FF; font-weight: bold;">boxplot</span><span style="color: #080;">&#40;</span> Taste  ~ Wine <span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># boxplot of the ratings for each wine</span>
	friedman.<span style="">test</span>.<span style="">with</span>.<span style="">post</span>.<span style="">hoc</span><span style="color: #080;">&#40;</span>Taste ~ Wine <span style="color: #080;">|</span> Taster ,WineTasting<span style="color: #080;">&#41;</span>	<span style="color: #228B22;"># the same with our function. With post hoc, and cool plots</span></pre></td></tr></table></div>

<p>If you find this code useful, please let me know (in the comments) so I will know there is a point in publishing more such code snippets&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/02/post-hoc-analysis-for-friedmans-test-r-code/feed/</wfw:commentRss>
		<slash:comments>27</slash:comments>
		</item>
		<item>
		<title>Barnard&#8217;s exact test &#8211; a powerful alternative for Fisher&#8217;s exact test (implemented in R)</title>
		<link>http://www.r-statistics.com/2010/02/barnards-exact-test-a-powerful-alternative-for-fishers-exact-test-implemented-in-r/</link>
		<comments>http://www.r-statistics.com/2010/02/barnards-exact-test-a-powerful-alternative-for-fishers-exact-test-implemented-in-r/#comments</comments>
		<pubDate>Sun, 07 Feb 2010 10:12:10 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[Barnard's test]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[contingency tables]]></category>
		<category><![CDATA[Fisher's Exact test]]></category>
		<category><![CDATA[non-parametric]]></category>
		<category><![CDATA[non-parametric test]]></category>
		<category><![CDATA[nuisance parameter]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[tables]]></category>
		<category><![CDATA[Wald statistic]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=75</guid>
		<description><![CDATA[(The R code for Barnard&#8217;s exact test is at the end of the article, and you could also just download it from here) About Barnard&#8217;s exact test About half a year ago, I was studying various statistical methods to employ on contingency tables. I came across a promising method for 2×2 contingency tables called &#8220;Barnard&#8217;s exact test&#8221;. Barnard&#8217;s test is a non-parametric alternative to Fisher&#8217;s exact test which can be more powerful (for 2×2 tables) but is also more time-consuming to compute (References can be [...]]]></description>
			<content:encoded><![CDATA[<p><em>(The R code for Barnard&#8217;s exact test is at the end of the article, and you could also just <a href="http://www.r-statistics.com/wp-content/uploads/2010/02/Barnardtest.R.txt">download it from here</a>)</em></p>
<p><em><a href="http://www.r-statistics.com/wp-content/uploads/2010/02/Barnards-exact-test-p-value-based-on-the-nuisance-parameter.png"><img title="Barnards exact test - p-value based on the nuisance parameter" src="http://www.r-statistics.com/wp-content/uploads/2010/02/Barnards-exact-test-p-value-based-on-the-nuisance-parameter.png" alt="" width="500" /></a></em></p>
<h3>About Barnard&#8217;s exact test</h3>
<p>About half a year ago, I was studying various statistical methods to employ on contingency tables. I came across a promising method for 2×2 contingency tables called &#8220;<strong>Barnard&#8217;s exact test</strong>&#8221;. Barnard&#8217;s test is a non-parametric alternative to <a title="Fisher's exact test" href="http://en.wikipedia.org/wiki/Fisher%27s_exact_test">Fisher&#8217;s exact test</a> which can be more powerful (for 2×2 tables) but is also more time-consuming to compute (References can be found in the <a href="http://en.wikipedia.org/wiki/Barnard%27s_test">Wikipedia article</a> on the subject).</p>
<p>The test was first published by <a title="George Alfred Barnard" href="http://en.wikipedia.org/wiki/George_Alfred_Barnard">George Alfred Barnard</a> (1945). <a href="http://www.cytel.com/Papers/twobinomials.pdf">Mehta and Senchaudhuri (2003)</a> explain why Barnard&#8217;s test can be more powerful than Fisher&#8217;s under certain conditions:</p>
<blockquote><p>When comparing Fisher’s and Barnard’s exact tests, the loss of power due to the greater discreteness of the Fisher statistic is somewhat offset by the requirement that Barnard’s exact test must maximize over all possible p-values, by choice of the nuisance parameter, π. <strong>For 2 × 2 tables </strong>the loss of power due to the discreteness dominates over the loss of power due to the maximization, resulting in<strong> greater power for Barnard’s exact test</strong>. But as the number of rows and columns of the observed table increase, the maximizing factor will tend to dominate, and Fisher’s exact test will achieve greater power than Barnard’s.</p></blockquote>
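<p>To show what &#8220;maximizing over the nuisance parameter&#8221; means in practice, here is a rough sketch of the idea. It is not the function given later in this post. The name <code>barnard.sketch</code>, the two-sided comparison, the simple grid search over π, and the choice to treat the two columns of the table as the two binomial samples are all assumptions made for the illustration.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Sketch of Barnard's unconditional exact test for a 2x2 table.
# Assumption: the two columns are the two binomial samples (column totals fixed).
barnard.sketch &lt;- function(tab, grid = seq(0.001, 0.999, by = 0.001)) {
  n1 &lt;- sum(tab[, 1])
  n2 &lt;- sum(tab[, 2])

  # Wald-type statistic with a pooled variance estimate; set to 0 when undefined
  wald &lt;- function(x1, x2) {
    pooled &lt;- (x1 + x2) / (n1 + n2)
    se &lt;- sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    ifelse(se == 0, 0, (x1 / n1 - x2 / n2) / se)
  }

  # Every table that could have been observed with these column totals
  tabs &lt;- expand.grid(x1 = 0:n1, x2 = 0:n2)
  observed &lt;- abs(wald(tab[1, 1], tab[1, 2]))
  extreme &lt;- tabs[abs(wald(tabs$x1, tabs$x2)) &gt;= observed - 1e-12, ]

  # Probability of the "at least as extreme" tables for each nuisance value
  p.by.nuisance &lt;- sapply(grid, function(p)
    sum(dbinom(extreme$x1, n1, p) * dbinom(extreme$x2, n2, p)))

  # The reported p-value is the worst case over the nuisance parameter
  list(nuisance = grid[which.max(p.by.nuisance)],
       p.value  = max(p.by.nuisance))
}

# The 2x2 table used in the comments of the function further down
exam &lt;- matrix(c(8, 1, 14, 3), nrow = 2,
               dimnames = list(c("Crane", "Egret"), c("Pass", "Fail")))
barnard.sketch(exam)
fisher.test(exam)$p.value   # the conditional test, for comparison</pre></td></tr></table></div>

<p>Fisher&#8217;s test conditions on both margins, so it only looks at the few tables sharing them. The sketch above fixes only one margin, looks at many more tables, and pays for that with the maximization over π, which is the trade-off the quote describes.</p>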
<h3>About the R implementation of Barnard&#8217;s exact test</h3>
<p>After finding out about Barnard&#8217;s test, I was sad to discover that (at the time) there was no R implementation of it. But last week, I received a surprising e-mail with good news. The sender, <strong>Peter Calhoun</strong>, currently a graduate student at the University of Florida, had implemented the algorithm in R. Peter had found my posting on the R mailing list (from almost half a year ago) and was so kind as to share with me (and the rest of the R community) his R code for computing Barnard&#8217;s exact test. Here is some of what Peter wrote to me about his code:</p>
<blockquote><p>On a side note, I believe <strong>there are more efficient codes than this one</strong>.  For example, I&#8217;ve seen codes in Matlab that run faster and display nicer-looking graphs.  However, this code will still provide accurate results and a plot that gives the p-value based on the nuisance parameter.  I did not come up with the idea of this code, I simply translated Matlab code into R, occasionally using different methods to get the same result.  The code was translated from:</p>
<p>Trujillo-Ortiz, A., R. Hernandez-Walls, A. Castro-Perez, L. Rodriguez-Cardozo. Probability Test.  A MATLAB file. URL</p>
<p>http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6198</p>
<p>My goal was to make this test accessible to everyone.  Although there are many ways to run this test through Matlab, I hadn&#8217;t seen any code to implement this test in R.  I hope it is useful for you, and if you have any questions or ways to improve this code, please contact me at calhoun.peter@gmail.com</p></blockquote>
<p>p.s: I added some minor cosmetics to the code, like allowing the input to be a table/matrix and the output to be a list.</p>
<p><em><span id="more-75"></span></em></p>
<h3>The R function for Barnard&#8217;s exact test</h3>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;"># published on: </span>
<span style="color: #228B22;"># http://www.r-statistics.com/2010/02/barnards-exact-test-a-powerful-alternative-for-fishers-exact-test-implemented-in-r/</span>
&nbsp;
Barnardextest<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>Ta,Tb <span style="color: #080;">=</span>NULL,Tc <span style="color: #080;">=</span>NULL,Td <span style="color: #080;">=</span>NULL, to.<span style="">print</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">F</span>, to.<span style="">plot</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">T</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
<span style="color: #228B22;"># The first argument (Ta) can be either a table or a matrix of 2X2.</span>
<span style="color: #228B22;"># Or instead, the values of the table can be entered one by one to the function</span>
&nbsp;
<span style="color: #228B22;">#Barnard's test is an unconditional exact test for contingency tables.  It has been shown that for 2x2 tables, Barnard's test</span>
<span style="color: #228B22;">#has higher power than Fisher's exact test.  Barnard's test is a non-parametric test that relies upon a computer to generate</span>
<span style="color: #228B22;">#the distribution of the Wald statistic.  The code searches for the value of the nuisance parameter that maximizes the </span>
<span style="color: #228B22;">#probability of seeing a table at least as extreme as the observed one; that maximum is reported as the p-value.</span>
<span style="color: #228B22;">#Despite often giving lower p-values for 2x2 tables, Barnard's test hasn't been used as often as Fisher's test because of its</span>
<span style="color: #228B22;">#computational difficulty.  This code gives the Wald statistic, the nuisance parameter, and the p-value for any 2x2 table.</span>
<span style="color: #228B22;">#The table can be written as:</span>
<span style="color: #228B22;">#			Var.1</span>
<span style="color: #228B22;">#		 ---------------</span>
<span style="color: #228B22;">#		   a		b	 r1=a+b</span>
<span style="color: #228B22;">#	Var.2</span>
<span style="color: #228B22;">#		   c		d	 r2=c+d</span>
<span style="color: #228B22;">#		 ---------------</span>
<span style="color: #228B22;">#		 c1=a+c   c2=b+d	 n=c1+c2</span>
&nbsp;
<span style="color: #228B22;">#One example would be </span>
<span style="color: #228B22;">#				Physics</span>
<span style="color: #228B22;">#			 Pass	     Fail</span>
<span style="color: #228B22;">#			 ---------------</span>
<span style="color: #228B22;">#		Crane	   8		14</span>
<span style="color: #228B22;">#  College	</span>
<span style="color: #228B22;">#		Egret	   1		3</span>
<span style="color: #228B22;">#			 ---------------</span>
<span style="color: #228B22;">#</span>
<span style="color: #228B22;">#After defining the function, simply call it with the command:</span>
<span style="color: #228B22;">#Barnardextest(8,14,1,3, to.print = TRUE)</span>
<span style="color: #228B22;">#This will display the results:</span>
&nbsp;
<span style="color: #228B22;">#&quot;The contingency table is:&quot;</span>
<span style="color: #228B22;">#      [,1] [,2]</span>
<span style="color: #228B22;">#[1,]    8   14</span>
<span style="color: #228B22;">#[2,]    1    3</span>
<span style="color: #228B22;">#&quot;Wald Statistic:&quot;</span>
<span style="color: #228B22;">#0.43944</span>
<span style="color: #228B22;">#&quot;Nuisance parameter:&quot;</span>
<span style="color: #228B22;">#0.9001</span>
<span style="color: #228B22;">#&quot;The 1-tailed p-value:&quot;</span>
<span style="color: #228B22;">#0.4159073</span>
&nbsp;
<span style="color: #228B22;">#On a side note, I believe there are more efficient implementations than this one.  For example, I've seen Matlab code that runs</span>
<span style="color: #228B22;">#faster and displays nicer-looking graphs.  However, this code will still provide accurate results and a plot that gives the</span>
<span style="color: #228B22;">#p-value based on the nuisance parameter.  I did not come up with the idea of this code; I simply translated Matlab code </span>
<span style="color: #228B22;">#into R, occasionally using different methods to get the same result.  The code was translated from:</span>
<span style="color: #228B22;">#</span>
<span style="color: #228B22;">#Trujillo-Ortiz, A., R. Hernandez-Walls, A. Castro-Perez, L. Rodriguez-Cardozo. Probability Test.  A MATLAB file. URL</span>
<span style="color: #228B22;">#http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=6198</span>
<span style="color: #228B22;">#</span>
<span style="color: #228B22;">#My goal was to make this test accessible to everyone.  Although there are many ways to run this test through Matlab, I hadn't</span>
<span style="color: #228B22;">#seen any code to implement this test in R.  I hope it is useful for you, and if you have any questions or ways to improve</span>
<span style="color: #228B22;">#this code, please contact me at calhoun.peter@gmail.com.</span>
&nbsp;
&nbsp;
	<span style="color: #228B22;"># Tal edit: choosing whether to work with a 2X2 table or with 4 numbers:</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">null</span></span><span style="color: #080;">&#40;</span>Tb<span style="color: #080;">&#41;</span> <span style="color: #080;">|</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">null</span></span><span style="color: #080;">&#40;</span>Tc<span style="color: #080;">&#41;</span> <span style="color: #080;">|</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">null</span></span><span style="color: #080;">&#40;</span>Td<span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #228B22;"># If one of them is null, then Ta should hold an entire table, and we can take its values</span>
		<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">table</span></span><span style="color: #080;">&#40;</span>Ta<span style="color: #080;">&#41;</span> <span style="color: #080;">|</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">matrix</span></span><span style="color: #080;">&#40;</span>Ta<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#123;</span>
			<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">dim</span><span style="color: #080;">&#40;</span>Ta<span style="color: #080;">&#41;</span> <span style="color: #080;">==</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">==</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>
			<span style="color: #080;">&#123;</span>
				Tb <span style="color: #080;">&lt;-</span> Ta<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>
				Tc <span style="color: #080;">&lt;-</span> Ta<span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
				Td <span style="color: #080;">&lt;-</span> Ta<span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>
				Ta <span style="color: #080;">&lt;-</span> Ta<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>		
			<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span><span style="color: #0000FF; font-weight: bold;">stop</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The table is not 2X2, please check it again...&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#125;</span>		
		<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span><span style="color: #0000FF; font-weight: bold;">stop</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;We are missing value in the table&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#125;</span>		
	<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
	c1<span style="color: #080;">&lt;-</span>Ta<span style="color: #080;">+</span>Tc
	c2<span style="color: #080;">&lt;-</span>Tb<span style="color: #080;">+</span>Td
	n<span style="color: #080;">&lt;-</span>c1<span style="color: #080;">+</span>c2
	pao<span style="color: #080;">&lt;-</span>Ta<span style="color: #080;">/</span>c1
	pbo<span style="color: #080;">&lt;-</span>Tb<span style="color: #080;">/</span>c2
	pxo<span style="color: #080;">&lt;-</span><span style="color: #080;">&#40;</span>Ta<span style="color: #080;">+</span>Tb<span style="color: #080;">&#41;</span><span style="color: #080;">/</span>n
	TXO<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">abs</span><span style="color: #080;">&#40;</span>pao<span style="color: #080;">-</span>pbo<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">sqrt</span><span style="color: #080;">&#40;</span>pxo<span style="color: #080;">*</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">-</span>pxo<span style="color: #080;">&#41;</span><span style="color: #080;">*</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">/</span>c1<span style="color: #080;">+</span><span style="color: #ff0000;">1</span><span style="color: #080;">/</span>c2<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	n1<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>c1<span style="color: #080;">&#41;</span>
	n2<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>c2<span style="color: #080;">&#41;</span>
&nbsp;
	P<span style="color: #080;">&lt;-</span><span style="color: #080;">&#123;</span><span style="color: #080;">&#125;</span>
	<span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span> p <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span><span style="color: #080;">:</span><span style="color: #ff0000;">99</span><span style="color: #080;">+</span>.01<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">100</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		TX<span style="color: #080;">&lt;-</span><span style="color: #080;">&#123;</span><span style="color: #080;">&#125;</span>
		S<span style="color: #080;">&lt;-</span><span style="color: #080;">&#123;</span><span style="color: #080;">&#125;</span>
		<span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span> i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span><span style="color: #080;">:</span>c1<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#123;</span>
			<span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span> j <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span><span style="color: #080;">:</span>c2<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
			<span style="color: #080;">&#123;</span>
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>i<span style="color: #080;">&#41;</span><span style="color: #080;">==</span><span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>fac1<span style="color: #080;">&lt;-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>fac1<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>i<span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>j<span style="color: #080;">&#41;</span><span style="color: #080;">==</span><span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>fac2<span style="color: #080;">&lt;-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>fac2<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>j<span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #080;">&#40;</span>c1<span style="color: #080;">-</span>i<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">==</span><span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>fac3<span style="color: #080;">&lt;-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>fac3<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #080;">&#40;</span>c1<span style="color: #080;">-</span>i<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #080;">&#40;</span>c2<span style="color: #080;">-</span>j<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">==</span><span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>fac4<span style="color: #080;">&lt;-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>fac4<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">prod</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #080;">&#40;</span>c2<span style="color: #080;">-</span>j<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#125;</span>
&nbsp;
				small.<span style="">s</span><span style="color: #080;">&lt;-</span><span style="color: #080;">&#40;</span>n1<span style="color: #080;">*</span>n2<span style="color: #080;">*</span><span style="color: #080;">&#40;</span>p<span style="color: #080;">^</span><span style="color: #080;">&#40;</span>i<span style="color: #080;">+</span>j<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">*</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">-</span>p<span style="color: #080;">&#41;</span><span style="color: #080;">^</span><span style="color: #080;">&#40;</span>n<span style="color: #080;">-</span><span style="color: #080;">&#40;</span>i<span style="color: #080;">+</span>j<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #080;">&#40;</span>fac1<span style="color: #080;">*</span>fac2<span style="color: #080;">*</span>fac3<span style="color: #080;">*</span>fac4<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
				S<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span>S,small.<span style="">s</span><span style="color: #080;">&#41;</span>
				pa<span style="color: #080;">&lt;-</span> i<span style="color: #080;">/</span>c1
				pb<span style="color: #080;">&lt;-</span>j<span style="color: #080;">/</span>c2
				px <span style="color: #080;">&lt;-</span> <span style="color: #080;">&#40;</span>i<span style="color: #080;">+</span>j<span style="color: #080;">&#41;</span><span style="color: #080;">/</span>n
				<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">nan</span></span><span style="color: #080;">&#40;</span><span style="color: #080;">&#40;</span>pa<span style="color: #080;">-</span>pb<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">sqrt</span><span style="color: #080;">&#40;</span>px<span style="color: #080;">*</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">-</span>px<span style="color: #080;">&#41;</span><span style="color: #080;">*</span><span style="color: #080;">&#40;</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">/</span>c1<span style="color: #080;">&#41;</span><span style="color: #080;">+</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">/</span>c2<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
					<span style="color: #080;">&#123;</span>
						tx<span style="color: #080;">&lt;-</span><span style="color: #ff0000;">0</span>
					<span style="color: #080;">&#125;</span> <span style="color: #0000FF; font-weight: bold;">else</span> <span style="color: #080;">&#123;</span>
						tx <span style="color: #080;">&lt;-</span> <span style="color: #080;">&#40;</span>pa<span style="color: #080;">-</span>pb<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">sqrt</span><span style="color: #080;">&#40;</span>px<span style="color: #080;">*</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">-</span>px<span style="color: #080;">&#41;</span><span style="color: #080;">*</span><span style="color: #080;">&#40;</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">/</span>c1<span style="color: #080;">&#41;</span><span style="color: #080;">+</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">/</span>c2<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
					<span style="color: #080;">&#125;</span>
				TX<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span>TX,tx<span style="color: #080;">&#41;</span>
			<span style="color: #080;">&#125;</span>
		<span style="color: #080;">&#125;</span>
&nbsp;
		P<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">cbind</span><span style="color: #080;">&#40;</span>P,<span style="color: #0000FF; font-weight: bold;">sum</span><span style="color: #080;">&#40;</span>S<span style="color: #080;">&#91;</span><span style="color: #0000FF; font-weight: bold;">which</span><span style="color: #080;">&#40;</span>TX<span style="color: #080;">&gt;=</span>TXO<span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
	np<span style="color: #080;">&lt;-</span><span style="color: #0000FF; font-weight: bold;">which</span><span style="color: #080;">&#40;</span>P<span style="color: #080;">==</span><span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span>P<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
	p <span style="color: #080;">&lt;-</span> <span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span><span style="color: #080;">:</span><span style="color: #ff0000;">99</span><span style="color: #080;">+</span>.01<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">100</span>
	nuisance<span style="color: #080;">&lt;-</span>p<span style="color: #080;">&#91;</span>np<span style="color: #080;">&#93;</span>
	pv<span style="color: #080;">&lt;-</span>P<span style="color: #080;">&#91;</span>np<span style="color: #080;">&#93;</span>
&nbsp;
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>to.<span style="">print</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span> 
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The contingency table is:&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">matrix</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>Ta,Tc,Tb,Td<span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Wald Statistic:&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>TXO<span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Nuisance parameter:&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>nuisance<span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;The 1-tailed p-value:&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">print</span><span style="color: #080;">&#40;</span>pv<span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
	<span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span>to.<span style="">plot</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#123;</span>
		<span style="color: #0000FF; font-weight: bold;">plot</span><span style="color: #080;">&#40;</span>p,P,type<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;l&quot;</span>,main<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Barnard's exact P-value&quot;</span>, xlab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Nuisance parameter&quot;</span>, ylab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;P-value&quot;</span><span style="color: #080;">&#41;</span>
		<span style="color: #0000FF; font-weight: bold;">points</span><span style="color: #080;">&#40;</span>nuisance,pv,<span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>
	<span style="color: #080;">&#125;</span>
&nbsp;
	<span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span>	
			<span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span>
					contingency.<span style="">table</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">as.<span style="">table</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">matrix</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>Ta,Tc,Tb,Td<span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,
					Wald.<span style="">Statistic</span> <span style="color: #080;">=</span> TXO,
					Nuisance.<span style="">parameter</span> <span style="color: #080;">=</span> nuisance,
					p.<span style="">value</span>.<span style="">one</span>.<span style="">tailed</span> <span style="color: #080;">=</span> pv					
				<span style="color: #080;">&#41;</span>
			<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
Barnardextest<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">matrix</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">8</span>,<span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">14</span>,<span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">fisher.<span style="">test</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">matrix</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">8</span>,<span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">14</span>,<span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
Convictions <span style="color: #080;">&lt;-</span>
<span style="color: #0000FF; font-weight: bold;">matrix</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span>, <span style="color: #ff0000;">10</span>, <span style="color: #ff0000;">15</span>, <span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span>,
		   <span style="color: #0000FF; font-weight: bold;">nrow</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span>,
		   <span style="color: #0000FF; font-weight: bold;">dimnames</span> <span style="color: #080;">=</span>
		   <span style="color: #0000FF; font-weight: bold;">list</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Dizygotic&quot;</span>, <span style="color: #ff0000;">&quot;Monozygotic&quot;</span><span style="color: #080;">&#41;</span>,
				<span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Convicted&quot;</span>, <span style="color: #ff0000;">&quot;Not convicted&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
Convictions
<span style="color: #0000FF; font-weight: bold;">fisher.<span style="">test</span></span><span style="color: #080;">&#40;</span>Convictions, alternative <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;less&quot;</span><span style="color: #080;">&#41;</span>
Barnardextest<span style="color: #080;">&#40;</span>Convictions<span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<blockquote><p>Fisher&#8217;s Exact Test for Count Data</p>
<p>data:  Convictions<br />
p-value = 0.0004652<br />
alternative hypothesis: true odds ratio is less than 1<br />
95 percent confidence interval:<br />
0.0000000 0.2849601<br />
sample estimates:<br />
odds ratio<br />
0.04693661</p></blockquote>
<blockquote><p>$contingency.table<br />
A  B<br />
A  2 15<br />
B 10  3</p>
<p>$Wald.Statistic<br />
[1] 3.609941</p>
<p>$Nuisance.parameter<br />
[1] 0.4401</p>
<p>$p.value.one.tailed<br />
[1] 0.0001528846</p></blockquote>
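<p>The grid search inside the function above can also be written more compactly. What follows is only a minimal sketch and not part of Peter&#8217;s code: the name barnard_sketch and its grid argument are hypothetical. It assumes that the factorial expression in the inner loop equals the product of two binomial probabilities, so that dbinom and outer can stand in for the nested loops.</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Sketch only: barnard_sketch and its grid argument are illustrative names,
# not part of the original Barnardextest function.
barnard_sketch &lt;- function(Ta, Tb, Tc, Td, grid = (0:99 + 0.01) / 100) {
  c1 &lt;- Ta + Tc   # total of the first column
  c2 &lt;- Tb + Td   # total of the second column
  n  &lt;- c1 + c2
  # Pooled Wald statistic for a table with i and j counts in its first row.
  # A 0/0 result (NaN) is replaced by 0, as in the original loop.
  wald &lt;- function(i, j) {
    px &lt;- (i + j) / n
    tx &lt;- (i / c1 - j / c2) / sqrt(px * (1 - px) * (1 / c1 + 1 / c2))
    ifelse(is.nan(tx), 0, tx)
  }
  observed &lt;- abs(wald(Ta, Tb))
  # Mark every possible table whose statistic is at least the observed one
  extreme &lt;- outer(0:c1, 0:c2, wald) &gt;= observed
  # For each nuisance value p, add up the probabilities of the marked tables
  p.values &lt;- sapply(grid, function(p) {
    probs &lt;- outer(dbinom(0:c1, c1, p), dbinom(0:c2, c2, p))
    sum(probs[extreme])
  })
  best &lt;- which.max(p.values)   # first maximum if there are ties
  list(Wald.Statistic = observed,
       Nuisance.parameter = grid[best],
       p.value.one.tailed = p.values[best])
}
res &lt;- barnard_sketch(8, 14, 1, 3)
res</pre></td></tr></table></div>

<p>For the Crane/Egret table this should give a Wald statistic of about 0.439, the value quoted in the comments of the function above. I have not checked the sketch against the original function for other tables, so treat it as an illustration of the idea and not as a replacement.</p>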
<p><strong>Final note</strong>: I would like to thank <strong>Peter Calhoun</strong> again for sharing his code with the rest of us &#8211; Thanks Peter!</p>
<p><strong>Update (21.04.2010)</strong>: If you are facing a table with structural zeros (that is, missing values in the table), the package <a href="http://cran.r-project.org/web/packages/aylmer/index.html">aylmer</a> might be able to help you (it offers a generalization of Fisher&#8217;s exact test).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/02/barnards-exact-test-a-powerful-alternative-for-fishers-exact-test-implemented-in-r/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
