<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>R-statistics blog &#187; survey</title>
	<atom:link href="http://www.r-statistics.com/tag/survey/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.r-statistics.com</link>
	<description>Writing about statistics with R, and open source stuff (software, data, community)</description>
	<lastBuildDate>Thu, 29 Jul 2010 01:51:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Correlation scatter-plot matrix for ordered-categorical data</title>
		<link>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/</link>
		<comments>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 21:37:26 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[correlation]]></category>
		<category><![CDATA[correlation matrix]]></category>
		<category><![CDATA[correlation scatter plot]]></category>
		<category><![CDATA[non-parametric]]></category>
		<category><![CDATA[non-parametric test]]></category>
		<category><![CDATA[nonparametric]]></category>
		<category><![CDATA[nonparametric test]]></category>
		<category><![CDATA[R code]]></category>
		<category><![CDATA[scatter plot]]></category>
		<category><![CDATA[scatter plot matrix]]></category>
		<category><![CDATA[spearman correlation]]></category>
		<category><![CDATA[spearman test]]></category>
		<category><![CDATA[stackoverflow]]></category>
		<category><![CDATA[survey]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=256</guid>
		<description><![CDATA[When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire items (for example: two ordered categorical vectors ranging from 1 to 5). When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation matrix (easily achieved in R by using the &#8220;cor&#8221; command with method = &quot;spearman&quot; on a matrix of variables). Yet, a challenge appears once we wish to plot this [...]]]></description>
			<content:encoded><![CDATA[<p>When analyzing a questionnaire, one often wants to view the correlation between two or more <a href="http://en.wikipedia.org/wiki/Likert_scale">Likert questionnaire</a> items (for example: two ordered categorical vectors ranging from 1 to 5).</p>
<p>When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation matrix (easily achieved in R by using the &#8220;cor&#8221; command with method = &quot;spearman&quot; on a matrix of variables; &#8220;cor.test&#8221; gives the p-value for one pair at a time).<br />
Yet, a challenge appears once we wish to plot this correlation matrix.  The challenge stems from the fact that the classic presentation for a correlation matrix is a <strong>scatter plot matrix</strong> &#8211; but scatter plots don&#8217;t (usually) work well for ordered categorical vectors since the dots on the scatter plot often overlap each other.</p>
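<p>As a minimal sketch of that step (the item names q1, q2 and q3 and the simulated answers are made up for illustration): cor() accepts a whole matrix of variables and returns every pairwise correlation, while cor.test() handles one pair at a time.</p>

<pre class="rsplus" style="font-family:monospace;">
# Hypothetical example data: three Likert items scored 1 to 5
set.seed(1)
q1 &lt;- sample(1:5, 50, replace = TRUE)
q2 &lt;- sample(1:5, 50, replace = TRUE)
q3 &lt;- 6 - q1                       # reversed copy of q1
items &lt;- cbind(q1, q2, q3)

# cor() on a matrix gives the full Spearman correlation matrix
rho &lt;- cor(items, method = "spearman")
round(rho, 2)

# cor.test() tests a single pair; exact = FALSE asks for the asymptotic
# p-value, which avoids the warning R gives when the data contain ties
p &lt;- cor.test(q1, q2, method = "spearman", exact = FALSE)$p.value
p
</pre>

<p>Since q3 is just q1 reversed, the entry for that pair in the matrix is -1, which is a quick way to check that the call does what we expect.</p>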
<p>There are four solutions to the point-overlap problem that I know of:</p>
<ol>
<li>Jitter the data a bit to give a sense of the &#8220;density&#8221; of the points</li>
<li>Use a color spectrum to show when a point actually represents &#8220;many points&#8221;</li>
<li>Use different point sizes to show when there are &#8220;many points&#8221; in the location of that point</li>
<li>Add a LOWESS (or LOESS) line to the scatter plot &#8211; to show the trend of the data</li>
</ol>
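<p>Solutions 1 and 4 need nothing beyond base R. Here is a rough sketch on simulated data (the vectors x and y are invented for the example, they are not from a real survey):</p>

<pre class="rsplus" style="font-family:monospace;">
# Hypothetical data: y follows x with some noise, both kept inside 1..5
set.seed(1)
x &lt;- sample(1:5, 100, replace = TRUE)
y &lt;- pmin(5, pmax(1, x + sample(-1:1, 100, replace = TRUE)))

# Solution 1: jitter both variables so overlapping points spread out
plot(jitter(x), jitter(y), xlab = "x", ylab = "y")

# Solution 4: add a LOWESS line, fitted on the original (un-jittered) data
fit &lt;- lowess(x, y)
lines(fit, col = "red")
</pre>

<p>The jitter is only used for drawing; the trend line is fitted on the raw values so that the added noise does not enter the smoother.</p>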
<p>In this post I will offer the code for a solution that combines solutions 3 and 4 (and possibly 2; please read the comments on this post). Here is the output (click to see a larger image):</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2010/04/scatter-plot-correlation-matrix.png"><img class="alignnone size-full wp-image-257" title="scatter plot correlation matrix" src="http://www.r-statistics.com/wp-content/uploads/2010/04/scatter-plot-correlation-matrix.png" alt="" width="550"/></a></p>
<p>And here is the code to produce this plot:</p>
<p><span id="more-256"></span></p>
<h3>R code for producing a Correlation scatter-plot matrix &#8211; for ordered-categorical data</h3>
<p><strong>Note</strong> that this code will also work fine for continuous data points (although I would suggest enlarging the &#8220;point.size.rescale&#8221; parameter to something bigger than 1.5 in the &#8220;panel.smooth.ordered.categorical&#8221; function).</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;"># -----------------</span>
<span style="color: #228B22;"># Functions</span>
<span style="color: #228B22;"># -----------------</span>
&nbsp;
panel.<span style="">cor</span>.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x, y, digits<span style="color: #080;">=</span><span style="color: #ff0000;">2</span>, prefix<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;&quot;</span>, cex.<span style="">cor</span><span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#123;</span>
&nbsp;
    usr <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;usr&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">;</span> <span style="color: #0000FF; font-weight: bold;">on.<span style="">exit</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
    <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1</span>, <span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
&nbsp;
    r <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">abs</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">cor</span><span style="color: #080;">&#40;</span>x, y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notive we use spearman, non parametric correlation here</span>
    r.<span style="">no</span>.<span style="">abs</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cor</span><span style="color: #080;">&#40;</span>x, y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
    txt <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">format</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>r.<span style="">no</span>.<span style="">abs</span> , <span style="color: #ff0000;">0.123456789</span><span style="color: #080;">&#41;</span>, digits<span style="color: #080;">=</span>digits<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span> 
    txt <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">paste</span><span style="color: #080;">&#40;</span>prefix, txt, sep<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;&quot;</span><span style="color: #080;">&#41;</span> 
    cex <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">missing</span><span style="color: #080;">&#40;</span>cex.<span style="">cor</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #ff0000;">0.8</span><span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">strwidth</span><span style="color: #080;">&#40;</span>txt<span style="color: #080;">&#41;</span> <span style="color: #0000FF; font-weight: bold;">else</span> cex.<span style="">cor</span> <span style="color: #228B22;"># use cex.cor when it is supplied</span>
&nbsp;
    test <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cor.<span style="">test</span></span><span style="color: #080;">&#40;</span>x,y, method <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;spearman&quot;</span><span style="color: #080;">&#41;</span> 
    <span style="color: #228B22;"># borrowed from printCoefmat</span>
    Signif <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">symnum</span><span style="color: #080;">&#40;</span>test$p.<span style="">value</span>, corr <span style="color: #080;">=</span> FALSE, na <span style="color: #080;">=</span> FALSE, 
                  cutpoints <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">0.001</span>, <span style="color: #ff0000;">0.01</span>, <span style="color: #ff0000;">0.05</span>, <span style="color: #ff0000;">0.1</span>, <span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>,
                  <span style="color: #0000FF; font-weight: bold;">symbols</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;***&quot;</span>, <span style="color: #ff0000;">&quot;**&quot;</span>, <span style="color: #ff0000;">&quot;*&quot;</span>, <span style="color: #ff0000;">&quot;.&quot;</span>, <span style="color: #ff0000;">&quot; &quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">text</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0.5</span>, <span style="color: #ff0000;">0.5</span>, txt, cex <span style="color: #080;">=</span> cex <span style="color: #080;">*</span> r<span style="color: #080;">&#41;</span> 
    <span style="color: #0000FF; font-weight: bold;">text</span><span style="color: #080;">&#40;</span>.8, .8, Signif, cex<span style="color: #080;">=</span>cex, <span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
panel.<span style="">smooth</span>.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span> <span style="color: #080;">&#40;</span>x, y, <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;col&quot;</span><span style="color: #080;">&#41;</span>, bg <span style="color: #080;">=</span> NA, pch <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;pch&quot;</span><span style="color: #080;">&#41;</span>, 
												cex <span style="color: #080;">=</span> <span style="color: #ff0000;">1</span>, col.<span style="">smooth</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;red&quot;</span>, span <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span><span style="color: #080;">/</span><span style="color: #ff0000;">3</span>, iter <span style="color: #080;">=</span> <span style="color: #ff0000;">3</span>, 
												point.<span style="">size</span>.<span style="">rescale</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span>, ...<span style="color: #080;">&#41;</span> 
<span style="color: #080;">&#123;</span>
	<span style="color: #228B22;">#require(colorspace)</span>
    <span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#41;</span>
    z <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">merge</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>x,y<span style="color: #080;">&#41;</span>, melt<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">table</span><span style="color: #080;">&#40;</span>x ,y<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,<span style="color: #0000FF; font-weight: bold;">sort</span> <span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">F</span><span style="color: #080;">&#41;</span>$value
    <span style="color: #228B22;">#the.col &lt;- heat_hcl(length(x))[z]</span>
    z <span style="color: #080;">&lt;-</span> point.<span style="">size</span>.<span style="">rescale</span><span style="color: #080;">*</span>z<span style="color: #080;">/</span> <span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> <span style="color: #228B22;"># notice how we rescale the dots accourding to the maximum z could have gotten</span>
&nbsp;
    <span style="color: #0000FF; font-weight: bold;">symbols</span><span style="color: #080;">&#40;</span> x, y,  circles <span style="color: #080;">=</span> z,<span style="color: #228B22;">#rep(0.1, length(x)), #sample(1:2, length(x), replace = T) ,</span>
			inches<span style="color: #080;">=</span>FALSE, bg<span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;grey&quot;</span>,<span style="color: #228B22;">#the.col ,</span>
			fg <span style="color: #080;">=</span> bg, add <span style="color: #080;">=</span> TRUE<span style="color: #080;">&#41;</span>
&nbsp;
    <span style="color: #228B22;"># points(x, y, pch = pch, col = col, bg = bg, cex = cex)</span>
    ok <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">finite</span></span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span> <span style="color: #080;">&amp;</span> <span style="color: #0000FF; font-weight: bold;">is.<span style="">finite</span></span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">if</span> <span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">any</span><span style="color: #080;">&#40;</span>ok<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span> 
        <span style="color: #0000FF; font-weight: bold;">lines</span><span style="color: #080;">&#40;</span>stats<span style="color: #080;">::</span><span style="color: #0000FF; font-weight: bold;">lowess</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span>ok<span style="color: #080;">&#93;</span>, y<span style="color: #080;">&#91;</span>ok<span style="color: #080;">&#93;</span>, f <span style="color: #080;">=</span> span, iter <span style="color: #080;">=</span> iter<span style="color: #080;">&#41;</span>, 
            <span style="color: #0000FF; font-weight: bold;">col</span> <span style="color: #080;">=</span> col.<span style="">smooth</span>, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
panel.<span style="">hist</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>x, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
    usr <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;usr&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">;</span> <span style="color: #0000FF; font-weight: bold;">on.<span style="">exit</span></span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">par</span><span style="color: #080;">&#40;</span>usr <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>usr<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">0</span>, <span style="color: #ff0000;">1.5</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
    h <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">hist</span><span style="color: #080;">&#40;</span>x, <span style="color: #0000FF; font-weight: bold;">plot</span> <span style="color: #080;">=</span> FALSE, br <span style="color: #080;">=</span> <span style="color: #ff0000;">20</span><span style="color: #080;">&#41;</span>
    breaks <span style="color: #080;">&lt;-</span> h$breaks<span style="color: #080;">;</span> nB <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">length</span><span style="color: #080;">&#40;</span>breaks<span style="color: #080;">&#41;</span>
    y <span style="color: #080;">&lt;-</span> h$counts<span style="color: #080;">;</span> y <span style="color: #080;">&lt;-</span> y<span style="color: #080;">/</span><span style="color: #0000FF; font-weight: bold;">max</span><span style="color: #080;">&#40;</span>y<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">rect</span><span style="color: #080;">&#40;</span>breaks<span style="color: #080;">&#91;</span><span style="color: #080;">-</span>nB<span style="color: #080;">&#93;</span>, <span style="color: #ff0000;">0</span>, breaks<span style="color: #080;">&#91;</span><span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>, y, <span style="color: #0000FF; font-weight: bold;">col</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;orange&quot;</span>, ...<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
pairs.<span style="">ordered</span>.<span style="">categorical</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>xx,...<span style="color: #080;">&#41;</span>
		<span style="color: #080;">&#123;</span>
			<span style="color: #0000FF; font-weight: bold;">pairs</span><span style="color: #080;">&#40;</span>xx , 
					diag.<span style="">panel</span> <span style="color: #080;">=</span> panel.<span style="">hist</span> ,
					lower.<span style="">panel</span><span style="color: #080;">=</span>panel.<span style="">smooth</span>.<span style="">ordered</span>.<span style="">categorical</span>,
					upper.<span style="">panel</span><span style="color: #080;">=</span>panel.<span style="">cor</span>.<span style="">ordered</span>.<span style="">categorical</span>,
					cex.<span style="">labels</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">1.5</span>, ...<span style="color: #080;">&#41;</span> 
		<span style="color: #080;">&#125;</span>
&nbsp;
&nbsp;
&nbsp;
&nbsp;
<span style="color: #228B22;"># -----------------</span>
<span style="color: #228B22;"># Example</span>
<span style="color: #228B22;"># -----------------</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">set.<span style="">seed</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">666</span><span style="color: #080;">&#41;</span>
a1 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span>, <span style="color: #ff0000;">100</span>, <span style="color: #0000FF; font-weight: bold;">replace</span> <span style="color: #080;">=</span> TRUE<span style="color: #080;">&#41;</span>
a2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sample</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span>, <span style="color: #ff0000;">100</span>, <span style="color: #0000FF; font-weight: bold;">replace</span> <span style="color: #080;">=</span> TRUE<span style="color: #080;">&#41;</span>
a3 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">jitter</span><span style="color: #080;">&#40;</span>a2, <span style="color: #ff0000;">7</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
	a3<span style="color: #080;">&#91;</span>a3 <span style="color: #080;">&lt;</span> <span style="color: #ff0000;">1</span> <span style="color: #080;">|</span> a3 <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">3</span>
a4 <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">6</span><span style="color: #080;">-</span><span style="color: #0000FF; font-weight: bold;">round</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">jitter</span><span style="color: #080;">&#40;</span>a1, <span style="color: #ff0000;">7</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&#41;</span>
	a4<span style="color: #080;">&#91;</span>a4 <span style="color: #080;">&lt;</span> <span style="color: #ff0000;">1</span> <span style="color: #080;">|</span> a4 <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">5</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">3</span>
&nbsp;
aa <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>a1,a2,a3, a4<span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;"># plotting :)		</span>
pairs.<span style="">ordered</span>.<span style="">categorical</span><span style="color: #080;">&#40;</span>aa<span style="color: #080;">&#41;</span></pre></td></tr></table></div>
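<p>One caveat about the point sizes: the help page for merge() says that with sort = FALSE the rows come back in an unspecified order, so the counts in z are not guaranteed to line up with x and y. A possible alternative, sketched here with a hypothetical helper called count.per.point (it is not part of the code above), looks each count up directly in the table and needs no extra package. It assumes integer-like codes such as 1 to 5 and no missing values:</p>

<pre class="rsplus" style="font-family:monospace;">
# Hypothetical helper: for every observation, how many observations
# share exactly the same (x, y) location?
count.per.point &lt;- function(x, y) {
    tab &lt;- table(x, y)
    # index the table with one (row label, column label) pair per observation
    as.vector(tab[cbind(as.character(x), as.character(y))])
}

x &lt;- c(1, 1, 2, 1)
y &lt;- c(5, 5, 3, 4)
counts &lt;- count.per.point(x, y)
counts   # 2 2 1 1
</pre>

<p>Inside panel.smooth.ordered.categorical, z could then be computed as count.per.point(x, y) before the rescaling step, with the result always in the same order as x and y.</p>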

<h3> Credits: </h3>
<ul>
<li>The original R code for the correlation matrix plot was taken from <a href="http://addictedtor.free.fr/graphiques/graphcode.php?graph=137">R Graph Gallery</a> (the differences are: 1) the use of Spearman correlation; 2) the addition of a histogram panel; and 3) the changing of point sizes).</li>
<li>The idea to use symbols for changing the point sizes was <a href="http://stackoverflow.com/questions/2593643/correlation-scatter-matrix-plot-with-different-point-size-in-r">offered</a> by <a href="http://www.linkedin.com/pub/doug-y-barbo/2/356/416">Doug Y&#8217;barbo</a>.<br />
Thanks also to <a href="http://dirk.eddelbuettel.com/">Dirk Eddelbuettel</a> for suggesting the use of cex (although I ended up not using that).</li>
</ul>
<p>If you have ideas on how to improve this code (or on reproducing it with ggplot2 or lattice), please share them in the comments (or on your own blog, but be sure to let me know <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />   )</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/04/correlation-scatter-plot-matrix-for-ordered-categorical-data/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>The &#8220;Future of Open Source&#8221; Survey &#8211; an R user&#8217;s thoughts and conclusions</title>
		<link>http://www.r-statistics.com/2010/03/the-future-of-open-source-survey-an-r-users-thoughts-and-conclusions/</link>
		<comments>http://www.r-statistics.com/2010/03/the-future-of-open-source-survey-an-r-users-thoughts-and-conclusions/#comments</comments>
		<pubDate>Tue, 23 Mar 2010 21:58:53 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R community]]></category>
		<category><![CDATA[graphs]]></category>
		<category><![CDATA[Open source]]></category>
		<category><![CDATA[Open source software]]></category>
		<category><![CDATA[OSS]]></category>
		<category><![CDATA[R future]]></category>
		<category><![CDATA[survey]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=210</guid>
		<description><![CDATA[Over a month ago, David Smith published a call for people to participate in the &#8220;Future of Open Source&#8221; Survey. 550 people (and me) took the survey, and today I got an e-mail with the news that the 2010 survey results have been analysed and were published in the &#8220;Future.Of.Open.Source blog&#8221; in the following (38 slides) presentation: I would like to thank Bryan House and anyone else who took part in making this survey, analyzing and publishing its results. The presentation [...]]]></description>
			<content:encoded><![CDATA[<p>Over a month ago, David Smith <a href="http://blog.revolution-computing.com/2010/02/future-of-open-source-survey.html">published a call for people</a> to participate in the <a href="http://futureofopensource.drupalgardens.com">&#8220;Future of Open Source&#8221; Survey</a>.  550 people (and me) took the survey, and today I got an e-mail with the news that the 2010 survey results have been analysed and were published in the <a href="http://futureofopensource.drupalgardens.com/2010-survey-results">&#8220;Future.Of.Open.Source blog&#8221;</a> in the following (38 slides) presentation:</p>
<p><object width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=michaelskokfoos2010v3-1-finaltopublish-100317122955-phpapp02&#038;stripped_title=2010-future-of-open-source-survey-results" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=michaelskokfoos2010v3-1-finaltopublish-100317122955-phpapp02&#038;stripped_title=2010-future-of-open-source-survey-results" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object></p>
<p>I would like to thank Bryan House and anyone else who took part in making this survey, analyzing and publishing its results.</p>
<p>The presentation has left me with some thoughts and conclusions that I would like to share with you here.</p>
<p><span id="more-210"></span></p>
<p><strong><span style="text-decoration: underline;">Pre conclusions 1 &#8211; thoughts about the graphical/statistical presentation:</span></strong><br />
(p.s: all in good faith, please &#8211; no taking offense from anything I write.  And if you have anything to comment on &#8211; please enlighten me in the comments)</p>
<ul>
<li>(-1) For (most of) the uses of pie-charts instead of bar-plots (for more on that, see <a href="http://en.wikipedia.org/wiki/Pie_chart#Use.2C_effectiveness_and_visual_perception">Wikipedia on pie charts</a>)</li>
<li>(+1) For comparing previous years to current year.</li>
<li>(+1) For using different font weights (point sizes) to emphasize quantity on slide 12 (I found it useful)</li>
<li>(-1) After this presentation was made &#8220;<a href="http://markandrewgoetz.com/blog/index.php/2009/11/my-new-wallpaper/">Tufte killed another kitten</a>&#8221; (hat-tip for letting me know about the image goes to <a href="http://blog.revolution-computing.com/2010/03/because-its-friday-kittens-beware-tufte.html">David of Revolution&#8217;s blog</a>)</li>
<li>(+1) Good use of images!</li>
<li>(-2) For only presenting one-dimensional analyses of the data</li>
</ul>
<p><strong><span style="text-decoration: underline;">Pre conclusions 2 - A plea for providing the source data for the Survey:</span></strong></p>
<p>My big hope is to see the <strong>release of the source data collected in the survey</strong> so that other people (me <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  ) will be able to analyse it.  &#8220;Setting the data free&#8221;, as can be derived from O&#8217;Reilly&#8217;s keynote at the OSBC conference, is a big virtue.  Here&#8217;s a link to <a href="http://www.slideshare.net/timoreilly/open-source-in-the-cloud-computing-era">his talk slides</a>, and to <a href="http://blog.revolution-computing.com/2010/03/oreilly-at-osbc-the-futures-in-the-data.html">David&#8217;s wonderful notes</a> about that talk (a great read).</p>
<p>And now for some (humble) conclusions from the survey.</p>
<p><strong><span style="text-decoration: underline;">Conclusion 1 &#8211;  Let&#8217;s invest in making the following of R extensions even more scalable</span></strong></p>
<p>Slide 12 &#8211; people believe (now more than in previous years) that one of OSS&#8217;s attractive features is its rapid pace of innovation.</p>
<p>That&#8217;s good news for R, since R is known for offering more &#8220;up to date&#8221; statistical tools than any other statistical package in existence.  That is due to an amazing community of statisticians and statistical programmers, coupled with a solid structure for creating <a href="http://cran.r-project.org/doc/manuals/R-exts.html">R extensions</a>.</p>
<p>But at the same time, there are several challenges in having open source innovation.</p>
<p>One such challenge is described by John Chambers in &#8220;<a href="http://journal.r-project.org/2009-1/RJournal_2009-1_Chambers.pdf">Facets of R</a>&#8221;  (a special invited paper on &#8220;The Future of R&#8221;; see page 3, section &#8220;Modular design and collaborative support&#8221;), and I quote:</p>
<blockquote><p>On the downside, a large collaborative enterprise with a general practice of making collective decisions has a natural tendency towards conservatism. Radical changes do threaten to break currently working features. The future benefits they might bring will often not be sufficiently persuasive. The very success of R, combined with its collaborative facet, poses a challenge to cultivate the “next big step” in software for data analysis.</p></blockquote>
<p>Another good discussion of this can be found in John Fox&#8217;s <a href="http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Fox.pdf">Aspects of the Social Organization and Trajectory of the R Project</a>. <em>The R Journal</em>, 1(2):5-13, December 2009.</p>
<p>Both authors reflect on how many packages (extensions to the R core) CRAN now holds.  While the diversity is wonderful, the user&#8217;s ability to <strong>handle </strong>the variety does not scale with it: from a user&#8217;s perspective it is very hard to find, follow and manage all the innovative R extensions out there.  One hope for improvement on this front is the project &#8220;<a href="http://crantastic.org/">Crantastic</a>&#8221;, which I hope will get (much) more attention and expansion.  Some optimistic news regarding the future of the project was <a href="http://dirk.eddelbuettel.com/blog/2010/03/18/#gsoc2010_and_r_is_in">published recently by Dirk Eddelbuettel</a>, who told us about the <a href="http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010">open (R) projects in the 2010 Google Summer of Code</a>. Two important projects (in this respect) are <a href="http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:crantastic2">Crantastic2</a> and <a href="http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2010:cran_stats">cran_stats</a>, which I hope will come through.</p>
<p><strong><span style="text-decoration: underline;">Conclusion 2 &#8211;  If you want R to spread &#8211; support open source in general</span></strong></p>
<p>Slide 13 &#8211; shows that people believe there are 3 main drivers for the adoption of OSS (such as R):</p>
<ol>
<li>Public sector adoption &#8211; R is in the universities &#8211; checked.</li>
<li>Private sector adoption &#8211; R has a way to go here &#8211; but we are on the way</li>
<li>Past experience with OSS &#8211; my conclusion from this is that if you help promote any open source software, chances are you are also helping to promote R.</li>
</ol>
<p><strong><span style="text-decoration: underline;">Conclusion 3 &#8211;  get to know what &#8220;the cloud&#8221; can do for you!</span></strong></p>
<p>Slide 24 &#8211; This year, 40% of the people answering the survey (twice as many as in the past two years) said that cloud computing is going to have an impact on OSS vendors.  If you don&#8217;t know what you can do with R and the cloud, it might be time for you to learn the subject and see whether you are missing out on something.</p>
<p>Some of this year&#8217;s <a href="http://user2010.org/tutorials/index.html">tutorials at the useR! 2010</a> conference will talk about cloud computing and R:</p>
<ul>
<li><a href="http://user2010.org/tutorials/Zolot.html">Alex Zolot: Work with R on Amazon&#8217;s Cloud</a></li>
<li><a href="http://user2010.org/tutorials/Chine.html">Karim Chine: Elastic-R, a google docs-like portal for data analysis in the cloud</a></li>
<li><a href="http://user2010.org/tutorials/Eddelbuettel.html">Dirk Eddelbuettel: Introduction to high-performance computing with R</a> (connected, although not that directly)</li>
</ul>
<p>My current (humble) contribution to the subject is the post I recently published about <a href="http://www.r-statistics.com/2010/03/google-spreadsheets-google-forms-r-easily-collecting-and-importing-data-for-analysis/">How to use google forms with R to Easily collect and access data for analysis</a>.</p>
<p>* * *</p>
<p>I welcome any comments (or reply posts) on the subject. Please let me know what you think (of the survey results and on the points I brought up)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/03/the-future-of-open-source-survey-an-r-users-thoughts-and-conclusions/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Simple visualization of a 11X5 table (for WordPress 2.9 Features Vote Results)</title>
		<link>http://www.r-statistics.com/2009/07/simple-visualization-of-a-11x5-table-for-wordpress-2-9-features-vote-results/</link>
		<comments>http://www.r-statistics.com/2009/07/simple-visualization-of-a-11x5-table-for-wordpress-2-9-features-vote-results/#comments</comments>
		<pubDate>Fri, 31 Jul 2009 23:24:21 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[WordPress and Statistics]]></category>
		<category><![CDATA[opensource community]]></category>
		<category><![CDATA[survey]]></category>
		<category><![CDATA[tables]]></category>
		<category><![CDATA[visualization]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=20</guid>
		<description><![CDATA[I guess this is not the number one post I would like to start with on this blog, but I feel the time is right for it (community-wise). I&#8217;ll move on to the subject matter in a moment, but first a short intro: This blog is written by Tal Galili. I am an aspiring statistician who also loves to use R for his work. At the same time I am also a WordPress blogger, writing mainly at www.TalGalili.com where I [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://www.flickr.com/photos/44742295@N00/2496109349"><img class="aligncenter" title="Simply Something Sophisicated - a WordPress poster" src="http://farm4.static.flickr.com/3063/2496109349_949c62ec2b_m.jpg" border="0" alt="Simply Something Sophisicated - a WordPress poster" hspace="5" /></a></p>
<p>I guess this is not the number one post I would like to start with on this blog, but I feel the time is right for it (community-wise).</p>
<p>I&#8217;ll move on to the subject matter in a moment, but first a short intro: This blog is written by Tal Galili. I am an aspiring statistician who also loves to use <a title="R" href="http://www.r-statistics.com/2009/03/what-is-r/">R</a> for his work. At the same time I am also a <a title="WordPress " href="http://en.wikipedia.org/wiki/WordPress">WordPress</a> blogger, writing mainly at <a href="http://www.TalGalili.com">www.TalGalili.com</a> where I can use my native language (Hebrew) for self expression.</p>
<p>This combination of <strong>statistics </strong>and <strong>blogging</strong> will sometimes lead me to write posts that are much less statistical and more Web/Open-Source oriented, like this one. So for the statisticians in the audience I extend my apologies and invite you to wait for future posts which will be more fully focused on <strong>Statistics and R</strong>.</p>
<p>And now for the topic at hand. . .</p>
<p>*         *         *         *         *<br />
<span id="more-20"></span>I have just noticed the nice article published on the <a href="http://wordpress.org/development/">WordPress development blog</a> titled &#8220;<a href="http://wordpress.org/development/2009/07/2-9-vote-results/">2.9 Features Vote Results</a>&#8221;. The post exemplifies a wonderful trend in the WordPress community (led by <a style="text-decoration: none; color: #777777; font-weight: normal; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #dfdfdf;" href="http://jane.wordpress.com/">Jane Wells</a>) of connecting the core team with the WordPress user community. The way Jane does this is by giving surveys to WordPress users, which in turn offer the WordPress core team an opportunity to understand the community&#8217;s needs.</p>
<p>In the post &#8220;<a href="http://wordpress.org/development/2009/07/2-9-vote-results/">2.9 Features Vote Results</a>&#8221;, Jane presented the results of such a survey. The post had tables and barplots, but the barplots were only present for the one-dimensional variables. In contrast, more elaborate data, such as that of question 2 (asking to rate each of 11 potential features on a scale of 1 to 5), was shown only with a table, such as this:</p>
<p><a href="http://wpdotorg.wordpress.com/files/2009/07/q2.png"><img class="alignnone size-medium wp-image-24" title="q2" src="http://www.r-statistics.com/wp-content/uploads/2009/07/q2-300x181.png" alt="q2" width="300" height="181" /></a></p>
<p>The table gives the full information (although I would love it if it was easily downloadable, instead of having to type in the numbers) &#8211; but its main limitation is that, from a quick look, one cannot easily take in (let alone understand) anything.</p>
<p>For the goal of understanding more of the results with a quick glance, I offer two simple and well-known visualizations for the results.</p>
<p>1) Parallel barplots (click for bigger image)</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2009/07/q2-option1.png"><img class="alignnone size-medium wp-image-25" title="q2-option1" src="http://www.r-statistics.com/wp-content/uploads/2009/07/q2-option1-300x129.png" alt="q2-option1" width="300" height="129" /></a></p>
<p>This plot can be easily implemented in Excel (although I did it in R) and allows us to compare the different rankings each potential feature received.</p>
<p>For example, this shows us that for most features the most common answer was rank 4 (&#8220;would be nice&#8221;).</p>
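<p>A plot of this kind can be sketched in R along the following lines. This is only a minimal sketch under stated assumptions: the counts are made-up placeholders (the real numbers are only available in the table image above), only three of the 11 features are included, and the object name <em>votes</em> is hypothetical.</p>
<pre><code># Made-up placeholder counts: NOT the real survey results.
# Rows are features, columns are ranks 1 to 5.
votes &lt;- matrix(
  c(20, 30, 60, 50, 40,
    20, 30, 50, 70, 30,
    25, 30, 50, 60, 35),
  nrow = 3, byrow = TRUE,
  dimnames = list(feature = c("easier embeds", "custom image sizes", "media album"),
                  rank    = as.character(1:5))
)

# barplot() draws one group of bars per column, so transpose the table
# to get one group per feature and one bar per rank.
barplot(t(votes), beside = TRUE, legend.text = TRUE,
        args.legend = list(title = "rank"),
        ylab = "number of votes")

# Index of the most common rank for each feature.
apply(votes, 1, which.max)
</code></pre>
<p>With <em>beside = TRUE</em> the bars of each feature sit next to each other instead of being stacked, which is what makes the ranks easy to compare within a feature.</p>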
<p>2) Mosaic plot (click for bigger image)</p>
<p><a href="http://www.r-statistics.com/wp-content/uploads/2009/07/q2-option2.png"><img class="alignnone size-medium wp-image-26" title="q2-option2" src="http://www.r-statistics.com/wp-content/uploads/2009/07/q2-option2-300x129.png" alt="q2-option2" width="300" height="129" /></a></p>
<p>I don&#8217;t know if this can be done in Excel, but with R it is just a simple line of code.</p>
<p>(<em>mosaicplot(DataSet.table, las = 1, col = c("gray", "gray", "blue", 3, "dark green"), main = "")</em>)</p>
<p>The advantage of this plot is that it allows us to compare the different features easily, while not only comparing the top rank, but also combining different rankings for easy comparison (for example, comparing how many rank 4 or 5 each feature received).</p>
<p>So for example, the plot shows me that the feature most often given rank 5 was &#8220;easier embeds&#8221;, but the feature most often given rank 4 or 5 was &#8220;custom image sizes&#8221;. The feature &#8220;media album&#8221; came close to these two, but didn&#8217;t top either.</p>
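<p>The same kind of comparison can also be done numerically. The sketch below is only an illustration under stated assumptions: the counts are made-up placeholders chosen for the example (not the survey&#8217;s real numbers), only three features are included, and the helper names <em>top.two</em>, <em>best.rank5</em> and <em>best.top2</em> are hypothetical.</p>
<pre><code># Made-up placeholder counts: NOT the real survey results.
# Rows are features, columns are ranks 1 to 5.
DataSet.table &lt;- as.table(matrix(
  c(20, 30, 60, 50, 40,
    20, 30, 50, 70, 30,
    25, 30, 50, 60, 35),
  nrow = 3, byrow = TRUE,
  dimnames = list(feature = c("easier embeds", "custom image sizes", "media album"),
                  rank    = as.character(1:5))
))

# Mosaic plot: one row of tiles per feature, tile widths proportional
# to the share of each rank. Colours are an arbitrary choice.
mosaicplot(DataSet.table, las = 1,
           col = c("gray", "gray", "blue", "green3", "darkgreen"),
           main = "")

# Feature that received rank 5 most often.
best.rank5 &lt;- names(which.max(DataSet.table[, "5"]))

# Feature that received rank 4 or 5 most often.
top.two   &lt;- rowSums(DataSet.table[, c("4", "5")])
best.top2 &lt;- names(which.max(top.two))
</code></pre>
<p>Summing the two right-most columns is the numeric counterpart of reading the combined width of the two right-most tiles in the mosaic plot.</p>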
<p><strong><span style="text-decoration: underline;">Conclusions from this post</span></strong>:</p>
<ol>
<li>It would be nice (if possible) to publish the full data of the surveys, not just the results.</li>
<li>The second question of the survey gives a different answer than the first question. But since the difference in percentage seems to be so small compared to the other options, I would guess that all of the top 4 features are more or less at the same level of interest to the community.</li>
</ol>
<p>p.s. to Jane &#8211; why do none of the numbers in this table add up to 3406 (the number of respondents)?</p>
<p>p.p.s.  to Jane and the Dev team &#8211; great work people!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2009/07/simple-visualization-of-a-11x5-table-for-wordpress-2-9-features-vote-results/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
