<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Clustergram: visualization and diagnostics for cluster analysis (R code)</title>
	<atom:link href="http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/</link>
	<description>Writing about statistics with R, and open source stuff (software, data, community)</description>
	<lastBuildDate>Fri, 10 Feb 2012 03:07:42 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Andreas Weidenhiller</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-3967</link>
		<dc:creator>Andreas Weidenhiller</dc:creator>
		<pubDate>Tue, 21 Dec 2010 11:12:59 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-3967</guid>
		<description>Hi! I like this function and have two suggestions:

1. changing the noise generation command in the clustergram function to
noise &lt;- unlist(tapply(line.width, clusters.vec, function(x){c=cumsum(x); c-mean(c)}))[order(seq_along(clusters.vec)[order(clusters.vec)])]
better arranges the lines around the center points

2. One use for color that I have found: if some grouping of the data is known beforehand, one might want to mark the groups with different colors. (At least I did this - anyone else interested?)</description>
		<content:encoded><![CDATA[<p>Hi! I like this function and have two suggestions:</p>
<p>1. changing the noise generation command in the clustergram function to<br />
noise &lt;- unlist(tapply(line.width, clusters.vec, function(x){c=cumsum(x); c-mean(c)}))[order(seq_along(clusters.vec)[order(clusters.vec)])]<br />
better arranges the lines around the center points</p>
<p>2. One use for color that I have found: if some grouping of the data is known beforehand, one might want to mark the groups with different colors. (At least I did this &#8211; anyone else interested?)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben Haller</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-3804</link>
		<dc:creator>Ben Haller</dc:creator>
		<pubDate>Fri, 17 Dec 2010 18:23:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-3804</guid>
		<description>I&#039;m liking using some alpha in the lines; I have a big dataset (300,000 points), and the alpha makes it easier to see how many lines go from one node to another.</description>
		<content:encoded><![CDATA[<p>I&#8217;m liking using some alpha in the lines; I have a big dataset (300,000 points), and the alpha makes it easier to see how many lines go from one node to another.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tal Galili</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-3243</link>
		<dc:creator>Tal Galili</dc:creator>
		<pubDate>Mon, 20 Sep 2010 15:13:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-3243</guid>
		<description>Hello Deryl,

The function presented in the post uses kmeans with its default configuration.

If you look at
&lt;pre&gt;
?kmeans
&lt;/pre&gt;

you will find the answer in the Details section:

&lt;blockquote&gt;
The algorithm of Hartigan and Wong (1979) is used by default. Note that some authors use k-means to refer to a specific algorithm rather than the general method: most commonly the algorithm given by MacQueen (1967) but sometimes that given by Lloyd (1957) and Forgy (1965). The Hartigan–Wong algorithm generally does a better job than either of those, but trying several random starts is often recommended.&lt;/blockquote&gt;</description>
		<content:encoded><![CDATA[<p>Hello Deryl,</p>
<p>The function presented in the post uses kmeans with its default configuration.</p>
<p>If you look at</p>
<pre>
?kmeans
</pre>
<p>you will find the answer in the Details section:</p>
<blockquote><p>
The algorithm of Hartigan and Wong (1979) is used by default. Note that some authors use k-means to refer to a specific algorithm rather than the general method: most commonly the algorithm given by MacQueen (1967) but sometimes that given by Lloyd (1957) and Forgy (1965). The Hartigan–Wong algorithm generally does a better job than either of those, but trying several random starts is often recommended.</p></blockquote>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tal Galili</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-3242</link>
		<dc:creator>Tal Galili</dc:creator>
		<pubDate>Mon, 20 Sep 2010 15:11:04 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-3242</guid>
		<description>Hello Allan,  Thank you for the offer.  Personally, I&#039;d prefer to allow a different solution of the function for different runs for (lazy) evaluation of the stability of the clustering solution.

BTW, you could also insure to always get the same result by using 

&lt;pre lang = &quot;rsplus&quot;&gt;
some.number &lt;- 666
set.seed(some.number)
&lt;/pre&gt;

Cheers,
Tal</description>
		<content:encoded><![CDATA[<p>Hello Allan,  Thank you for the offer.  Personally, I&#8217;d prefer to let the function return a different solution on each run, as a (lazy) way to evaluate the stability of the clustering solution.</p>
<p>BTW, you could also ensure that you always get the same result by using</p>
<pre lang = "rsplus">
some.number <- 666
set.seed(some.number)
</pre>
<p>Cheers,<br />
Tal</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Deryl</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-3241</link>
		<dc:creator>Deryl</dc:creator>
		<pubDate>Mon, 20 Sep 2010 15:05:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-3241</guid>
		<description>Thanks for the email. I appreciate your response. Our team has addressed how to handle the missing values and we are looking over the resulting clustergrams. We are writing up a presentation for an upcoming conference, so please entertain another query, if you don&#039;t mind: what kind of kmeans algorithm does R implement here? For example, in Matthias Schonlau&#039;s 2004 article (Computational Statistics, 19:95-111), he displays several clustergrams using SAS, Stata, and Splus implementations of random or deterministic kmeans algorithms (pp. 102-103). Which would be most similar to what is going on here with your R code? Thanks again for your insight.</description>
		<content:encoded><![CDATA[<p>Thanks for the email. I appreciate your response. Our team has addressed how to handle the missing values and we are looking over the resulting clustergrams. We are writing up a presentation for an upcoming conference, so please entertain another query, if you don&#8217;t mind: what kind of kmeans algorithm does R implement here? For example, in Matthias Schonlau&#8217;s 2004 article (Computational Statistics, 19:95-111), he displays several clustergrams using SAS, Stata, and Splus implementations of random or deterministic kmeans algorithms (pp. 102-103). Which would be most similar to what is going on here with your R code? Thanks again for your insight.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Deryl</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-3236</link>
		<dc:creator>Deryl</dc:creator>
		<pubDate>Sun, 19 Sep 2010 06:38:19 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-3236</guid>
		<description>Your presentation is very clear, even for someone with only basic knowledge of statistical programs and syntax. I have a very large data set (350,000+ observations) of national educational survey data that I need to cluster, and this implementation looks promising as a way to decide the number of clusters. It worked well for me using a small subset of the data for testing. But, how might I be able to work with missing data? When I include a larger sample data set with records with missing values, I get errors returned. Are there any solutions you could suggest? Any methods for filling in missing data that I should avoid? Thanks for any insight you can offer.</description>
		<content:encoded><![CDATA[<p>Your presentation is very clear, even for someone with only basic knowledge of statistical programs and syntax. I have a very large data set (350,000+ observations) of national educational survey data that I need to cluster, and this implementation looks promising as a way to decide the number of clusters. It worked well for me using a small subset of the data for testing. But, how might I be able to work with missing data? When I include a larger sample data set with records with missing values, I get errors returned. Are there any solutions you could suggest? Any methods for filling in missing data that I should avoid? Thanks for any insight you can offer.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Silvana</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-2758</link>
		<dc:creator>Silvana</dc:creator>
		<pubDate>Thu, 17 Jun 2010 20:37:13 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-2758</guid>
		<description>I appreciate the initiative you took to create this pattern in R.
 I need to present a paper on this subject at university.
Do you know if the generation of clustergram is already implemented in R? Where can I find the mathematical calculations that are used in Clustergram?

Thanks,
Silvana</description>
		<content:encoded><![CDATA[<p>I appreciate the initiative you took to create this pattern in R.<br />
 I need to present a paper on this subject at university.<br />
Do you know if the generation of clustergram is already implemented in R? Where can I find the mathematical calculations that are used in Clustergram?</p>
<p>Thanks,<br />
Silvana</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tal Galili</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-2754</link>
		<dc:creator>Tal Galili</dc:creator>
		<pubDate>Thu, 17 Jun 2010 04:51:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-2754</guid>
		<description>Cheers Ali :)

And for the others, 
Martin refers to me complimenting his post here:
http://spiltmartini.com/2010/05/20/infographics/</description>
		<content:encoded><![CDATA[<p>Cheers Ali <img src='http://www.r-statistics.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>And for the others,<br />
Martin refers to me complimenting his post here:<br />
<a href="http://spiltmartini.com/2010/05/20/infographics/" rel="nofollow">http://spiltmartini.com/2010/05/20/infographics/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ali Martin</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-2753</link>
		<dc:creator>Ali Martin</dc:creator>
		<pubDate>Thu, 17 Jun 2010 00:07:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-2753</guid>
		<description>Thanks for the comment on the infographic. I wish I could take credit for it. I don&#039;t know much about clustergrams, but clustergram 6 reminds me of the brachial plexus (the plexus of nerves underneath the collarbone).</description>
		<content:encoded><![CDATA[<p>Thanks for the comment on the infographic. I wish I could take credit for it. I don&#8217;t know much about clustergrams, but clustergram 6 reminds me of the brachial plexus (the plexus of nerves underneath the collarbone).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tal Galili</title>
		<link>http://www.r-statistics.com/2010/06/clustergram-visualization-and-diagnostics-for-cluster-analysis-r-code/comment-page-1/#comment-2750</link>
		<dc:creator>Tal Galili</dc:creator>
		<pubDate>Wed, 16 Jun 2010 08:38:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.r-statistics.com/?p=391#comment-2750</guid>
		<description>Hello Hadley,
Thank you for the code and suggestions.
I extended this post to link to the ggplot2 implementation you wrote.

With regards,
Tal</description>
		<content:encoded><![CDATA[<p>Hello Hadley,<br />
Thank you for the code and suggestions.<br />
I extended this post to link to the ggplot2 implementation you wrote.</p>
<p>With regards,<br />
Tal</p>
]]></content:encoded>
	</item>
</channel>
</rss>

