<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>R-statistics blog &#187; r tips</title>
	<atom:link href="http://www.r-statistics.com/tag/r-tips/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.r-statistics.com</link>
	<description>Writing about statistics with R, and open source stuff (software, data, community)</description>
	<lastBuildDate>Mon, 30 Jan 2012 07:45:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Managing a statistical analysis project &#8211; guidelines and best practices</title>
		<link>http://www.r-statistics.com/2010/09/managing-a-statistical-analysis-project-guidelines-and-best-practices/</link>
		<comments>http://www.r-statistics.com/2010/09/managing-a-statistical-analysis-project-guidelines-and-best-practices/#comments</comments>
		<pubDate>Thu, 30 Sep 2010 16:03:12 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[best practices]]></category>
		<category><![CDATA[code management]]></category>
		<category><![CDATA[r tips]]></category>
		<category><![CDATA[statistical analysis]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=556</guid>
		<description><![CDATA[In the past two years, a growing community of R users (and statisticians in general) have been participating in two major Question-and-Answer websites: The R tag page on Stackoverflow, and Stat over flow (which will soon move to a new domain, no worries, I&#8217;ll write about it once it happens) In that time, several long (and fascinating) [...]]]></description>
			<content:encoded><![CDATA[<p>In the past two years, a growing community of R users (and statisticians in general) have been participating in two major Question-and-Answer websites:</p>
<ol>
<li><a href="http://stackoverflow.com/questions/tagged/r">The R tag page on Stackoverflow</a>, and</li>
<li><a href="http://stats.stackexchange.com/">Stat over flow</a> (which will soon move to a new domain; no worries, I&#8217;ll write about it once it happens)</li>
</ol>
<p>In that time, several long (and fascinating) discussion threads were started, reflecting on tips and best practices for managing a statistical analysis project.  They are:</p>
<ul>
<li><a href="http://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing">&#8220;Workflow for statistical analysis and report writing&#8221;</a></li>
<li><a href="http://stackoverflow.com/questions/2284446/organizing-r-source-code">&#8220;Organizing R Source Code&#8221;</a></li>
<li><a href="http://stackoverflow.com/questions/1266279/how-to-organize-large-r-programs">&#8220;How to organize large R programs?&#8221;</a></li>
<li><a href="http://stackoverflow.com/questions/2712421/r-and-version-control-for-the-solo-data-analyst">&#8220;R and version control for the solo data analyst&#8221;</a></li>
<li><a href="http://stackoverflow.com/questions/2295389/how-does-software-development-compare-with-statistical-programming-analysis">&#8220;How does software development compare with statistical programming/analysis ?&#8221;</a></li>
<li><a href="http://stackoverflow.com/questions/2286831/how-do-you-combine-revision-control-with-workflow-for-r">&#8220;How do you combine “Revision Control” with “WorkFlow” for R?&#8221;</a></li>
<li><a href="http://stats.stackexchange.com/questions/2910/how-to-efficiently-manage-a-statistical-analysis-project">How to efficiently manage a statistical analysis project?</a></li>
</ul>
<p>On the last thread in the list, the user <a href="http://stats.stackexchange.com/users/930/chl">chl</a> has started to <a href="http://stats.stackexchange.com/questions/2910/how-to-efficiently-manage-a-statistical-analysis-project/3191#3191">compile all the tips and suggestions</a> together.  With his permission, I am now republishing his compilation here.  I encourage you to contribute from your own experience (either in the comments or by answering any of the threads I&#8217;ve linked to).</p>
<p><span id="more-556"></span></p>
<p>From here on is what &#8220;chl&#8221; wrote:</p>
<p><span style="font-size: 13.1944px;">These guidelines were compiled from </span><span style="font-size: 13.1944px;"><a rel="nofollow" href="http://www.stackoverflow.com/">SO</a> (as suggested by @Shane), <a rel="nofollow" href="http://biostar.stackexchange.com/">Biostar</a> (hereafter, BS), and <a href="http://stats.stackexchange.com/">SE</a>. I tried my best to acknowledge ownership for each item, and to select the first or a highly upvoted answer. I also added things of my own, and flagged items that are specific to the [<a href="http://www.r-project.org/">R</a>] environment.</span></p>
<p><strong>Data management</strong></p>
<ul>
<li>create a project structure for keeping all things at the right place (data, code, figures, etc., <a rel="nofollow" href="http://biostar.stackexchange.com/questions/822/how-do-you-manage-your-files-directories-for-your-projects/826#826">giovanni</a>/BS)</li>
<li>never modify raw data files (ideally, they should be read-only), copy/rename to new ones when making transformations, cleaning, etc.</li>
<li>check data consistency (<a href="http://stats.stackexchange.com/questions/2768/what-is-a-consistency-check/2785#2785">whuber</a> /SE)</li>
</ul>
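<p>A minimal sketch of the first two items in R, assuming a hypothetical project called <code>my_analysis</code> with illustrative folder and file names (none of them come from the threads above). It builds the folder layout, marks the raw file read-only, and writes every transformation to a new file:</p>
<pre><code># Hypothetical project layout; all directory and file names are illustrative.
project_dir &lt;- file.path(tempdir(), "my_analysis")
sub_dirs &lt;- c("data/raw", "data/processed", "code", "figures", "reports")
for (d in file.path(project_dir, sub_dirs)) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Store a small made-up raw file once, then mark it read-only.
raw_file &lt;- file.path(project_dir, "data/raw", "survey.csv")
if (!file.exists(raw_file)) {
  write.csv(data.frame(id = 1:3, score = c(10, 12, 9)), raw_file,
            row.names = FALSE)
  Sys.chmod(raw_file, mode = "0444")
}

# Simple consistency checks before any analysis.
raw &lt;- read.csv(raw_file)
stopifnot(!any(duplicated(raw$id)), !anyNA(raw$score))

# Transformations never touch the raw file; they go to a new one.
cleaned &lt;- transform(raw, score_centered = score - mean(score))
clean_file &lt;- file.path(project_dir, "data/processed", "survey_clean.csv")
write.csv(cleaned, clean_file, row.names = FALSE)
</code></pre>
<p>On Windows, <code>Sys.chmod</code> only toggles the read-only attribute, which is all this sketch needs.</p>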
<p><strong>Coding</strong></p>
<ul>
<li>organize source code in logical units or building blocks (<a href="http://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing/1434424#1434424">Josh Reich</a>/<a href="http://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing/1430569#1430569">hadley</a>/<a href="http://stackoverflow.com/questions/1266279/how-to-organize-large-r-programs/1269808#1269808">ars</a> /SO; <a rel="nofollow" href="http://biostar.stackexchange.com/questions/822/how-do-you-manage-your-files-directories-for-your-projects/826#826">giovanni</a>/<a rel="nofollow" href="http://biostar.stackexchange.com/questions/822/how-do-you-manage-your-files-directories-for-your-projects/829#829">Khader Shameer</a> /BS)</li>
<li>separate source code from editing stuff, especially for large projects (this partly overlaps with the previous item and with reporting)</li>
<li>document everything, with e.g. [R]oxygen (<a href="http://stackoverflow.com/questions/2284446/organizing-r-source-code/2284486#2284486">Shane</a> /SO) or consistent self-annotation in the source file</li>
<li>[R] custom functions can be put in a dedicated file (that can be sourced when necessary), in a new environment (so as to avoid populating the top-level namespace, <a href="http://stackoverflow.com/questions/1266279/how-to-organize-large-r-programs/1319786#1319786">Brendan OConnor</a> /SO), or a package (<a href="http://stackoverflow.com/questions/1266279/how-to-organize-large-r-programs/1266400#1266400">Dirk Eddelbuettel</a>/<a href="http://stackoverflow.com/questions/2284446/organizing-r-source-code/2284486#2284486">Shane</a> /SO)</li>
</ul>
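<p>As one possible reading of the last item, the sketch below sources a dedicated helper file into its own environment, so the helper never lands in the top-level workspace. The file name <code>helpers.R</code> and the function <code>std_error</code> are hypothetical and exist only for this illustration:</p>
<pre><code># Write a hypothetical helper file (normally this would live in code/).
helper_file &lt;- file.path(tempdir(), "helpers.R")
writeLines(c(
  "# Standard error of the mean (illustrative helper)",
  "std_error &lt;- function(x) sd(x) / sqrt(length(x))"
), helper_file)

# Source the file into a dedicated environment instead of the global one.
helpers &lt;- new.env()
sys.source(helper_file, envir = helpers)

# Call the helper through its environment.
se &lt;- helpers$std_error(c(2, 4, 6, 8))

# The global workspace stays clean: std_error is not defined there.
in_global &lt;- exists("std_error", envir = globalenv(), inherits = FALSE)
</code></pre>
<p>Once the set of helpers grows, moving them into a package (as the linked answers suggest) gives you documentation and namespaces as well.</p>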
<p><strong>Analysis</strong></p>
<ul>
<li>don&#8217;t forget to set/record the seed you used when calling RNG or stochastic algorithms (e.g. k-means)</li>
<li>for Monte Carlo studies, it may be interesting to store specs/parameters in a separate file (<a rel="nofollow" href="http://neuralensemble.org/trac/sumatra">sumatra</a> may be a good candidate, <a rel="nofollow" href="http://biostar.stackexchange.com/questions/822/how-do-you-manage-your-files-directories-for-your-projects/826#826">giovanni</a> /BS)</li>
<li>don&#8217;t limit yourself to one plot per variable, use multivariate (Trellis) displays and interactive visualization tools (e.g. GGobi)</li>
</ul>
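<p>To illustrate the point about seeds, here is a small sketch using k-means on the built-in <code>iris</code> data. The seed value and the log file name are arbitrary choices made for this example:</p>
<pre><code># Pick a seed and record it alongside the results.
seed &lt;- 20100930  # arbitrary value chosen for this example
cat("seed:", seed, "\n", file = file.path(tempdir(), "run_log.txt"))

# k-means starts from randomly chosen centers, so set the seed first.
set.seed(seed)
fit1 &lt;- kmeans(iris[, 1:4], centers = 3)

# Re-running with the same seed reproduces the same clustering.
set.seed(seed)
fit2 &lt;- kmeans(iris[, 1:4], centers = 3)
same_result &lt;- identical(fit1$cluster, fit2$cluster)
</code></pre>
<p>Without the second <code>set.seed</code> call, the two fits may start from different centers and label the clusters differently.</p>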
<p><strong>Versioning</strong></p>
<ul>
<li>use some kind of version control system (VCS) for easy tracking/export, e.g. Git (<a href="http://stackoverflow.com/questions/2712421/r-and-version-control-for-the-solo-data-analyst/2715569#2715569">Sharpie</a>/<a href="http://stackoverflow.com/questions/2545765/how-can-i-email-someone-a-git-repository/2545784#2545784">VonC</a>/<a href="http://stackoverflow.com/questions/2286831/how-do-you-combine-revision-control-with-workflow-for-r/2290194#2290194">JD Long</a> /SO); this follows from nice questions asked by @Jeromy and @Tal</li>
<li>back up everything, on a regular basis (<a href="http://stackoverflow.com/questions/2712421/r-and-version-control-for-the-solo-data-analyst/2715569#2715569">Sharpie</a>/<a href="http://stackoverflow.com/questions/2286831/how-do-you-combine-revision-control-with-workflow-for-r/2290194#2290194">JD Long</a> /SO)</li>
<li>keep a log of your ideas, or rely on an issue tracker, like <a rel="nofollow" href="http://ditz.rubyforge.org/ditz/">ditz</a> (<a rel="nofollow" href="http://biostar.stackexchange.com/questions/822/how-do-you-manage-your-files-directories-for-your-projects/826#826">giovanni</a> /BS) &#8212; partly redundant with the previous item since it is available in Git</li>
</ul>
<p><strong>Editing/Reporting</strong></p>
<ul>
<li>[R] Sweave (<a href="http://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing/1430013#1430013">Matt Parker</a> /SO)</li>
<li>[R] brew (<a href="http://stackoverflow.com/questions/1429907/workflow-for-statistical-analysis-and-report-writing/1436809#1436809">Shane</a> /SO)</li>
<li>[R] <a rel="nofollow" href="http://cran.r-project.org/web/packages/R2HTML/index.html">R2HTML</a> or <a rel="nofollow" href="http://cran.r-project.org/web/packages/ascii/index.html">ascii</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/09/managing-a-statistical-analysis-project-guidelines-and-best-practices/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
		<item>
		<title>A nice link: &#8220;Some hints for the R beginner&#8221;</title>
		<link>http://www.r-statistics.com/2010/03/nice-link-some-hints-for-the-r-beginner/</link>
		<comments>http://www.r-statistics.com/2010/03/nice-link-some-hints-for-the-r-beginner/#comments</comments>
		<pubDate>Sun, 07 Mar 2010 20:47:31 +0000</pubDate>
		<dc:creator>Tal Galili</dc:creator>
				<category><![CDATA[R]]></category>
		<category><![CDATA[R links]]></category>
		<category><![CDATA[hints]]></category>
		<category><![CDATA[link]]></category>
		<category><![CDATA[r help]]></category>
		<category><![CDATA[r tips]]></category>
		<category><![CDATA[R tutorial]]></category>
		<category><![CDATA[tutorial]]></category>

		<guid isPermaLink="false">http://www.r-statistics.com/?p=189</guid>
		<description><![CDATA[Patrick Burns just posted to the mailing list the following message: There is now a document called &#8220;Some hints for the R beginner&#8221; whose purpose is to get people up and running with R as quickly as possible. Direct access to it is: http://www.burns-stat.com/pages/Tutor/hints_R_begin.html JRR Tolkien wrote a story (sans hobbits) called &#8216;Leaf by Niggle&#8217; [...]]]></description>
			<content:encoded><![CDATA[<p>Patrick Burns just posted to the mailing list the following message:</p>
<blockquote><p>There is now a document called &#8220;Some hints for the R beginner&#8221; whose purpose is to get people up and running with R as quickly as possible.</p>
<p>Direct access to it is:<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html">http://www.burns-stat.com/pages/Tutor/hints_R_begin.html</a></p>
<p>JRR Tolkien wrote a story (sans hobbits) called &#8216;Leaf by Niggle&#8217; that has always resonated with me.  I offer you an imperfect, incomplete tree (but my roof is intact).</p>
<p>Suggestions for improvements are encouraged.</p></blockquote>
<p>And here is the link tree for the document (for easy review of the offered content):</p>
<p><span id="more-189"></span></p>
<p><strong>This page has several sections, which can be grouped into four categories: General, Objects, Actions, Help.</strong></p>
<p><strong>General<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#intro">Introduction </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#blankscreen">Blank screen syndrome </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#langs">Misconceptions because of a previous language </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#compenv">Helpful computer environments </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#jargon">R vocabulary </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#epilogue">Epilogue </a></strong></p>
<p><strong>Objects<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#keyobjects">Key objects </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#readingdata">Reading data into R </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#seeingobjects">Seeing objects </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#savingobjects">Saving objects </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#magic">Magic functions, magic objects </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#filetypes">Some file types </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#packages">Packages </a></strong></p>
<p><strong>Actions<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#startup">What happens at R startup </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#keyactions">Key actions </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#errors">Errors and such </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#graphics">Graphics </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#vectorize">Vectorization </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#makemistakes">Make mistakes on purpose </a></strong></p>
<p><strong>Help<br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#readhelp">How to read a help file </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#search">Searching for functionality </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#documents">Some other documents </a><br />
<a href="http://www.burns-stat.com/pages/Tutor/hints_R_begin.html#Rhelp">R-help mailing list </a></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.r-statistics.com/2010/03/nice-link-some-hints-for-the-r-beginner/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

