Tag Archives: R community

Top 20 R posts of 2011 (and some R-bloggers statistics)

R-bloggers.com is now two years young. The site is an (unofficial) online R journal written by bloggers who agreed to contribute their R articles to the site.
In this post I wish to celebrate R-bloggers’ second birthday by sharing with you:

  1. Links to the top 20 posts of 2011
  2. Statistics on “how well” R-bloggers did this year
  3. An invitation for sponsors/supporters to help keep the site alive

1. Top 20 R posts of 2011

R-bloggers’ success is largely owed to the content submitted by the R bloggers themselves.  The R community currently has almost 300 active R bloggers (links to the blogs are clearly visible in the right navigation bar on the R-bloggers homepage).  In the past year, these bloggers wrote over 2800 posts about R.

Here is a list of the most visited posts on the site in 2011:

  1. How much of R is written in R
  2. CPU and GPU trends over time
  3. Select operations on R data frames
  4. Getting started with Sweave: R, LaTeX, Eclipse, StatET, TeXlipse
  5. Delete rows from R data frame
  6. Amanda Cox on how The New York Times graphics department uses R
  7. Hipster programming languages
  8. Opendata, R, Google: easy maps
  9. New R-generated video: has Stack Overflow posting behavior changed over time
  10. SNA: visualising an email box with R
  11. 100 prisoners, 100 lines of code
  12. Google AI Challenge: languages used by the best programmers
  13. Basics on Markov chain for parents
  14. Top 10 algorithms in data mining
  15. A Million Random Digits: review of reviews
  16. Character occurrence in passwords
  17. Setting graph margins in R using the par function and lots of cow milk
  18. The new R compiler package in R 2.13.0: some first experiments
  19. Tutorial: principal components analysis (PCA) in R
  20. Making GUIs using C# and R with the help of R.NET

2. Statistics – how well did R-bloggers do this year

There are several metrics one can consider when evaluating the success of a website.  I’ll present a few of them here and will begin by talking about the visitors to the site.

This year, the site was visited by over 665,000 “Unique Visitors.”  There was a total of over 1.4 million visits and over 2.8 million page-views.  People have surfed the site from over 200 countries, with the greatest number of visitors coming from the United States (~40%), followed by the United Kingdom (6.9%), Germany (6.6%), Canada (4.7%), France (3.3%), and other countries.
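As a rough sketch of how these figures relate to one another, the snippet below derives visits per unique visitor and page-views per visit from the numbers quoted above. The variable names are hypothetical, and since each figure is reported only as “over” a rounded value, the ratios are approximations rather than exact site statistics.

```r
# Figures quoted in the post; each is stated as "over" this value,
# so treat them as rounded lower bounds (an assumption of this sketch).
unique_visitors <- 665000
visits          <- 1.4e6
page_views      <- 2.8e6

# Derived ratios (approximate, because the inputs are rounded).
visits_per_visitor <- visits / unique_visitors
pages_per_visit    <- page_views / visits

print(round(c(visits_per_visitor = visits_per_visitor,
              pages_per_visit    = pages_per_visit), 2))
```

Both ratios come out at roughly two: the average visitor came to the site about twice during the year and viewed about two pages per visit.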

The site has received between 15,000 and 45,000 visits a week in the past few months, and I suspect this number will remain stable in the next few months (unless something very interesting happens).

I believe this number will stay constant thanks to visitors’ loyalty: 55% of the site’s visits came from returning users.

Another indicator of reader loyalty is the number of subscribers to R-bloggers as counted by FeedBurner, which includes both RSS readers and e-mail subscribers.  The number of subscribers is estimated to be between 5,600 and 5,900.

Thus, I am very happy to see that R-bloggers continues to succeed in offering a real service to the global community of R users.

3. Invitation to sponsor/advertise on R-bloggers

This year I was sadly accused by Google AdSense of click fraud (which I did not commit, though I have no way of proving my innocence).  Therefore, I am no longer able to use Google AdSense to cover R-bloggers’ high monthly bills, and I have turned to relying on direct sponsorship of R-bloggers.

If you are interested in sponsoring/placing-ads/supporting R-bloggers, then you are welcome to contact me.

Happy new year!
Yours,
Tal Galili

UseR! 2011 slides and videos – on one page

I was recently reminded that the wonderful team at the University of Warwick made sure to put online many of the slides (and some videos) of talks from the recent useR! 2011 conference.  You can browse through the talks by going through the timetables (which will be the most up to date if more slides are added later), but I thought it might be more convenient for some of you to have the links to all the talks (with slides/videos) in one place.

I am grateful to all of the wonderful people who put their time into making such an amazing event (organizers, speakers, attendees), and also to the many speakers who made sure to share their talks/slides online for all of us to reference.  I hope this open-slides trend will continue in the upcoming useR conferences…

Below are all the links:

Tuesday 16th August

09:50 – 10:50

Kaleidoscope Ia, MS.03, Chair: Dieter Menne
Claudia Beleites: Spectroscopic Data in R and Validation of Soft Classifiers: Classifying Cells and Tissues by Raman Spectroscopy [Slides]
Jonathan Rosenblatt: Revisiting Multi-Subject Random Effects in fMRI [Slides]
Zoe Hoare: Putting the R into Randomisation [Slides]
Kaleidoscope Ib, MS.01, Chair: Simon Urbanek
Markus Gesmann: Using the Google Visualisation API with R [Slides]
Kaleidoscope Ic, MS.02, Chair: Achim Zeileis
David Smith: The R Ecosystem [Slides]
E. James Harner: Rc2: R collaboration in the cloud [Slides]

11:15 – 12:35

Portfolio Management, B3.02, Chair: Patrick Burns
Jagrata Minardi: R in the Practice of Risk Management Today [Slides]
Bioinformatics and High-Throughput Data, B3.03, Chair: Hervé Pagès
Thierry Onkelinx: AFLP: generating objective and repeatable genetic data [Slides]
High Performance Computing, MS.03, Chair: Stefan Theussl
Willem Ligtenberg: GPU computing and R [Slides]
Manuel Quesada: OBANSoft: integrated software for Bayesian statistics and high performance computing with R [Slides]
Reporting Technologies and Workflows, MS.01, Chair: Martin Mächler
Andreas Leha: The Emacs Org-mode: Reproducible Research and Beyond [Slides]
Teaching, MS.02, Chair: Jay G. Kerns
Ian Holliday: Teaching Statistics to Psychology Students using Reproducible Computing package RC and supporting Peer Review Framework [Slides]
Achim Zeileis: Automatic generation of exams in R [Slides]

14:00 – 14:45

Invited Talk, MS.01/MS.02, Chair: David Firth
Ulrike Grömping: Design of Experiments in R [Slides] [Video]

14:45 – 15:30

Invited Talk, MS.01/MS.02, Chair: David Firth
Jonathan Rougier: Nomograms for visualising relationships between three variables [Slides] [Video]

16:00 – 17:00

Modelling Systems and Networks, B3.02, Chair: Jonathan Rougier
Rachel Oxlade: An S4 Object structure for emulation – the approximation of complex functions [Slides]
Christophe Dutang: Computation of generalized Nash equilibria [Slides]
Visualisation, MS.04, Chair: Antony Unwin
Andrej Blejec: animatoR: dynamic graphics in R [Slides]
Richard M. Heiberger: Graphical Syntax for Structables and their Mosaic Plots [Slides]
Dimensionality Reduction and Variable Selection, MS.01, Chair: Matthias Schmid
Marie Chavent: ClustOfVar: an R package for the clustering of variables [Slides]
Jürg Schelldorfer: Variable Screening and Parameter Estimation for High-Dimensional Generalized Linear Mixed Models Using l1-Penalization [Slides]
Benjamin Hofner: gamboostLSS: boosting generalized additive models for location, scale and shape [Slides]
Business Management, MS.02, Chair: Enrico Branca
Marlene S. Marchena: SCperf: An inventory management package for R [Slides]
Pairach Piboonrungroj: Using R to test transaction cost measurement for supply chain relationship: A structural equation model [Slides]
Fabrizio Ortolani: Integrating R and Excel for automatic business forecasting

17:05 – 18:05

Lightning Talks (see below)

Lightning Talks

  • Community and Communication, MS.02, Chair: Ashley Ford
    • George Zhang: China R user conference [Slides]
    • Tal Galili: Blogging and R – present and future [Link]
    • Markus Schmidberger: Get your R application onto a powerful and fully-configured Cloud Computing environment in less than 5 minutes. [Slides]
    • Eirini Koutoumanou: Teaching R to Non Package Literate Users [Slides]
    • Randall Pruim: Teaching Statistics using the mosaic Package [Slides]
  • Statistics and Programming, MS.01, Chair: Elke Thönnes
    • Toby Dylan Hocking: Fast, named capture regular expressions in R2.14 [Slides]
    • John C. Nash: Developments in optimization tools for R [Slides]
    • Christophe Dutang: A Unified Approach to fit probability distributions [Slides]
  • Package Showcase, MS.03, Chair: Jennifer Rogers
    • James Foadi: cRy: statistical applications in macromolecular crystallography [Slides]
    • Emilio López: Six Sigma is possible with R [Slides]
    • Jonathan Clayden: Medical image processing with TractoR [Slides]
    • Richard A. Bilonick: Using merror 2.0 to Analyze Measurement Error and Determine Calibration Curves [Slides]

Wednesday 17th August

09:00 – 09:50

Invited Talk, MS.01/MS.02, Chair: Ioannis Kosmidis
Lee E. Edlefsen: Scalable Data Analysis in R [Slides] [Video]

11:15 – 12:35

Spatio-Temporal Statistics, B3.02, Chair: Julian Stander
Nikolaus Umlauf: Structured Additive Regression Models: An R Interface to BayesX [Slides]
Molecular and Cell Biology, B3.03, Chair: Andrea Foulkes
Matthew Nunes: Summary statistics selection for ABC inference in R [Slides]
Maarten van Iterson: Power and minimal sample size for multivariate analysis of microarrays [Slides]
Mixed Effect Models, MS.03, Chair: Douglas Bates
Ulrich Halekoh: Kenward-Roger modification of the F-statistic for some linear mixed models fitted with lmer [Slides]
Marco Geraci: lqmm: Estimating Quantile Regression Models for Independent and Hierarchical Data with R [Slides]
Kenneth Knoblauch: Mixed-effects Maximum Likelihood Difference Scaling [Slides]
Programming, MS.01, Chair: Uwe Ligges
Ray Brownrigg: Tricks and Traps for Young Players [Slides]
Friedrich Schuster: Software design patterns in R [Slides]
Patrick Burns: Random input testing with R [Slides]
Data Mining Applications, MS.02, Chair: Przemysław Biecek
Stephan Stahlschmidt: Predicting the offender’s age
Daniel Chapsky: Leveraging Online Social Network Data and External Data Sources to Predict Personality [Slides]

14:45 – 15:30

Invited Talk, MS.01/MS.02, Chair: John Aston
Brandon Whitcher: Quantitative Medical Image Analysis [Slides] [Video]

16:00 – 17:00

Development of R, B3.02, Chair: John C. Nash
Andrew R. Runnalls: Interpreter Internals: Unearthing Buried Treasure with CXXR [Slides]
Geospatial Techniques, B3.03, Chair: Roger Bivand
Binbin Lu: Converting a spatial network to a graph in R [Slides]
Rainer M Krug: Spatial modelling with the R-GRASS Interface [Slides]
Daniel Nüst: sos4R – Accessing SensorWeb Data from R [Slides]
Genomics and Bioinformatics, MS.03, Chair: Ramón Diaz-Uriarte
Sebastian Gibb: MALDIquant: Quantitative Analysis of MALDI-TOF Proteomics Data [Slides]
Regression Modelling, MS.01, Chair: Cristiano Varin
Bettina Grün: Beta Regression: Shaken, Stirred, Mixed, and Partitioned [Slides]
Rune Haubo B. Christensen: Regression Models for Ordinal Data: Introducing R-package ordinal [Slides]
Giuseppe Bruno: Multiple choice models: why not the same answer? A comparison among LIMDEP, R, SAS and Stata [Slides]
R in the Business World, MS.02, Chair: David Smith
Derek McCrae Norton: Odysseus vs. Ajax: How to build an R presence in a corporate SAS environment [Slides]

17:05 – 18:05

Hydrology and Soil Science, B3.02, Chair: Thomas Petzoldt
Wayne Jones: GWSDAT (GroundWater Spatiotemporal Data Analysis Tool) [Slides]
Pierre Roudier: Visualisation and modelling of soil data using the aqp package [Slides]
Biostatistical Modelling, B3.03, Chair: Holger Hoefling
Annamaria Guolo: Higher-order likelihood inference in meta-analysis using R [Slides]
Cristiano Varin: Gaussian copula regression using R [Slides]
Psychometrics, MS.03, Chair: Yves Rosseel
Florian Wickelmaier: Multinomial Processing Tree Models in R [Slides]
Basil Abou El-Komboz: Detecting Invariance in Psychometric Models with the psychotree Package [Slides]
Multivariate Data, MS.01, Chair: Peter Dalgaard
John Fox: Tests for Multivariate Linear Models with the car Package [Slides]
Julie Josse: missMDA: a package to handle missing values in and with multivariate exploratory data analysis methods [Slides]
António Pedro Duarte Silva: MAINT.DATA: Modeling and Analysing Interval Data in R [Slides]
Interfaces, MS.02, Chair: Matthew Shotwell
Xavier de Pedro Puente: Web 2.0 for R scripts and workflows: Tiki and PluginR [Slides]
Sheri Gilley: A new task-based GUI for R [Slides]

Thursday 18th August

09:00 – 09:45

Invited Talk, MS.01/MS.02, Chair: Julia Brettschneider
Wolfgang Huber: Genomes and phenotypes [Slides] [Video]

09:50 – 10:50

Financial Models, B3.02, Chair: Giovanni Petris
Peter Ruckdeschel: (Robust) Online Filtering in Regime Switching Models and Application to Investment Strategies for Asset Allocation [Slides]
Ecology and Ecological Modelling, B3.03, Chair: Karline Soetaert
Christian Kampichler: Using R for the Analysis of Bird Demography on a Europe-wide Scale [Slides]
John C. Nash: An effort to improve nonlinear modeling practice [Slides]
Generalized Linear Models, MS.03, Chair: Kenneth Knoblauch
Ioannis Kosmidis: brglm: Bias reduction in generalized linear models [Slides]
Merete K. Hansen: The binomTools package: Performing model diagnostics on binomial regression models [Slides]
Reporting Data, MS.01, Chair: Martyn Plummer
Sina Rüeger: uniPlot – A package to uniform and customize R graphics [Slides]
Alexander Kowarik: sparkTable: Generating Graphical Tables for Websites and Documents with R [Slides]
Isaac Subirana: compareGroups package, updated and improved [Slides]
Process Optimization, MS.02, Chair: Tobias Verbeke
Emilio López: Six Sigma Quality Using R: Tools and Training [Slides]
Thomas Roth: Process Performance and Capability Statistics for Non-Normal Distributions in R [Slides]

11:15 – 12:35

Inference, B3.02, Chair: Peter Ruckdeschel
Henry Deng: Density Estimation Packages in R [Slides]
Population Genetics and Genetics Association Studies, B3.03, Chair: Martin Morgan
Benjamin French: Simple haplotype analyses in R [Slides]
Neuroscience, MS.03, Chair: Brandon Whitcher
Karsten Tabelow: Statistical Parametric Maps for Functional MRI Experiments in R: The Package fmri [Slides]
Data Management, MS.01, Chair: Barry Rowlingson
Susan Ranney: It’s a Boy! An Analysis of Tens of Millions of Birth Records Using R [Slides]
Joanne Demmler: Challenges of working with a large database of routinely collected health data: Combining SQL and R [Slides]
Interactive Graphics in R, MS.02, Chair: Paul Murrell
Richard Cotton: Easy Interactive ggplots [Slides]

14:00 – 15:00

Kaleidoscope IIIa, MS.03, Chair: Adrian Bowman
Thomas Petzoldt: Using R for systems understanding – a dynamic approach [Slides]
David L. Miller: Using multidimensional scaling with Duchon splines for reliable finite area smoothing [Slides]
Alastair Sanderson: Studying galaxies in the nearby Universe, using R and ggplot2 [Slides]
Kaleidoscope IIIb, MS.02, Chair: Frank Harrell
Paul Murrell: Vector Image Processing [Slides]

 

The present and future of the R blogosphere (~7 minute video from useR2011)

This is (roughly) the lightning talk I gave at useR! 2011. If you are a reader of R-bloggers.com then this talk is not likely to tell you anything new. However, if you have a friend, colleague or student who is a new user of R, this talk will offer them a decent introduction to what the R blogosphere is all about.

The talk is a call for people of the R community to participate more in reading, writing and interacting with blogs.

I was encouraged to record this talk at the request of Chel Hee Lee, so that it could be used at the recent useR conference in Korea (2011).

The talk (briefly) goes through:

  1. The widespread influence of the R blogosphere
  2. What R bloggers write about
  3. How to encourage a blogger you enjoy reading to keep writing
  4. How to start your own R blog (just go to wordpress.com)
  5. Basic tips about writing a blog
  6. One piece of advice about marketing your R blog (add it to R-bloggers.com)
  7. And two thoughts about the future of R blogging (more bloggers and readers, and more interactive online visualization)

My apologies for any glitches in my English. For more talks about R, you can visit the R user groups blog. I hope more speakers from useR! 2011 will consider uploading their talks online.

Call for proposals for writing a book about R (via Chapman & Hall/CRC)

Rob Calver wrote an interesting invitation on the R mailing list today, inviting potential authors to submit their vision of the next great book about R. The announcement originated from the Chapman & Hall/CRC publishing houses, backed up by an impressive team of R celebrities, chosen as the editors of this new R book series, including:

Below is the complete announcement:
Continue reading

R-bloggers in 2010: Top 14 R posts, site statistics and invitation for sponsors

A year ago (on December 9th 2009), I wrote about founding R-bloggers.com, an (unofficial) online R journal written by bloggers who agreed to contribute their R articles to the site.

In this post I wish to celebrate R-bloggers’ first birthday by sharing with you:

  1. Links to the top 14 posts of 2010
  2. Reflections about the origin of R-bloggers
  3. Statistics on “how well” R-bloggers did this year
  4. Links to other related projects
  5. An invitation for sponsors/supporters to help keep the site alive

Continue reading

A competition to recommend “relevant” R packages – and the future of R

Update: the competition was just launched.
* * *

What is the competition about?

Drew Conway and John Myles White have collected data from (52) R users about the packages they have installed. The data is now available on GitHub for download and the contest will be run on the Kaggle platform.

For more details, head over to dataists.

And for fun, here is the dependency graph for R packages they have assembled so far:

A graphical visualization of packages’ “suggestion” relationships. Affectionately referred to as the R Flying Spaghetti Monster. More info below.

A tiny bit more on R bloggers virality

Continue reading

R syntax highlighting for bloggers on WordPress.com

Good news for R bloggers who are using WordPress.com to host their blog.

This week, the good people running WordPress.com (special thanks go to Yoav Farhi) have added the ability for all users of the WordPress.com platform to highlight R code inside their posts.

Basically you’ll need to wrap the code in your post like this:

[sourcecode language="r"]
test.function = function(r) {
    return(pi * r^2)
}
test.function(1)
[/sourcecode]

(Which will then look like this: [image: R syntax highlighted code example])

Further details (and other supported languages) can be read about on this WordPress.com support page.

This new feature was made possible thanks to the work of Yihui Xie (who created the famous animation package for R), who created an R syntax brush for the SyntaxHighlighter WordPress plugin (the plugin used by WordPress.com for syntax highlighting). Thanks should also go to Andrew Redd, the creator of NppToR (which connects Notepad++ to R). He both made some good suggestions and was game to take on the brush creation in case there were problems (which, thankfully, there haven’t been so far).

P.S.: If you are a WordPress.org user (i.e., you have a self-hosted WordPress blog) and want to enable R syntax highlighting for your blog, I would recommend the WP-Syntax plugin (enhanced with GeSHi version 1.0.8.6), which can be downloaded here.

Open source and money – why paying R developers might not always help the project

This post can be summed up in two sentences: “We can’t buy love. Starting to pay for love could make it disappear”, while at the same time, “We need money to live and love”. These two conflicting forces, with relation to open source, are the topic of this post.

This post is directed to the community of R users but is relevant to people of all open source projects. It deals with the question of open source projects and funding. Specifically, should a community of open source developers and users, once it exists, want to start raising/donating money to the main code contributors?

The conflict arises when, on the one side, we intuitively wish to repay the people who have helped us but worry about the implications of behavioral studies that suggest that doing so might destroy the motivation of the developers to continue working without constantly getting paid, and that the shift from doing something for one reason (whatever it is) to doing it for money might not easily be reversed.
On the other side, developers need to make a (good) living, and we (as a community) should strive for them to be well paid.
How can these two be reconciled?

This article won’t offer a decisive conclusion – my hope is to invite discussion on the matter (from both amateurs and professionals in the field of open source and behavioral economics) so as to give people more ideas to base their opinions on.

Update: this post was substantially updated from its original version, thanks to responses both in the comments, and especially in the e-mails. I apologize for writing a post that needed so many corrections, and at the same time I am grateful to all the people who took the time to shed light in places where I was wrong.

* * * *

Motivation: R has issues – how do we get them fixed?

In the past two weeks there has been a raging debate regarding the future of R (hint: “what is R”). Without going deeper into the topic (I already wrote about it here, where you too can go and respond), I’ll sum up the issue with a quote from Ross Ihaka (one of the two founders of R) who recently wrote:

I’ve been worried for some time that R isn’t going to provide the base that we’re going to need for statistical computation in the future. (It may well be that the future is already upon us.) There are certainly efficiency problems (speed and memory use), but there are more fundamental issues too. Some of these were inherited from S and some are peculiar to R.

After this, several discussion threads were started around the web (for example: 0, 1, 2, 3, 4, 5, 6), but then a comment was made on the R-help mailing list by Jaroslaw Piskorski who wrote:

A few days ago Tal Galili posted a message about some controversies concerning the future of R. Having read the discussions, especially those following Ross Ihaka’s post, I have come to the conclusion, that, as usual, the problem is money. I doubt there would be discussions about dropping R in its present form if the R-Foundation were properly funded and could hire computer scientists, programmers and statisticians. If a commercial company is able to provide big-database and multicore solutions, then so would a properly founded R-Foundation.

To which my response is: I strongly disagree with this statement.
That is, I do agree that money could help with things. It could be that money could be a part of the solution. But I doubt that the core of this problem is money. Nor that it would be solved if we could only now hire “computer scientists, programmers and statisticians” (although that could be part of the solution).

And the reason I am doubtful stems from two sources:

Continue reading