Tag Archives: statistics

Siegel-Tukey: a Non-parametric test for equality in variability (R code)

Daniel Malter just shared on the R mailing list (link to the thread) his code for performing the Siegel-Tukey (Nonparametric) test for equality in variability.
Excited about the find, I contacted Daniel asking if I could republish his code here, and he kindly replied “yes”.
From here on, I copy his note in full.

The R function can be downloaded from here.
Corrections and remarks can be added in the comments below, or on the GitHub code page.

* * * *

Statistics plugins for WordPress

Today I came across a post named “24 Noble WordPress Plugins To Determine The Performance of your Blog” through Weblog Tools Collection (one of my favorite places to stay updated on WordPress). The post provided a good, solid list of statistics plugins for WordPress. Some of them are too old to count (pun intended), others are much more recent and relevant.

As a statistics (and WordPress) lover myself, I was inspired to extend the list of WordPress statistics plugins in the hope of benefiting the community:
Blog Metrics
This plugin is based on ideas in an excellent post by Avinash Kaushik (whom I consider a web analytics guru and a brilliant blogger!).

It calculates:

  • Raw Author Contribution:
    • average number of posts per month
    • average number of words per post
  • Conversation Rate:
    • average number of comments per post, without your own comments
    • average number of words used in comments to posts

It calculates these values both for all the time you’ve been blogging and for the last month, and then adds them to a page on your WordPress dashboard.

[Screenshots: Blog Metrics for a single-author blog; Blog Metrics per author]
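To make the four averages listed above concrete, here is a minimal Python sketch of the arithmetic. It is not the plugin's code: the `Post` and `Comment` classes, the `blog_metrics` function and the sample data are all hypothetical, and the way months are counted (every calendar month from the first post to the last, inclusive) is an assumption of mine.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Comment:
    author: str
    text: str


@dataclass
class Post:
    author: str
    published: date
    text: str
    comments: list = field(default_factory=list)


def blog_metrics(posts, blog_author):
    """Compute the four averages for one author's posts (hypothetical helper)."""
    own = [p for p in posts if p.author == blog_author]
    if not own:
        return None
    # Assumption: count every calendar month from the first post to the last.
    first = min(p.published for p in own)
    last = max(p.published for p in own)
    months = (last.year - first.year) * 12 + (last.month - first.month) + 1
    # Conversation rate leaves out the author's own comments.
    comments = [c for p in own for c in p.comments if c.author != blog_author]
    comment_words = sum(len(c.text.split()) for c in comments)
    return {
        "posts_per_month": len(own) / months,
        "words_per_post": sum(len(p.text.split()) for p in own) / len(own),
        "comments_per_post": len(comments) / len(own),
        "words_per_comment": comment_words / len(comments) if comments else 0.0,
    }


# Made-up sample data: two posts by "me", with one self-reply to be ignored.
posts = [
    Post("me", date(2010, 1, 5), "one two three four",
         [Comment("ann", "nice post"), Comment("me", "thanks")]),
    Post("me", date(2010, 2, 20), "five six",
         [Comment("bob", "I agree with this")]),
]
metrics = blog_metrics(posts, "me")
print(metrics)
```

Running the same function on only the posts from the last month would give the "last month" column of the report.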

Search Meter

This plugin is a must for any blogger. Period.

If you have a Search box on your blog, Search Meter automatically records what people are searching for — and whether they are finding what they are looking for. Search Meter’s admin interface shows you what people have been searching for in the last couple of days, and in the last week or month. It also shows you which searches have been unsuccessful. If people search your blog and get no results, they’ll probably go elsewhere. With Search Meter, you’ll be able to find out what people are searching for, and give them what they want by creating new posts on those topics.  […]

Google Analytics Dashboard

Google Analytics Dashboard gives you the ability to view your Google Analytics data in your WordPress dashboard. You can also allow other users to see the same dashboard information when they are logged in, or embed parts of the data into posts or as part of your theme.

The biggest advantage of this plugin in my view is that it adds sparklines in the “posts -> edit” page in the admin area.

Analytics360
I don’t use this one much, but one feature I find interesting is that it marks when you posted something on the trend line of the Google Analytics traffic data. It also mixes in data from MailChimp, which I don’t use.

MailChimp’s Analytics360 plugin allows you to pull Google Analytics and MailChimp data directly into your dashboard, so you can access robust analytics tools without leaving WordPress.

Broken Link Checker
This plugin is also a must.

This plugin will monitor your blog looking for broken links and let you know if any are found.

  • Monitors links in your posts, pages, the blogroll, and custom fields (optional).
  • Detects links that don’t work and missing images.
  • Notifies you on the Dashboard if any are found.
  • Also detects redirected links.
  • Makes broken links display differently in posts (optional).
  • Link checking intervals can be configured.
  • New/modified posts are checked ASAP.
  • You can view broken links, redirects, and a complete list of links used on your site in the Tools -> Broken Links tab.
  • Searching and filtering links by URL, anchor text and so on is also possible.
  • Each link can be edited or unlinked directly via the plugin’s page, without manually editing each post.

Piwik + WP-Piwik

This plugin adds a Piwik stats site to your WordPress dashboard. It’s also able to add the Piwik tracking code to your blog.
Piwik is an open source (GPL licensed) web analytics software program. It provides you with detailed real time reports on your website visitors: the search engines and keywords they used, the language they speak, your popular pages and so on…

You can install Piwik more or less like you install WordPress, and then you are left to integrate it into your blog. The only real downside for me (compared to Google Analytics) is the lack of advanced segmentation and pivoting. But in general it is a free, great (and growing!) web analytics solution.

Woopra Analytics Plugin
I have been using Woopra since their release, thanks to Lorelle. I enjoy the ability to follow the live actions that are happening inside the blog. However, since Woopra went from beta to gold, I have lost most of my interest, because the blogs I track have more total traffic than Woopra allows in its free account. But small bloggers could find the service gratifying.

Woopra is the world’s most comprehensive, information rich, easy to use, real-time Web tracking and analysis application.

Features include:

  • Live Tracking and Web Statistics
  • A rich user interface and client monitoring application
  • Real-time Analytics
  • Manage Multiple Blogs and Websites
  • Deep analytic and search capabilities
  • Click-to-chat
  • Visitor and member tagging
  • Real-time notifications
  • Easy Installation and Update Notification

Final notes

If you are into web analytics, I also encourage you to give the following a try: Nuconomy, ClickTale, Crazy Egg. And of course, Google Analytics. Each of them (and also Woopra) strips you and your visitors of a bit more privacy. But that is the ultimate price we pay for the strong web analytics solutions that exist out there.
If you know of any more statistics plugins that I missed, feel encouraged to share them with me in the comments :)

Free statistics e-books for download

This post will eventually grow to hold a wide list of books on statistics (e-books, PDF books and so on) that are available for free download. But for now we’ll start off with just several books:

Several of these books were discovered through a CrossValidated discussion.

* * *

Know of any more e-books freely available for download? Please write to us about them in the comments.

What is R?

Highlights

  • R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. If you wish to download R, please choose your preferred CRAN mirror.
  • The R language has become a de facto standard among statisticians for the development of statistical software, and is widely used for statistical software development and data analysis.
  • Basic questions about R like how to download and install the software, or what the license terms are, are answered in the answers to frequently asked questions section.

Introduction to R

R is a language and environment for statistical computing and graphics.  R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible.

One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.

R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.

R and S

R is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues.  R can be considered as a different implementation of S.  There are some important differences, but much code written for S runs unaltered under R. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

The R environment

R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes

  • an effective data handling and storage facility,
  • a suite of operators for calculations on arrays, in particular matrices,
  • a large, coherent, integrated collection of intermediate tools for data analysis,
  • graphical facilities for data analysis and display either on-screen or on hardcopy, and
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.

The term “environment” is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.

R, like S, is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally-intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.

Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented. R can be extended (easily) via packages. There are about eight packages supplied with the R distribution and many more are available through the CRAN family of Internet sites covering a very wide range of modern statistics.

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both on-line in a number of formats and in hardcopy.

(credit: the R about page and the Wikipedia article R (programming language))