Election tRends: An interactive US election tracker (using Shiny and Plotly)

Guest post by Jonathan Sidi

Introduction

The US primaries are coming on fast, with almost 120 days left until the conventions. After building a Shiny app for the Israeli elections, I decided to update its features and try out plotly within the Shiny framework.

As a casual voter, I find that gauging the true temperature of the political landscape from the overwhelming abundance of polling is a heavy task. Polling data is published continuously during the state primaries, and the variety of pollsters makes it hard to keep track of what is going on. The app updates itself using data published publicly by realclearpolitics.com.

The app keeps track of polling trends and the delegate count daily for you. You can create a personal analysis, from the granular-level data all the way to distributions, using interactive ggplot2 and plotly graphs, and check out the general election polling to peek into the near future.

The app can be accessed in a couple of places. I set up an AWS instance to host the app for real-time use, and there is the GitHub repository, which is the maintained home of the app and is meant for members of the R community who can host Shiny locally.

Running the App through Github

(github repo: yonicd/Elections)

#changing locale to run on Windows
if (Sys.info()[1] == "Windows") Sys.setlocale("LC_TIME","C") 
 
#check to see if libraries need to be installed (those already present are loaded)
libs=c("shiny","shinyAce","plotly","ggplot2","rvest","reshape2","zoo","stringr","scales","plyr","dplyr")
#use TRUE rather than T, which can be reassigned
for(lib in libs) if(!require(lib,character.only = TRUE)) install.packages(lib)
rm(lib,libs)
 
#run App
shiny::runGitHub("yonicd/Elections",subdir="USA2016/shiny")
 
#reset to original locale on Windows
if (Sys.info()[1] == "Windows") Sys.setlocale("LC_ALL")

Application Layout:

(see next section for details)

  1. Current Polling
  2. Election Analysis
  3. General Elections
  4. Polling Database

Usage Instructions:

Current Polling

  • The top row depicts the current accumulation of delegates by party and candidate in a step plot, with a horizontal reference line for the threshold each party requires to receive the nomination. The accumulation does not include superdelegates, since it is uncertain which way they will vote. Currently this dataset is updated offline, both because it is somewhat static and because the way the data is posted online forces the use of Selenium drivers. An action button will be added to let users refresh the data as needed.
  • The bottom row is a 7-day moving average of all polling results published at the state and national levels. The ribbon around the moving average is the moving standard deviation over the same window. This helps pick up any changes in uncertainty about how the voting public perceives the candidates. It can be seen that candidates with lower polling averages and increased variance trend up, while the opposite is true for the leading candidates, for whom voter uncertainty is a bad thing.
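
To make the bottom-row calculation concrete, here is a minimal base R sketch of a 7-day moving average with a moving standard deviation over the same window. The polling numbers are made up, the column names are hypothetical, and this is not the app's own code (the app lists zoo among its dependencies and may compute the statistics differently).

```r
# Hypothetical daily polling percentages for one candidate (made-up numbers)
set.seed(1)
polls <- data.frame(
  date = seq(as.Date("2016-02-01"), by = "day", length.out = 30),
  pct  = 35 + cumsum(rnorm(30, sd = 0.5))
)

window <- 7

# embed() returns one row per complete 7-day window
windows <- embed(polls$pct, window)

# Right-aligned rolling statistics: the first (window - 1) days have no full window
pad <- rep(NA_real_, window - 1)
polls$ma  <- c(pad, rowMeans(windows))
polls$msd <- c(pad, apply(windows, 1, sd))

# Ribbon bounds: moving average plus/minus one moving standard deviation
polls$lower <- polls$ma - polls$msd
polls$upper <- polls$ma + polls$msd

tail(polls, 3)
```

The `lower` and `upper` columns are what a ribbon layer (for example `geom_ribbon()` in ggplot2) would take as its `ymin` and `ymax`.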

Snapshot of Overview Plot


R 3.2.4 is released

R 3.2.4 (codename “Very Secure Dishes”) was released today. You can get the latest binaries version from here (or the .tar.gz source code from here). The full list of new features and bug fixes is provided below.

Upgrading to R 3.2.4 on Windows

If you are using Windows you can easily upgrade to the latest version of R using the installr package. Simply run the following code in Rgui:

install.packages("installr") # install 
setInternet2(TRUE)
installr::updateR() # updating R.

Running “updateR()” will detect whether a new R version is available and, if so, download and install it. There is also a step-by-step tutorial (with screenshots) on how to upgrade R on Windows using the installr package.

I try to keep the installr package updated and useful, so if you have any suggestions or remarks on the package, you are invited to open an issue on the GitHub page.

NEW FEATURES

  • install.packages() and related functions now give a more informative warning when an attempt is made to install a base package.
  • summary(x) now prints with less rounding when x contains infinite values. (Request of PR#16620.)
  • provideDimnames() gets an optional unique argument.
  • shQuote() gains type = "cmd2" for quoting in cmd.exe in Windows. (Response to PR#16636.)
  • The data.frame method of rbind() gains an optional argument stringsAsFactors (instead of only depending on getOption("stringsAsFactors")).
  • smooth(x, *) now also works for long vectors.
  • tools::texi2dvi() has a workaround for problems with the texi2dvi script supplied by texinfo 6.1.

    It extracts more error messages from the LaTeX logs when in emulation mode.
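
As a small illustration of one item above, the following sketch shows what the `unique` argument of provideDimnames() changes. The matrix is a made-up example, chosen so that the default base names (`LETTERS`) have to be recycled.

```r
# A 2 x 30 matrix has more columns than there are LETTERS (26),
# so the default base names must be recycled
m <- matrix(0, nrow = 2, ncol = 30)

recycled <- provideDimnames(m, unique = FALSE)
distinct <- provideDimnames(m, unique = TRUE)   # the default

# With unique = FALSE the recycled column names repeat
anyDuplicated(colnames(recycled)) > 0

# With unique = TRUE the repeats are made distinct by a numeric suffix
anyDuplicated(colnames(distinct)) == 0
tail(colnames(distinct), 4)
```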

UTILITIES

  • R CMD check will leave a log file ‘build_vignettes.log’ from the re-building of vignettes in the ‘.Rcheck’ directory if there is a problem, and always if environment variable _R_CHECK_ALWAYS_LOG_VIGNETTE_OUTPUT_ is set to a true value.

DEPRECATED AND DEFUNCT

  • Use of SUPPORT_OPENMP from header ‘Rconfig.h’ is deprecated in favour of the standard OpenMP define _OPENMP.

    (This has been the recommendation in the manual for a while now.)

  • The make macro AWK which is long unused by R itself but recorded in file ‘etc/Makeconf’ is deprecated and will be removed in R 3.3.0.
  • The C header file ‘S.h’ is no longer documented: its use should be replaced by ‘R.h’.

BUG FIXES

  • kmeans(x, centers = <1-row>) now works. (PR#16623)
  • Vectorize() now checks for clashes in argument names. (PR#16577)
  • file.copy(overwrite = FALSE) would signal a successful copy when none had taken place. (PR#16576)
  • ngettext() now uses the same default domain as gettext(). (PR#14605)
  • array(.., dimnames = *) now warns about non-list dimnames and, from R 3.3.0, will signal the same error for invalid dimnames as matrix() has always done.
  • addmargins() now adds dimnames for the extended margins in all cases, as always documented.
  • heatmap() evaluated its add.expr argument in the wrong environment. (PR#16583)
  • require() etc now give the correct entry of lib.loc in the warning about an old version of a package masking a newer required one.
  • The internal deparser did not add parentheses when necessary, e.g. before [] or [[]]. (Reported by Lukas Stadler; additional fixes included as well).
  • as.data.frame.vector(*, row.names=*) no longer produces ‘corrupted’ data frames from row names of incorrect length, but rather warns about them. This will become an error.
  • url connections with method = "libcurl" are destroyed properly. (PR#16681)
  • withCallingHandlers() now (again) handles warnings even during S4 generic’s argument evaluation. (PR#16111)
  • deparse(..., control = "quoteExpressions") incorrectly quoted empty expressions. (PR#16686)
  • format()ting datetime objects ("POSIX[cl]?t") could segfault or recycle wrongly. (PR#16685)
  • plot.ts(<matrix>, las = 1) now does use las.
  • saveRDS(*, compress = "gzip") now works as documented. (PR#16653)
  • (Windows only) The Rgui front end did not always initialize the console properly, and could cause R to crash. (PR#16998)
  • dummy.coef.lm() now works in more cases, thanks to a proposal by Werner Stahel (PR#16665). In addition, it now works for multivariate linear models ("mlm", manova) thanks to a proposal by Daniel Wollschlaeger.
  • The as.hclust() method for "dendrogram"s failed often when there were ties in the heights.
  • reorder() and midcache.dendrogram() now are non-recursive and hence applicable to somewhat deeply nested dendrograms, thanks to a proposal by Suharto Anggono in PR#16424.
  • cor.test() now calculates very small p values more accurately (affecting the result only in extreme not statistically relevant cases). (PR#16704)
  • smooth(*, do.ends=TRUE) did not always work correctly in R versions between 3.0.0 and 3.2.3.
  • pretty(D) for date-time objects D now also works well if range(D) is (much) smaller than a second. In the case of only one unique value in D, the pretty range now is more symmetric around that value than previously.
    Similarly, pretty(dt) no longer returns a length 5 vector with duplicated entries for Date objects dt which span only a few days.
  • The figures in help pages such as ?points were accidentally damaged, and did not appear in R 3.2.3. (PR#16708)
  • available.packages() sometimes deleted the wrong file when cleaning up temporary files. (PR#16712)
  • The X11() device sometimes froze on Red Hat Enterprise Linux 6. It now waits for MapNotify events instead of Expose events, thanks to Siteshwar Vashisht. (PR#16497)
  • [dpqr]nbinom(*, size=Inf, mu=.) now works as limit case, for ‘dpq’ as the Poisson. (PR#16727)
    pnbinom() no longer loops infinitely in border cases.
  • approxfun(*, method="constant") and hence ecdf() which calls the former now correctly “predict” NaN values as NaN.
  • summary.data.frame() now displays NAs in Date columns in all cases. (PR#16709)
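
One of the fixes above is easy to check numerically: with `size = Inf`, the negative binomial parameterised by `mu` should coincide with the Poisson distribution with the same mean. A minimal sketch, assuming R 3.2.4 or later (the values of `x` and `mu` are arbitrary):

```r
x  <- 0:5
mu <- 2

# Limit case: infinite size with a fixed mean (requires R >= 3.2.4)
limit_case <- dnbinom(x, size = Inf, mu = mu)
poisson    <- dpois(x, lambda = mu)

all.equal(limit_case, poisson)

# A large finite size approaches the same limit
large_size <- dnbinom(x, size = 1e8, mu = mu)
max(abs(large_size - poisson))
```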

 


It’s not the p-values’ fault – reflections on the recent ASA statement (+relevant R resources)

Joint post by Yoav Benjamini and Tal Galili. The post highlights points raised by Yoav in his official response to the ASA statement (available on page 4 in the ASA supplemental tab), and offers a list of relevant R resources.

Summary

The ASA statement about the misuses of the p-value singles it out. The statement is just as relevant to the use of most other statistical methods: context matters, no single statistical measure suffices, specific thresholds should be avoided, and reporting should not be done selectively. The latter problem is discussed mainly in relation to omitted inferences. We argue that the selective reporting of inferences is a serious enough problem in our current industrialized science even when no omission takes place. Many R tools are available to address it, but they are mainly used in very large problems and are grossly underused in areas where lack of replicability hits hard.
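
As one example of such a tool, base R's p.adjust() adjusts a whole family of p-values together, so each inference is reported in light of all the inferences made. The sketch below uses made-up p-values and the Benjamini-Hochberg method; whether the full post lists this particular function among its resources is an assumption.

```r
# Hypothetical p-values from ten tests in one study (made-up for illustration)
p <- c(0.001, 0.008, 0.012, 0.041, 0.049, 0.09, 0.2, 0.35, 0.6, 0.9)

# Benjamini-Hochberg adjustment, which controls the false discovery rate
p_bh <- p.adjust(p, method = "BH")

# Raw and adjusted values side by side; adjusted values are never smaller
data.frame(raw = p, bh = round(p_bh, 3))
```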

p-values comic (source: xkcd)
