“Why do people contribute to the R?” – concolusions from a new PNAS article

tl;dr: People contribute to R for various reasons, which evolves with time. The main reasons appear to be: “fun coding”, personal commitment to the community, interaction with like-minded and/or important people  – leading to higher self-esteem, future job opportunities, a chance to express oneself and enjoyable social inclusion.

From the abstract

One of the cornerstones of the R system for statistical computing is the multitude of packages contributed by numerous package authors. This amount of packages makes an extremely broad range of statistical techniques and other quantitative methods freely available. Thus far, no empirical study has investigated psychological factors that drive authors to participate in the R project. This article presents a study of R package authors, collecting data on different types of participation (number of packages, participation in mailing lists, participation in conferences), three psychological scales (types of motivation, psychological values, and work design characteristics), and various socio-demographic factors. The data are analyzed using item response models and subsequent generalized linear models, showing that the most important determinants for participation are a hybrid form of motivation and the social characteristics of the work design. Other factors are found to have less impact or influence only specific aspects of participation.

Summary of results

R developers, statisticians, and psychologists from Harvard University, University of Vienna, WU Vienna University of Economics, and University of Innsbruck empirically studied psychosocial drivers of participation of R package authors. Through an online survey they collected data from 1,448 package authors. The questionnaire included psychometric scales (types of motivation, psychological values, work design), sociodemografic variables related to the work on R, and three participation measures (number of packages, participation in mailing lists, participation in conferences).

cranpnas-dia

The data were analyzed using item response models and subsequently generalized linear models (logistic regressions, negative-binomial regression) with SIMEX corrected parameters.

The analysis reveals that the most important determinants for participation are a hybrid form of motivation and the social characteristics of the work design. Hybrid motivation acknowledges that motivation is a complex continuum of intrinsic, extrinsic, and internalized extrinsic motives.
Motives evolve over time, as task characteristics shift from need-driven problem solving to mundane maintenance tasks within the R community.
For instance, motivation can evolve from pure “fun coding” towards a personal commitment with associated higher responsibilities within the community. The community itself provides a social work environment with high degrees of interaction, two facets of which are strong motivators. First, interaction with persons perceived as important increases one’s own reputation (self-esteem, future job opportunities, etc.) Second, interaction with alike minded persons (i.e., interested in solving statistical problems) creates opportunities to express oneself and enjoy social inclusion.

The findings do not substantiate the commonly held perception that people develop packages out of purely altruistic motives. It is also notable that in most cases package development is undertaken as part of an individual’s research, which is paid by an (academic) institution, rather than uncompensated developments that cut into leisure time.

Full paper (behind PNAS’s paywall for now) is available here:

Mair, P., Hofmann, E., Gruber, K., Hatzinger, R., Zeileis, A., and Hornik, K. (2015). Motivation, values, and work design as drivers of participation in the R
open source project for statistical computing. Proceedings of the National Academy of Sciences of the United States of America, 112(48), 14788-14792

 

2 thoughts on ““Why do people contribute to the R?” – concolusions from a new PNAS article”

  1. The ridiculous thing about this article is that the authors chose to publish in PNAS, which uses a ‘closed-access’ publication model, which is antithetical to the idea of an open-source, community-driven project such as R… Getting their names associated with a high-impact journal like PNAS was more important to the authors than sharing their findings with the widest audience possible!

    1. Hi Jonny,
      I don’t think it is that bad since:

      PNAS complies with the NIH Public Access Policy and extends access even further. PNAS automatically deposits the final, published version of all its content, regardless of funding, in PubMed Central (PMC) and makes it free at both PMC and PNAS just 6 months after publication.

      see: http://www.pnas.org/site/subscriptions/open-access.xhtml

      And to be fair, I can understand the authors’ preference to get a high impact publication in the expense of 6 months wait.

      Best,
      Tal

Leave a Reply