knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "75%",
  message = FALSE,
  warning = FALSE,
  dpi = 300
)

ropenscipackagesreviewed

This package contains a list of R packages in the rOpenSci registry and metadata on whether they have gone through the rOpenSci software review process (as of 2019-09-19).

Installation

You can install the development version of ropenscipackagesreviewed with:

# install.packages("devtools")
devtools::install_github("sharlagelfand/ropenscipackagesreviewed")

Analysis

The list of packages is available within the package, as ropenscipackagesreviewed::ropensci_packages, or as a CSV. The code used to collect the packages is also available.

If a package has a GitHub onboarding issue URL in the registry, it is considered to have gone through (or to be going through) rOpenSci software review. If the issue has the "6/approved" label on it, then I'm considering the review to be "Completed" (regardless of whether the issue is open or closed). If it doesn't, then the review is "In Progress".
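The rule above can be sketched as a small function. This is only an illustration under assumptions: `classify_review()` and its arguments are hypothetical names that are not part of the package, and the URL is a placeholder.

```r
# Hypothetical helper illustrating the classification rule; not package code.
classify_review <- function(onboarding_url, labels) {
  # A package counts as reviewed (or in review) only if it has an onboarding URL
  has_review <- !is.na(onboarding_url) && nzchar(onboarding_url)
  if (!has_review) {
    return(list(software_review = FALSE, software_review_status = NA_character_))
  }
  # The "6/approved" label decides the status, whether the issue is open or closed
  status <- if ("6/approved" %in% labels) "Completed" else "In Progress"
  list(software_review = TRUE, software_review_status = status)
}

# Placeholder inputs, made up for illustration
done    <- classify_review("https://example.com/issues/1", c("package", "6/approved"))
pending <- classify_review("https://example.com/issues/2", c("package", "3/reviewers-assigned"))
none    <- classify_review(NA_character_, character(0))
```

Here `done` is "Completed", `pending` is "In Progress", and `none` has no review at all.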

library(ropenscipackagesreviewed)

ropensci_packages
library(dplyr)

n_ropensci_packages <- nrow(ropensci_packages)
ropensci_packages_review <- ropensci_packages %>%
  filter(software_review)
ropensci_packages_review_completed <- ropensci_packages_review %>%
  filter(software_review_status == "Completed")
ropensci_packages_review_in_progress <- ropensci_packages_review %>%
  filter(software_review_status == "In Progress")
ropensci_packages_no_review <- ropensci_packages %>%
  filter(!software_review)

Of the `r n_ropensci_packages` packages in the rOpenSci registry, `r scales::percent(nrow(ropensci_packages_review)/n_ropensci_packages)` (`r nrow(ropensci_packages_review)`) have completed review (`r nrow(ropensci_packages_review_completed)`) or are still in review (`r nrow(ropensci_packages_review_in_progress)`).
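To make the arithmetic behind that sentence concrete, here is a minimal sketch on a made-up table. The `toy` data frame and its values are invented for illustration; only the column names mirror those in ropensci_packages.

```r
# Made-up data with the same column names as ropensci_packages
toy <- data.frame(
  name = c("a", "b", "c", "d"),
  software_review = c(TRUE, TRUE, FALSE, TRUE),
  software_review_status = c("Completed", "In Progress", NA, "Completed"),
  stringsAsFactors = FALSE
)

n_total       <- nrow(toy)
n_review      <- sum(toy$software_review)
n_completed   <- sum(toy$software_review_status == "Completed", na.rm = TRUE)
n_in_progress <- sum(toy$software_review_status == "In Progress", na.rm = TRUE)

# Share of packages that are reviewed or in review, formatted as a percentage
share <- n_review / n_total
share_label <- sprintf("%.0f%%", 100 * share)
share_label
#> [1] "75%"
```

The completed and in-progress counts add up to the number of packages with a review, which is the numerator of the percentage.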

The following shows the number of packages that went through review each year, using the date that the "6/approved" label was added to the issue. Note that in one case (the riem package), the onboarding issue has the "6/approved" label on it, but there is no corresponding event for adding this label. In this case, the date the issue was closed is used instead.
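The fallback described above (use the label date, or the issue closed date when no label event exists) could look roughly like this. The vectors `label_added` and `issue_closed` and all dates are made up; this is not necessarily how the package's data was built.

```r
# Made-up timestamps: the second entry has no "6/approved" label event
label_added  <- as.POSIXct(c("2018-03-01 12:00:00", NA), tz = "UTC")
issue_closed <- as.POSIXct(c("2018-03-02 09:00:00", "2016-09-15 10:00:00"), tz = "UTC")

# Start from the label date, then fill gaps with the issue closed date
review_completed <- label_added
missing_event <- is.na(review_completed)
review_completed[missing_event] <- issue_closed[missing_event]

# Year used for the per-year counts
review_year <- as.integer(format(review_completed, "%Y"))
```

Assigning into the existing vector keeps the date-time class, which `ifelse()` would drop.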

library(ggplot2)
library(lubridate)

ropensci_packages_review_completed <- ropensci_packages_review_completed %>%
  mutate(review_year = year(review_completed))

ropensci_packages_review_completed %>%
  count(review_year) %>%
  arrange(review_year) %>%
  ggplot(aes(x = review_year, y = n)) +
  geom_col() +
  geom_text(aes(label = n), vjust = -0.5) +
  labs(
    x = "Review Year",
    y = "Number of Packages",
    title = "rOpenSci Packages Completed Software Review, by Year"
  ) +
  theme_minimal(14)

We can also look at how long it takes packages to go through review. The following shows the distribution of days from when the issue was created to when the "6/approved" label was added to the issue.
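As a small illustration of measuring such a duration in days, the sketch below uses `difftime()` with explicit units, so the result does not depend on the units R picks automatically when two date-times are subtracted. The two timestamps are made up.

```r
# Made-up timestamps for a single package
issue_created    <- as.POSIXct("2019-01-01 00:00:00", tz = "UTC")
review_completed <- as.POSIXct("2019-01-31 12:00:00", tz = "UTC")

# Ask for days explicitly, then drop the difftime class
days_to_approval <- as.numeric(difftime(review_completed, issue_created, units = "days"))
days_to_approval
#> [1] 30.5
```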

ropensci_packages_review_completed <- ropensci_packages_review_completed %>%
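  # Request days explicitly, so the result does not depend on the units difftime picks automatically
  mutate(issue_created_to_review_completed = as.numeric(difftime(review_completed, issue_created, units = "days")))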

ggplot(
  ropensci_packages_review_completed,
  aes(x = issue_created_to_review_completed)
) +
  geom_histogram(binwidth = 30) +
  scale_x_continuous(breaks = seq(0, max(ropensci_packages_review_completed[["issue_created_to_review_completed"]]), 90)) +
  labs(
    x = "Number of Days",
    y = "Number of Packages",
    title = "rOpenSci Software Review Time",
    subtitle = "Number of days from GitHub issue created to package approved."
  ) +
  theme_minimal(14)
issue_created_to_review_completed_summary <- summary(ropensci_packages_review_completed[["issue_created_to_review_completed"]])

Of all rOpenSci packages that have completed software review, half were approved within `r round(issue_created_to_review_completed_summary[["Median"]], 1)` days of the onboarding issue being created.
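For readers unfamiliar with pulling one statistic out of `summary()`, here is a minimal sketch on made-up review durations (the `toy_days` values are invented):

```r
# Made-up review durations, in days
toy_days <- c(10, 20, 30, 40, 100)

toy_summary <- summary(toy_days)

# summary() returns a named vector, so the median can be extracted by name
toy_median <- toy_summary[["Median"]]
toy_median
#> [1] 30

# At least half of the values are at or below the median
share_within_median <- mean(toy_days <= toy_median)
```

The median is less sensitive than the mean to a few very long reviews, such as the 100-day value here.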

The following shows the median number of days it takes packages to go through review, by review year, along with the number of packages (shown above, but as a reminder!).

library(knitr)

ropensci_packages_review_completed %>%
  group_by(`Review Year` = review_year) %>%
  summarise(
    `Median Days from Issue Created to Review Completed` = round(median(issue_created_to_review_completed), 1),
    `Number of Packages` = n()
  ) %>%
  kable()


sharlagelfand/ropenscipackagesreviewed documentation built on Nov. 5, 2019, 8:51 a.m.