In clement-lee/rackage: Network Analysis of Dependencies of CRAN Packages

This vignette introduces the functions that facilitate the analysis of the dependencies of CRAN packages, specifically get_dep() and df_to_graph().

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(crandep)
library(dplyr)
library(igraph)

One or multiple types of dependencies

To obtain information about the various kinds of dependencies of a package, we can use the function get_dep(), which takes the package name as its first argument and the type of dependencies as its second. Currently, the second argument accepts a character vector of one or more of the following words: Depends, Imports, LinkingTo, Suggests, Enhances. Letter case does not matter, and LinkingTo may also be written as Linking_To or Linking To.

get_dep("dplyr", "Imports")
get_dep("MASS", c("depends", "suggests"))
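The accepted spellings suggest that the type argument is normalised before it is used. The following is only a base-R sketch of what such a normalisation could look like. normalise_type() is a hypothetical helper, not a function of 'crandep', and the package's own handling may differ.

```r
# Hypothetical helper (not part of 'crandep'): map user-supplied spellings of
# dependency types to lower-case labels such as "linking to".
normalise_type <- function(type) {
  x <- tolower(type)                     # ignore letter case
  x <- gsub("[_ ]", "", x)               # drop underscores and spaces
  x[x == "linkingto"] <- "linking to"    # one canonical spelling
  allowed <- c("depends", "imports", "linking to", "suggests", "enhances")
  stopifnot(all(x %in% allowed))         # reject anything unrecognised
  unique(x)
}

normalise_type(c("Depends", "LINKING_TO", "Linking To", "suggests"))
```

The three spellings of LinkingTo collapse to a single label, so the helper returns one entry per distinct type.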

For more information on different types of dependencies, see the official guidelines and https://r-pkgs.org/description.html.

In the output, the column type holds the type of the dependency in lower case. For consistency, LinkingTo is reported as "linking to". Both spellings below are accepted.

get_dep("xts", "LinkingTo")
get_dep("xts", "linking to")

For the reverse dependencies, instead of including the prefix "Reverse " in type, we use the argument reverse:

get_dep("abc", "depends", reverse = TRUE)
get_dep("xts", "linking to", reverse = TRUE)

Theoretically, for each forward dependency

data.frame(from = "A", to = "B", type = "c", reverse = FALSE)

there should be an equivalent reverse dependency

data.frame(from = "B", to = "A", type = "c", reverse = TRUE)

Aligning the type in the forward and reverse dependencies enables this to be checked easily.
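As a minimal sketch of such a check, the toy data frames below stand in for the output of get_dep() with reverse = FALSE and reverse = TRUE. The package names and the helper key() are made up for illustration.

```r
# Toy data for illustration only; real input would come from get_dep().
forward <- data.frame(from = c("A", "A"), to = c("B", "C"),
                      type = c("imports", "suggests"), reverse = FALSE,
                      stringsAsFactors = FALSE)
backward <- data.frame(from = c("B", "C"), to = c("A", "A"),
                       type = c("imports", "suggests"), reverse = TRUE,
                       stringsAsFactors = FALSE)

# Flip the reverse edges so that they read like forward ones.
flipped <- data.frame(from = backward$to, to = backward$from,
                      type = backward$type, stringsAsFactors = FALSE)

# Compare the two edge sets via a key built from all three columns.
key <- function(df) sort(paste(df$from, df$to, df$type, sep = "|"))
identical(key(forward), key(flipped))
```

Because type uses the same labels in both directions, no recoding is needed before the comparison.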

To obtain all types of dependencies, we can pass "all" as the second argument, instead of typing a character vector of all five words:

df0.rstan <- get_dep("rstan", "all")
dplyr::count(df0.rstan, type)
df1.rstan <- get_dep("rstan", "all", reverse = TRUE) # too many rows to display
dplyr::count(df1.rstan, type) # hence the summary using count()

Building and visualising a dependency network

To build a dependency network, we have to obtain the dependencies for multiple packages. For illustration, we choose the core packages of the tidyverse, and find out what each package Imports. We put all the dependencies into one data frame, in which the package in the from column imports the package in the to column. This is essentially the edge list of the dependency network.

df0.imports <- rbind(
  get_dep("ggplot2", "Imports"),
  get_dep("dplyr", "Imports"),
  get_dep("tidyr", "Imports"),
  get_dep("readr", "Imports"),
  get_dep("purrr", "Imports"),
  get_dep("tibble", "Imports"),
  get_dep("stringr", "Imports"),
  get_dep("forcats", "Imports")
)
head(df0.imports)
tail(df0.imports)
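The repeated calls above follow one pattern: fetch the dependencies of each package, then stack the results. A possible way to express this is sketched below. collect_deps() and toy_fetch() are hypothetical names, and toy_fetch() is only an offline stand-in for get_dep() that returns made-up rows.

```r
# Hypothetical helper: call `fetch` once per package and stack the results.
collect_deps <- function(pkgs, type, fetch) {
  do.call(rbind, lapply(pkgs, function(p) fetch(p, type)))
}

# Offline stand-in for get_dep(), returning one made-up edge per package.
toy_fetch <- function(pkg, type) {
  data.frame(from = pkg, to = "rlang", type = tolower(type),
             reverse = FALSE, stringsAsFactors = FALSE)
}

edges <- collect_deps(c("dplyr", "tidyr"), "Imports", toy_fetch)
edges
```

With 'crandep' attached and an internet connection, passing get_dep as the fetch argument together with the eight package names would be expected to produce the same edge list as the rbind() call above.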

With the help of the 'igraph' package, we can use this data frame to build a graph object that represents the dependency network.

g0.imports <- igraph::graph_from_data_frame(df0.imports)
set.seed(1457L)
old.par <- par(mar = rep(0.0, 4))
plot(g0.imports, vertex.label.cex = 1.5)
par(old.par)

The nature of a dependency network makes it a directed acyclic graph (DAG). We can use the 'igraph' function is_dag() to check.

igraph::is_dag(g0.imports)

Note that acyclicity is expected only for a network of Imports (and Depends), owing to the nature of these dependency types. It does not necessarily hold for a network of, for example, Suggests.
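To see what is_dag() reports when a cycle is present, consider the following toy example. The node names A, B and C are placeholders rather than real packages.

```r
library(igraph)

# Toy edge lists for illustration; the node names are placeholders.
acyclic <- data.frame(from = c("A", "B"), to = c("B", "C"),
                      stringsAsFactors = FALSE)
cyclic <- rbind(acyclic,
                data.frame(from = "C", to = "A", stringsAsFactors = FALSE))

is_dag(graph_from_data_frame(acyclic))  # no directed cycle
is_dag(graph_from_data_frame(cyclic))   # A -> B -> C -> A closes a cycle
```

A single edge pointing back to an earlier node is enough to break the acyclic property.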

Boundary and giant component

It is possible to set a boundary on the nodes to which the edges are directed, using the function df_to_graph(). Its second argument takes a data frame that lists these nodes in a column called name.

df0.nodes <-
  data.frame(
    name = c("ggplot2", "dplyr", "tidyr", "readr", "purrr", "tibble", "stringr", "forcats"),
    stringsAsFactors = FALSE
  )
g0.core <- df_to_graph(df0.imports, df0.nodes)
set.seed(259L)
old.par <- par(mar = rep(0.0, 4))
plot(g0.core, vertex.label.cex = 1.5)
par(old.par)
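The internals of df_to_graph() are not shown in this vignette. The base-R sketch below only illustrates the boundary idea as described above, namely keeping the edges whose to node is in the node list. It is an assumption about the concept, not the package's implementation, and the toy edges are made up.

```r
# Toy edge list and node list, for illustration only.
edges <- data.frame(from = c("dplyr", "dplyr", "tidyr"),
                    to   = c("tibble", "rlang", "dplyr"),
                    stringsAsFactors = FALSE)
nodes <- data.frame(name = c("dplyr", "tidyr", "tibble"),
                    stringsAsFactors = FALSE)

# Keep only the edges that point to a node inside the boundary.
inside <- edges[edges$to %in% nodes$name, ]
inside
```

The edge to rlang is dropped because rlang is not in the node list, while the two edges between listed packages remain.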

Going forward

In another vignette of this package, we show how to obtain the dependency network of all CRAN packages using other functions in the package. The number of reverse dependencies can then be modelled.

clement-lee/rackage documentation built on July 3, 2025, 7:30 p.m.
