knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 6 )

Since version 0.2 **cranly** includes functions for constructing and working with package dependence tree objects. Specifically, the packages that are requirements for a specified package (i.e. appear in `Depends`

, `Imports`

or `LinkingTo`

) are found, then the requirements for those packages are found, and so on. In essence, a package's dependence tree shows what else needs to be installed with the package in an empty package library with the package, and hence it can be used to
+ remove unnecessary dependencies that "drag" with them all sorts of other packages
+ identify packages that are heavy for the CRAN mirrors
+ produced some neat visuals for the package

`cranly_dependence_tree`

objectsConstructing `cranly_dependence_tree`

objects is straightforward once a package directives network has been derived.

Let's attach **cranly**

library("cranly")

and use an instance of the package directives network

package_network <- readRDS(url("https://raw.githubusercontent.com/ikosmidis/cranly/develop/inst/extdata/package_network.rds"))

from CRAN's state on `r format(attr(package_network, "timestamp"), usetz = TRUE)`

.

Alternatively, today's package directives network can be constructed by doing

cran_db <- clean_CRAN_db() package_network <- build_network(cran_db)

We can compute dependence trees for any package in CRAN using the function `compute_dependence_tree`

on the package directives network. For example the dependence tree of **brglm2** is

compute_dependence_tree(package_network, "brglm2")

and of **tibble** is

compute_dependence_tree(package_network, "tibble")

The resulting data frame, includes package names and a generation index. The generation of the named package is by default 0 and as we move back through the required packages and the requirements of those the generation index decreases by 1. I had **loads of fun** implementing `compute_dependence_tree`

, because the tree construction can be neatly and cleanly written as a recursion (see source code of `compute_dependence_tree`

), leveraging the advantages of functional programming (that's a different and long discussion, though).

The method `build_dependence_tree`

uses `compute_dependence_tree`

to construct and edge list for the dependence tree, that we can the visualize. For example for **tibble**

tibble_tree <- build_dependence_tree(package_network, "tibble") plot(tibble_tree)

The *package dependence index* is a rough measure of how much "baggage" an R package carries. The package dependence index is defined as the weighted average that averages across the generation index of the packages in the tree, with weights that are inversely proportional to the popularity of each package in terms of how many other packages depend on, link to or import it. Mathematically, the package dependence index is defined as
$$
-\frac{\sum_{i \in C_p; i \ne p} \frac{1}{N_i} g_i}{\sum_{i \in C_p; i \ne p} \frac{1}{N_i}}
$$
where $C_p$ is the dependence tree for the package(s) $p$,
$N_i$ is the total number of packages that depend, link or
import package $i$, and $g_i$ is the generation that
package $i$ appears in the dependence tree of package(s)
$p$. The generation takes values on the non-positive integers,
with the package(s) $p$ being placed at generation $0$, the
packages that $p$ links to, depends or imports at generation
$-1$ and so on.

For example, the package dependence index for all packages in the dependence tree of **betareg**

betareg_tree <- build_dependence_tree(package_network, "betareg") betareg_dep_index <- sapply(betareg_tree$nodes$package, function(package) { tree <- build_dependence_tree(package_network, package = package) s <- summary(tree) s$dependence_index }) sort(betareg_dep_index)

**Any scripts or data that you put into this service are public.**

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.