Quickstart

suppressPackageStartupMessages(library("dplyr"))          # for tidy data manipulations
suppressPackageStartupMessages(library("magrittr"))       # for friendly piping
suppressPackageStartupMessages(library("network"))        # for plotting
suppressPackageStartupMessages(library("sna"))            # for plotting
suppressPackageStartupMessages(library("statnet.common")) # for plotting
suppressPackageStartupMessages(library("networkD3"))      # for plotting
suppressPackageStartupMessages(library("igraph"))         # for graph computations
suppressPackageStartupMessages(library("pkggraph"))       # attach the package
suppressMessages(init(local = TRUE))                      # initiate the package

```{R, eval = TRUE } get_neighborhood("mlr") # a tibble, every row indicates a dependency

observe only 'Imports' and reverse 'Imports'

neighborhood_graph("mlr", relation = "Imports") %>% plot()

observe the neighborhood of 'tidytext' package

get_neighborhood("tidytext") %>% make_neighborhood_graph() %>% plot()

interact with the neighborhood of 'tm' package

legend does not appear in the vignette, but it appears directly

neighborhood_graph("tm") %>% plotd3(700, 700)

which packages work as 'hubs' or 'authorities' in the above graph

neighborhood_graph("tidytext", type = "igraph") %>% extract2(1) %>% authority_score() %>% extract2("vector") %>% tibble(package = names(.), score = .) %>% top_n(10, score) %>% ggplot(aes(reorder(package, score), score)) + geom_bar(stat = "identity") + xlab("package") + ylab("score") + coord_flip()

# Introduction

> The package `pkggraph` aims to provide a consistent and intuitive platform to explore the dependencies of packages in CRAN like repositories.

The package attempts to strike a balance between two aspects:

 - Understanding characteristics of the repository, at repository level (relating to 'forest')
 - Discover relevant packages and their contribution (relating to 'trees')

So that, we do not *see trees for the forest* nor *see only a forest* !

# Important Features

The important features  of `pkggraph` are:

- Most functions return a three column `tibble` (`pkg_1`, `relation`, `pkg_2`). The first row in the table below indicates that `dplyr` package 'Imports' `assertthat` package.

```{R, eval = TRUE}
get_imports(c("dplyr", "tidyr"))

The five different types of dependencies a package can have over another are: Depends, Imports, LinkingTo, Suggests and Enhances.

init

Always, begin with init(). This creates two variables deptable and packmeta in the environment where it is called. The variables are created using local copy or computed after downloading from internet (when local = FALSE, the default value). It is suggested to use init(local = FALSE) to get up to date dependencies.

```{R, eval = FALSE} library("pkggraph") init(local = FALSE)

The `repository` argument takes CRAN, bioconductor and omegahat repositories. For other CRAN-like repositories not listed in `repository`, an additional argument named `repos` is required.

# `get` family

- These functions return a `tibble`
- All of them take `packages` as their first argument.
- All of them take `level` argument (Default value is 1).

```{R, eval = TRUE}
get_imports("ggplot2")

Lets observe packages that 'Suggest' knitr.

```{R, eval = TRUE} get_reverse_suggests("knitr", level = 1)

By setting `level = 2`, observe that packages from first level (first column of the previous table) and their suggestors are captured.

```{R}
get_reverse_suggests("knitr", level = 2)

What if we required to capture dependencies of more than one type, say both Depends and Imports?

get_all_dependencies and get_all_reverse_dependencies

These functions capture direct and reverse dependencies until the suggested level for any subset of dependency type.

get_all_dependencies("mlr", relation = c("Depends", "Imports"))
get_all_dependencies("mlr", relation = c("Depends", "Imports"), level = 2)

Observe that ada 'Depends' on rpart.

Sometimes, we would like to capture only specified dependencies recursively. In this case, at second level, say we would like to capture only 'Depends' and 'Imports' of packages which were dependents/imports of mlr. Then, set strict = TRUE.

```{R, eval = TRUE} get_all_dependencies("mlr" , relation = c("Depends", "Imports") , level = 2 , strict = TRUE)

Notice that `ada` was 'Suggest'ed by `mlr`. That is why, it appeared when `strict` was `FALSE`(default).

> What if we required to capture both dependencies and reverse dependencies until a specified level?

## `get_neighborhood`

This function captures both dependencies and reverse dependencies until a specified level for a given subset of dependency type.

```{R, eval = TRUE }
get_neighborhood("hash", level = 2)

get_neighborhood("hash", level = 2) %>% 
  make_neighborhood_graph %>% 
  plot()

Observe that testthat family appears due to Suggests. Lets look at Depends and Imports only:

```{R, eval = TRUE } get_neighborhood("hash" , level = 2 , relation = c("Imports", "Depends") , strict = TRUE) %>% make_neighborhood_graph %>% plot()

Observe that the graph below captures the fact: `parallelMap` 'Imports' `BBmisc`

```{R, eval = TRUE }
get_neighborhood("mlr", relation = "Imports") %>% 
  make_neighborhood_graph() %>% 
  plot()

get_neighborhood looks if any packages until the specified level have a dependency on each other at one level higher. This can be done turned off by setting interconnect = FALSE.

```{R, eval = TRUE } get_neighborhood("mlr", relation = "Imports", interconnect = FALSE) %>% make_neighborhood_graph() %>% plot()

# `neighborhood_graph` and `make_neighborhood_graph`

- `neighborhood_graph` creates a graph object of a set of packages of class `pkggraph`. This takes same arguments as `get_neighborhood` and additionally `type`. Argument `type` defaults to `igraph`. The alternative is `network`.

```{R, eval = TRUE }
neighborhood_graph("caret", relation = "Imports") %>% 
  plot()

make_neighborhood_graph accepts the output of any get_* as input and produces a graph object.

Essentially, you can get the information from get_ function after some trial and error, then create a graph object for further analysis or plotting.

```{R, eval = TRUE} get_all_reverse_dependencies("rpart", relation = "Imports") %>% make_neighborhood_graph() %>% plot()

# Checking dependencies and `relies`

For quick dependency checks, one could use infix operators: `%depends%`, `%imports%`, `%linkingto%`, `%suggests%`, `%enhances%`.

```{R, eval = TRUE}
"dplyr" %imports% "tibble"

A package A is said to rely on package B if A either 'Depends', 'Imports' or 'LinkingTo' B, recursively. relies function captures this.

```{R, eval = TRUE} relies("glmnet")[[1]]

level 1 dependencies of "glmnet" are:

get_all_dependencies("glmnet", relation = c("Imports", "Depends", "LinkingTo"))[[3]] "glmnet" %relies% "grid" reverse_relies("tokenizers")[[1]]

# `plot` and its handles

`plot` produces a static plot from a `pkggraph` object. The available handles are:

- The default: The node size is based on the number of 'in' and 'out' degree.

```{R, eval = TRUE}
pkggraph::neighborhood_graph("hash") %>%
  plot()
- Without variable node size and white 'background':
```{R, eval = TRUE}
pkggraph::neighborhood_graph("hash") %>%
  plot(nodeImportance = "none", background = "white")

plotd3

For interactive exploration of large graphs, plotd3 might be better than static plots. Note that,

```{R, eval = TRUE}

legend does not appear in the vignette, but it appears directly

plotd3(neighborhood_graph("tibble"), height = 1000, width = 1000) ```

Acknowledgement

Package authors Srikanth KS and Nikhil Singh would like to thank R core, Hadley Wickham for tidyverse framework and the fantastic R community!




talegari/pkggraph documentation built on May 6, 2019, 10:50 a.m.