knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(pacs)
library(withr)
library(remotes)
pacs: A set of tools that make life easier for developers and maintainers of R packages, including validation of the library, packages, and renv lock files.
Hint0: An Internet connection is required to take full advantage of most features.
Hint1: Almost all calls that require an Internet connection are cached (for 30 minutes) by the memoise package, so a second invocation of the same command (with the same arguments) returns immediately. Restart the R session if you want to clear the cached data.
Hint2: The Version variable is mostly the minimal required version, i.e. max(version1, version2, ...).
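To illustrate why the minimal required version is a maximum, here is a small base R sketch. The version strings are invented for the example; the point is that when several packages each demand a different minimum version of the same dependency, the strictest (highest) demand is the one that must be satisfied, and that versions should be compared as versions rather than as plain strings.

```r
# Minimum versions of one dependency demanded by three hypothetical packages
demanded <- c("1.9.0", "1.10.0", "1.2.3")

# package_version() compares component by component, so 1.10.0 > 1.9.0
required_min <- max(package_version(demanded))
required_min

# An installed version can then be checked against the requirement
installed <- package_version("1.9.5")
installed >= required_min
```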
Hint3: When working with many packages, prefer the global functions, which retrieve data for many packages at once. For example, use pacs::checked_packages() over pacs::pac_checkpage (or pacs::pac_checkred), and utils::available.packages over pacs::pac_last. Most importantly, use pacs::lib_validate over pacs::pac_validate, pacs::pac_checkred, and others.
Hint4: The character string "all" is shorthand for the vector c("Depends", "Imports", "LinkingTo", "Suggests", "Enhances"), "most" stands for the same vector without "Enhances", and "strong" (the default) stands for the first three elements of that vector.
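The shorthand can be pictured with a small helper. Note that expand_fields is a hypothetical function written only for this illustration; it is not part of pacs and only mirrors the rule stated in the hint.

```r
# Hypothetical helper (not a pacs function): expands the shorthand strings
expand_fields <- function(fields = "strong") {
  all_fields <- c("Depends", "Imports", "LinkingTo", "Suggests", "Enhances")
  if (identical(fields, "all")) return(all_fields)
  if (identical(fields, "most")) return(setdiff(all_fields, "Enhances"))
  if (identical(fields, "strong")) return(all_fields[1:3])
  # Anything else is treated as an explicit vector of field names
  stopifnot(all(fields %in% all_fields))
  fields
}

expand_fields("strong")
expand_fields("most")
expand_fields(c("Imports", "Suggests"))
```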
Hint5: Use parallel::mclapply (Linux and Mac) or parallel::parLapply (Windows, Linux and Mac) to speed up loop calculations. Note, however, that under parallel::mclapply computation results are NOT cached by the memoise package. Warning: parallel computations might be unstable.
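A minimal sketch of the portable parallel::parLapply pattern follows. The slow_task function is a placeholder invented for this example; in practice it would wrap a per-package call, and the number of workers (2 here) is an arbitrary assumption.

```r
library(parallel)

pkgs <- c("dplyr", "shiny", "memoise")

# Placeholder for a per-package computation (assumption for this sketch)
slow_task <- function(pkg) nchar(pkg)

cl <- makeCluster(2)                    # works on Windows, Linux and Mac
res <- parLapply(cl, pkgs, slow_task)   # one task per package
stopCluster(cl)                         # always release the workers

names(res) <- pkgs
res
```

Each worker is a separate R process, so anything cached in the main session is not shared with the workers.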
Hint6: The withr and remotes packages are a valuable addition.
pacs::lib_validate()
This procedure will be crucial for R developers, as it clearly shows possibly broken packages inside the local library.
We can also assess which packages need to be updated.
Default validation of the library with the pacs::lib_validate function.
The fields argument is equal to c("Depends", "Imports", "LinkingTo") by default, as these are the dependencies installed when the install.packages function is used.
The full library validation requires activating two additional arguments, lifeduration and checkred. Both are turned off by default because they are time-consuming; the lifeduration assessment might take a few minutes for bigger libraries.
Assessing the status on the CRAN check pages takes only a few additional seconds, even for all CRAN packages. pacs::checked_packages() is used to gather all package check statuses for all CRAN servers.
pacs::lib_validate(checkred = list(scope = c("ERROR", "FAIL")))
When lifeduration is triggered, the assessment might take a few minutes.
pacs::lib_validate(lifeduration = TRUE, checkred = list(scope = c("ERROR", "FAIL")))
The scope field inside the checkred list is not the only one that can be adjusted; as a reminder, it accepts any of c("ERROR", "FAIL", "WARN", "NOTE"). We can also specify the flavors field inside the checkred list argument to narrow the tested machines. The full list of CRAN servers (flavors) can be obtained with pacs::cran_flavors()$Flavor.
flavs <- pacs::cran_flavors()$Flavor[1:2]
pacs::lib_validate(checkred = list(scope = c("ERROR", "FAIL"), flavors = flavs))
Packages that are not installed (and should be) or have too low a version:
lib <- pacs::lib_validate(checkred = list(scope = c("ERROR", "FAIL")))
# not installed (and should be) or too low version
lib[(lib$version_status == -1), ]
# not installed (and should be)
lib[is.na(lib$Version.have), ]
# too low version
lib[(!is.na(lib$Version.have)) & (lib$version_status == -1), ]
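The same filters can be tried offline on a toy table. The data frame below is invented for illustration; it only borrows the column names used above and assumes that a version_status of -1 marks a package whose installed version is missing or lower than required.

```r
# Toy stand-in for a lib_validate() result (all values are made up)
lib_toy <- data.frame(
  Package = c("pkgA", "pkgB", "pkgD"),
  Version.expected.min = c("1.0.0", "2.1.0", "0.5.0"),
  Version.have = c("1.2.0", "2.0.0", NA),
  version_status = c(1, -1, -1),
  stringsAsFactors = FALSE
)

# not installed (and should be)
missing_pkgs <- lib_toy[is.na(lib_toy$Version.have), ]

# installed, but with too low a version
too_low <- lib_toy[!is.na(lib_toy$Version.have) & lib_toy$version_status == -1, ]

missing_pkgs$Package
too_low$Package
```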
Packages with at least one CRAN server reporting ERROR or FAIL:
red <- lib[(!is.na(lib$checkred)) & (lib$checkred == TRUE), ]
nrow(red)
head(red)
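The !is.na() guard in this filter is not decoration. A logical index that contains NA makes a data frame return an extra row filled with NA, as this small base R sketch with invented data shows.

```r
# Toy table: the check status is unknown (NA) for pkgB
checks <- data.frame(
  Package = c("pkgA", "pkgB", "pkgC"),
  checkred = c(TRUE, NA, FALSE),
  stringsAsFactors = FALSE
)

# Without the guard, the NA in the index produces an all-NA row
unguarded <- checks[checks$checkred == TRUE, ]

# With the guard, only the rows that are really TRUE remain
guarded <- checks[!is.na(checks$checkred) & checks$checkred == TRUE, ]

nrow(unguarded)
nrow(guarded)
```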
Packages that are not a dependency (default c("Depends", "Imports", "LinkingTo")) of any other package:
lib[is.na(lib$Version.expected.min), ]
Non-CRAN packages:
lib[lib$cran == FALSE, ]
Packages that are not at their newest version:
lib[(!is.na(lib$newest)) & (lib$newest == FALSE), ]
The core idea behind the function comes from proper processing of the installed.packages function result.
# aggregate function is needed as we could have different versions
# installed under different `.libPaths()`.
installed_packages_unique <- stats::aggregate(
  installed.packages()[, c("Version", "Depends", "Imports", "LinkingTo")],
  list(Package = installed.packages()[, "Package"]),
  function(x) x[1]
)
# installed_descriptions function transforms direct dependencies DESCRIPTION file fields
# installed_packages_unique[, c("Depends", "Imports", "LinkingTo")]
# to the two column data.frame with Package name
# and minimum required Version i.e. max(version1, version2 , ...).
installed_descriptions <- pacs:::installed_descriptions(
  lib.loc = .libPaths(),
  fields = c("Depends", "Imports", "LinkingTo")
)
merge(
  installed_descriptions,
  installed_packages_unique[, c("Package", "Version")],
  by = "Package",
  all = TRUE,
  suffixes = c(".expected.min", ".have")
)
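To see what the aggregate and merge steps do without touching a real library, here is a toy version in base R. The package names and versions are invented, and the required table simply stands in for the output of the internal installed_descriptions helper.

```r
# Toy "installed" table: pkgA is present in two library paths
installed <- data.frame(
  Package = c("pkgA", "pkgA", "pkgB"),
  Version = c("1.2.0", "1.0.0", "2.0.0"),
  stringsAsFactors = FALSE
)

# Keep the first occurrence per package (the first library path wins)
installed_unique <- stats::aggregate(
  installed[, "Version", drop = FALSE],
  list(Package = installed$Package),
  function(x) x[1]
)

# Toy "required minimum versions" table (stand-in for installed_descriptions)
required <- data.frame(
  Package = c("pkgB", "pkgC"),
  Version = c("2.1.0", "0.5.0"),
  stringsAsFactors = FALSE
)

# Full outer join: NA on either side signals "not required" or "not installed"
res <- merge(
  required, installed_unique,
  by = "Package", all = TRUE,
  suffixes = c(".expected.min", ".have")
)
res
```

In the result, pkgA has no expected minimum (nothing depends on it), pkgB is installed below its required minimum, and pkgC is required but missing.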
When a project is based on renv and all needed dependencies are installed in the renv directory, we mostly want to validate only the isolated renv library.
In newer renv versions, .libPaths() contains the main library path too (the renv library and the main library).
Please remember to limit the library path when using pacs::lib_validate, so that the validation covers only the renv library.
# renv::init()
pacs::lib_validate(lib.loc = .libPaths()[1])
Warning: at least the rsconnect package (and its packrat-related dependencies) might still be missing from the renv library.
There is a way to validate the renv lock file in the same way the local library or packages are validated.
# a path or url
url <- "https://raw.githubusercontent.com/Polkas/pacs/master/tests/testthat/files/renv_test.lock"
pacs::lock_validate(url)
pacs::lock_validate(
  url,
  checkred = list(scope = c("ERROR", "FAIL"), flavors = NULL)
)
pacs::lock_validate(
  url,
  lifeduration = TRUE,
  checkred = list(scope = c("ERROR", "FAIL"), flavors = NULL)
)
checked_packages was built to extend the .packages family of functions, like utils::installed.packages() and utils::available.packages().
pacs::checked_packages retrieves all current package checks from the CRAN webpage.
pacs::checked_packages()
Use pacs::pac_checkpage("dplyr") to get the check page for a single package. However, pacs::checked_packages() will be more efficient for many packages. Remember that the pacs::checked_packages() result is cached after the first invocation.
We can determine whether a specific package version lived for more than 14 days (or any other limit in days). If it did not, we might assume that something in it needed to be fixed, as it had to be updated quickly.
For example, dplyr version "0.8.0" seems to be a broken release; we can find out that it was published for only 1 day.
pacs::pac_lifeduration("dplyr", "0.8.0")
With a 14-day limit we get a proper health status. We are sure about this state, as this is not the newest release. For the newest release we also check whether there are any red messages on the CRAN check pages, as specified with the scope argument.
pacs::pac_health("dplyr", version = "0.8.0", limit = 14)
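The life duration rule itself is simple date arithmetic. The sketch below uses invented release dates and a hypothetical limit variable; it only illustrates the "lived more than limit days" idea and is not how pacs obtains the dates.

```r
# Invented dates: a release and the release that replaced it one day later
released   <- as.Date("2020-01-10")
superseded <- as.Date("2020-01-11")

# How long the version was the current one
life_days <- as.numeric(difftime(superseded, released, units = "days"))

# A version is treated as healthy only if it lived longer than the limit
limit <- 14
healthy <- life_days > limit

life_days
healthy
```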
For the newest package version, we check the CRAN check page too; the scope can be adjusted.
pacs::pac_health("dplyr", limit = 14, scope = c("ERROR", "FAIL", "WARN"))
pacs::pac_health("dplyr", limit = 14, scope = c("ERROR", "FAIL", "WARN"), flavors = pacs::cran_flavors()$Flavor[1])
withr package
The withr package is recommended for the isolated download process.
We can use a temporary library path (withr::with_temp_libpaths) to check whether the process works as expected.
Checking which packages need to be installed (or optionally updated) alongside a specific package, with the remotes package. The full list, including packages that are already installed, can be obtained with pacs::pac_deps_user.
remotes::package_deps("keras")
pacs::pac_deps_user("pacs")
Isolated download of a package and the validation.
# a restart of the R session might be needed
withr::with_temp_libpaths({
  install.packages("keras")
  pacs::lib_validate()
})
# a restart of the R session might be needed
withr::with_temp_libpaths({
  install.packages("keras")
  pacs::pac_validate("keras")
})
Using the CRAN website to get the package version(s) available at a specific Date or within a Date interval.
pacs::pac_timemachine("dplyr")
pacs::pac_timemachine("dplyr", version = "0.8.0")
pacs::pac_timemachine("dplyr", at = as.Date("2017-02-02"))
pacs::pac_timemachine("dplyr", from = as.Date("2017-02-02"), to = as.Date("2018-04-02"))
pacs::pac_timemachine("dplyr", at = Sys.Date())
pacs::pac_timemachine("tidyr", from = as.Date("2020-06-01"), to = Sys.Date())
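The idea of looking up versions by date can be sketched on a toy release table. Everything here is invented for illustration (the table, the dates, and the helpers versions_between and versions_at); the sketch assumes a version is live from its release date until the next release appears.

```r
# Toy release history (made-up versions and dates)
releases <- data.frame(
  Version = c("1.0.0", "1.1.0", "2.0.0"),
  Released = as.Date(c("2017-01-01", "2017-06-01", "2018-03-01")),
  stringsAsFactors = FALSE
)
# A version is archived when the next one is released; the newest has NA
releases$Archived <- c(releases$Released[-1], as.Date(NA))

# Hypothetical helper: versions whose lifetime overlaps [from, to]
versions_between <- function(from, to) {
  live <- releases$Released <= to &
    (is.na(releases$Archived) | releases$Archived > from)
  releases$Version[live]
}

# Hypothetical helper: the version live at a single date
versions_at <- function(at) versions_between(at, at)

versions_at(as.Date("2017-02-02"))
versions_between(as.Date("2017-02-02"), as.Date("2018-04-02"))
```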
One of the main functionalities is getting versions for all package dependencies. Versions might come from installed packages or from DESCRIPTION files.
Use pac_deps for extremely fast retrieval of package dependencies; package versions might come from the installed ones or from DESCRIPTION files (the required minimum).
By default, dependencies are shown recursively (recursive = TRUE).
# Providing more than tools::package_dependencies and packrat:::recursivePackageVersion
# pacs::pac_deps is providing the min required version for each package
# Use it to answer what we should have
res <- pacs::pac_deps("shiny", description_v = TRUE)
res
attributes(res)
Package dependencies with versions from DESCRIPTION files.
pacs::pac_deps("shiny", description_v = TRUE)
Remote (newest CRAN) package dependencies with versions.
pacs::pac_deps("shiny", local = FALSE)
Raw dependencies from DESCRIPTION file.
These are the same dependencies that the install.packages function needs: the Depends/Imports/LinkingTo DESCRIPTION fields dependencies, recursively.
pacs::pac_deps_user can be used to check them.
pacs::pac_deps("memoise", description_v = TRUE, recursive = FALSE, local = FALSE)
# or
pacs::pac_deps_user("memoise")
The fields argument is used to change the scope of exploration.
The fields argument is equal to c("Depends", "Imports", "LinkingTo") by default, as these are the dependencies installed when the install.packages function is used.
When the fields argument is extended, the number of dependencies grows.
Remember that we are looking for dependencies recursively by default.
At the time of writing, the first call returns 3 dependencies, whereas the second returns over one thousand. It should be clear that extending the scope (recursively) with the "Suggests" field makes the number of dependencies explode.
nrow(pacs::pac_deps("memoise", fields = c("Depends", "Imports", "LinkingTo")))
nrow(pacs::pac_deps("memoise", fields = c("Depends", "Imports", "LinkingTo", "Suggests")))
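Why the count grows so quickly can be seen on a toy dependency graph. The graph and the collect_deps helper below are invented for this sketch; it is a plain breadth-first traversal, not the pacs implementation.

```r
# Toy dependency graph: each package lists its Imports and Suggests
toy_deps <- list(
  pkgA = list(Imports = "pkgB", Suggests = "pkgC"),
  pkgB = list(Imports = character(0), Suggests = "pkgD"),
  pkgC = list(Imports = "pkgE", Suggests = character(0)),
  pkgD = list(Imports = character(0), Suggests = character(0)),
  pkgE = list(Imports = character(0), Suggests = character(0))
)

# Hypothetical helper: recursive dependencies over the chosen fields
collect_deps <- function(pkg, fields) {
  seen <- character(0)
  queue <- pkg
  while (length(queue) > 0) {
    current <- queue[1]
    queue <- queue[-1]
    direct <- unlist(toy_deps[[current]][fields], use.names = FALSE)
    fresh <- setdiff(direct, c(seen, pkg))  # skip already visited packages
    seen <- c(seen, fresh)
    queue <- c(queue, fresh)
  }
  sort(seen)
}

collect_deps("pkgA", "Imports")
collect_deps("pkgA", c("Imports", "Suggests"))
```

Following Suggests at every level pulls in the suggested packages of suggested packages, which is why even this tiny graph goes from one dependency to four.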
The developer dependencies are the ones needed when, e.g., R CMD check is run.
These are the Depends/Imports/LinkingTo/Suggests DESCRIPTION fields dependencies and, for them, Depends/Imports/LinkingTo recursively.
pacs::pac_deps_dev can be used to check them.
Naturally, the list is much longer than the one for pacs::pac_deps_user.
pac_deps_dev("memoise")
Dependencies for a certain (archived) version; this might take some time.
pacs::pac_deps_timemachine("dplyr", version = "0.8.1")
Reading raw dcf DESCRIPTION files scraped from the GitHub CRAN mirror or, if that does not work, from the CRAN website.
pacs::pac_description("dplyr")
pacs::pac_description("dplyr", version = "0.8.0")
pacs::pac_description("dplyr", at = as.Date("2019-01-01"))
Reading raw NAMESPACE files scraped from the GitHub CRAN mirror or, if that does not work, from the CRAN website.
pacs::pac_namespace("dplyr")
pacs::pac_namespace("dplyr", version = "0.8.0")
pacs::pac_namespace("dplyr", at = as.Date("2019-01-01"))
Comparing DESCRIPTION file dependencies between local and the newest package. We will get duplicated columns if the local version is the newest one.
pacs::pac_compare_versions("shiny")
Comparing DESCRIPTION file dependencies between package versions.
pacs::pac_compare_versions("shiny", "1.4.0", "1.5.0")
pacs::pac_compare_versions("shiny", "1.4.0", "1.6.0")
# to newest release
pacs::pac_compare_versions("shiny", "1.4.0")
Comparing NAMESPACE between local and the newest package.
pacs::pac_compare_namespace("shiny")
Comparing NAMESPACE between package versions.
pacs::pac_compare_namespace("shiny", "1.0.0", "1.5.0")
# e.g. only exports
pacs::pac_compare_namespace("shiny", "1.0.0", "1.5.0")$exports
# to newest release
pacs::pac_compare_namespace("shiny", "1.0.0")
Take into account that package sizes are specific to your local system (Sys.info()).
Installation with install.packages and with some devtools functions might result in different package sizes.
If you do not want to install anything in your current library (.libPaths()) but still want to inspect a package size, the withr package is recommended: withr::with_temp_libpaths isolates the download process.
# a restart of the R session might be needed
withr::with_temp_libpaths({
  install.packages("devtools")
  cat(pacs::pac_true_size("devtools") / 10**6, "MB", "\n")
})
Installation in your main library.
# if not already installed
install.packages("devtools")
Size of the devtools package:
cat(pacs::pac_size("devtools") / 10**6, "MB", "\n")
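As a rough picture of what a size in bytes means here, the sketch below sums the file sizes under a directory in base R. It is only an assumption that an installed package's size can be approximated this way; dir_size is a hypothetical helper and the demo directory is a temporary one created for the example.

```r
# Hypothetical helper: total size in bytes of all files under a directory
dir_size <- function(path) {
  files <- list.files(path, recursive = TRUE, full.names = TRUE, all.files = TRUE)
  sum(file.size(files))
}

# Build a small temporary directory with files of known size
demo_dir <- tempfile("size_demo")
dir.create(demo_dir)
writeBin(raw(1000), file.path(demo_dir, "a.bin"))
writeBin(raw(500), file.path(demo_dir, "b.bin"))

size_bytes <- dir_size(demo_dir)
cat(size_bytes / 10^6, "MB", "\n")
```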
True size of the package, taking into account its dependencies.
At the time of writing, it is 113MB for devtools without base packages (macOS arm64).
cat(pacs::pac_true_size("devtools") / 10**6, "MB", "\n")
A reasonable assumption might be to count only the dependencies that are not used by any other package.
We can use the exclude_joint argument to limit them.
However, it is hard to say whether your local installation is a reasonable proxy for an average user.
# exclude packages if at least one other package uses them too
cat(pacs::pac_true_size("devtools", exclude_joint = 1L) / 10**6, "MB", "\n")
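The arithmetic behind this kind of exclusion can be sketched with invented numbers. The sizes, the other_users counts, and the interpretation of the threshold are all assumptions made for illustration; they are not taken from the pacs implementation.

```r
# Invented sizes in bytes: the package itself and two dependencies
sizes <- c(pkgX = 5e6, depA = 2e6, depB = 10e6)

# Invented counts: how many OTHER installed packages use each dependency
other_users <- c(depA = 0L, depB = 3L)

# Assumed rule: drop a dependency once at least this many others use it
exclude_joint <- 1L
kept <- names(other_users)[other_users < exclude_joint]

true_size <- sum(sizes[c("pkgX", kept)])
cat(true_size / 10^6, "MB", "\n")
```

Here depB is shared with three other packages, so it is left out and only pkgX and depA are counted.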
We can check which of the direct dependencies are the heaviest:
pac_deps_heavy("devtools")
The dependencies of a shiny app are checked.
By default, the c("Depends", "Imports", "LinkingTo") DESCRIPTION file fields are checked recursively for each package recognized by the renv::dependencies function.
The required dependencies have to be installed in the local library.
pacs::app_deps(system.file("examples/04_mpg", package = "shiny"), description_v = TRUE)
pacs::app_deps(system.file("examples/04_mpg", package = "shiny"), description_v = TRUE, local = FALSE)
When we want to check only direct dependencies, the recursive argument has to be set to FALSE. In that case you could also use the renv::dependencies function directly.
pacs::app_deps(system.file("examples/04_mpg", package = "shiny"), recursive = FALSE)
The size of a shiny app is the sum of its dependencies and the app directory. The app dependencies (packages) are checked recursively.
cat(pacs::app_size(system.file("examples/04_mpg", package = "shiny")) / 10**6, "MB")
Useful functions to get a list of base packages. You might want to exclude them from final results.
pacs::pacs_base()
# start up loaded, base packages
pacs::pacs_base(startup = TRUE)