knitr::opts_chunk$set (
    collapse = TRUE,
    warning = TRUE,
    message = TRUE,
    width = 120,
    comment = "#>",
    fig.retina = 2,
    fig.path = "README-"
)
options (repos = c (
    ropenscireviewtools = "https://ropensci-review-tools.r-universe.dev",
    CRAN = "https://cloud.r-project.org"
))
library (pkgcheck)

This vignette describes how to modify or extend the existing suite of checks
implemented by `pkgcheck`. Each of the internal checks is defined in a separate
file in the `R` directory of this package with the prefix of `check-` (or
`checks-` for files which define multiple, related checks). Checks require only
two main functions, the first defining the check itself, and the second
defining `summary` and `print` methods based on the result of the first
function. The check functions must have a prefix of `pkgchk_`, and the
functions defining output methods must have a prefix of `output_pkgchk_`. These
two kinds of function are described in the following two sections.

Both of these functions must also accept a single input parameter of a
`pkgcheck` object, by convention named `checks`. This object is a list of four
main items:

- `pkg`, which summarises data extracted from `pkgstats::pkgstats()`, and
  includes essential information on the package being checked.
- `info`, which contains information used in checks, including `info$git`
  detailing git repository information, `info$pkgstats` containing a summary
  of a few statistics generated from `pkgstats::pkgstats()`, along with
  statistical comparisons against distributions from all current CRAN
  packages, an `info$network_file` specifying the local path to a `vis.js`
  visualisation of the function call network of the package, and an
  `info$badges` item containing information from GitHub workflows and
  associated badges, where available.
- `checks`, which contains a list of all objects returned from all
  `pkgchk_...()` functions, which are used as input to `output_pkgchk_...()`
  functions.
- `meta`, containing a named character vector of versions of the core packages
  used in `pkgcheck`.

`pkgcheck` objects generally also include a fifth item, `goodpractice`,
containing the results of `goodpractice` checks. The `checks` object passed to
each `pkgchk_...()` function contains all information in the `pkg`, `info`,
`meta`, and (optionally) `goodpractice` items. Checks may use any of this
information, or even add additional information as demonstrated below. The
`checks$checks` list represents the output of check functions, and may not be
used in any way within `pkgchk_...()` functions.

The structure of a full `pkgcheck` object is shown by the output of applying
`pkgcheck` to a package generated with the `srr` function
`srr_stats_pkg_skeleton()`, with `goodpractice = FALSE` to suppress that part
of the results.

here <- rprojroot::find_root (rprojroot::is_r_package)
check_file <- file.path (here, "vignettes", "checks.Rds")
d <- srr::srr_stats_pkg_skeleton (pkg_name = "dummypkg")
roxygen2::roxygenise (d)
checks <- pkgcheck::pkgcheck (d, goodpractice = FALSE)
saveRDS (checks, check_file)
str (readRDS (check_file))

An example is the check for whether a package has a citation, defined in
`R/check-has-citation.R`:

knitr::read_chunk ("../R/check-has-citation.R")
knitr::read_chunk ("../R/check-scrap.R")

This check is particularly simple, because a `"CITATION"` file must have
exactly that name, and must be in the `inst` sub-directory.
The `pkgchk_has_citation()` function returns a simple logical of `TRUE` if the
expected `"CITATION"` file is present, otherwise it returns `FALSE`. This
function, and all functions beginning with the prefix `pkgchk_`, will be
automatically called by the main `pkgcheck()` function, and the value stored in
`checks$checks$has_citation`. The name of the item within the `checks$checks`
list is the name of the function with the `pkgchk_` prefix removed.
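
The citation check and the naming rule can be illustrated with a minimal sketch in base R. This is not the package's own source: the `checks$pkg$path` field, the `_sketch` function name, and the mock `checks` list are assumptions made purely for illustration.

```r
# Hypothetical sketch of a citation check. `checks$pkg$path` is an assumed
# location of the package root, not a field confirmed by this vignette.
pkgchk_has_citation_sketch <- function (checks) {
    file.exists (file.path (checks$pkg$path, "inst", "CITATION"))
}

# Mock package directory containing "inst/CITATION"
pkg_dir <- tempfile ("citationpkg")
dir.create (file.path (pkg_dir, "inst"), recursive = TRUE)
file.create (file.path (pkg_dir, "inst", "CITATION"))

# Mock stand-in for a `pkgcheck` object
checks <- list (pkg = list (path = pkg_dir), checks = list ())

# The item name is the function name with the "pkgchk_" prefix removed
item <- sub ("^pkgchk_", "", "pkgchk_has_citation")
checks$checks [[item]] <- pkgchk_has_citation_sketch (checks)
```

After running this, `checks$checks$has_citation` holds the logical result of the check, mirroring how the result of each check is stored under its un-prefixed name.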
A more complicated example is the function to check whether a package contains
files which should not be there -- internally called "scrap" files. The check
function itself, defined in `R/check-scrap.R`, checks for the presence of files
matching an internally-defined list including files used to locally cache
folder thumbnails such as `".DS_Store"` or `"Thumbs.db"`. The function returns
a character vector of the names of any "scrap" files which can be used by the
`print` method to provide details of files which should be removed. This
illustrates the first general principle of these check functions; that,
::: {.alert .alert-info}
- Any information needed when summarising or printing the check result should
  be returned from the main check function.
:::
A second important principle is that,
::: {.alert .alert-info}
- Check functions should never return `NULL`, rather should always return an
  empty vector (such as `integer(0)`).
:::
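
Both principles can be seen in a small base R sketch of a "scrap" file check. The function name, the two-item list of scrap file names, and the `checks$pkg$path` field are assumptions for illustration, not the package's implementation. The final lines show one practical reason, offered here as illustration rather than as the package's stated rationale, for avoiding `NULL`: assigning `NULL` to a list item removes it, whereas an empty vector is stored.

```r
# Hypothetical sketch of a "scrap" file check. The list of file names and the
# `checks$pkg$path` field are assumptions made for illustration only.
pkgchk_has_scrap_sketch <- function (checks) {
    scrap_names <- c (".DS_Store", "Thumbs.db")
    f <- list.files (checks$pkg$path, all.files = TRUE, recursive = TRUE)
    # Subsetting a character vector gives character(0), never NULL, when
    # nothing matches
    f [basename (f) %in% scrap_names]
}

# Mock package directory with one legitimate file
pkg_dir <- tempfile ("scrappkg")
dir.create (file.path (pkg_dir, "R"), recursive = TRUE)
file.create (file.path (pkg_dir, "R", "fn.R"))
checks <- list (pkg = list (path = pkg_dir), checks = list ())

clean <- pkgchk_has_scrap_sketch (checks) # no scrap files yet

file.create (file.path (pkg_dir, ".DS_Store"))
dirty <- pkgchk_has_scrap_sketch (checks) # names of scrap files found

# Assigning NULL removes a list item entirely, whereas an empty vector
# keeps the item in place
res <- list ()
res$has_scrap <- NULL
names (res) # nothing was stored
res$has_scrap <- clean
names (res) # the item now exists, holding an empty vector
```

The returned file names are exactly the information that a later `print` method would need in order to list the files to be removed.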
The following section considers how these return values from check functions
are converted to `summary` and `print` output.

All `output_pkgchk_...()` functions must also accept the single input parameter
of `checks`, in which the `checks$checks` sub-list will already have been
populated by calling all `pkgchk_...()` functions described in the previous
section. The `pkgchk_has_citation()` function will create an entry of
`checks$checks$has_citation` which contains the binary flag indicating whether
or not a `"CITATION"` file is present. Similarly, the `pkgchk_has_scrap()`
function will create `checks$checks$has_scrap` which will contain names of any
scrap files present, and a length-zero vector otherwise.
The `output_pkgchk_has_citation()` function then looks like this:
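
The function's source is not reproduced in this text. The following is a sketch reconstructed from the description in this vignette, not a copy of the package source; in particular, the wording of the commented-out summary text is an assumption.

```r
# Sketch of the output function for the citation check, reconstructed from
# the vignette's description rather than copied from the package.
output_pkgchk_has_citation <- function (checks) {
    out <- list (
        check_pass = checks$checks$has_citation,
        summary = "",
        print = ""
    )
    # Illustrative summary text (assumed wording), left commented out so that
    # this check produces no summary output:
    # out$summary <- paste0 (
    #     ifelse (out$check_pass, "has", "does not have"),
    #     " a 'CITATION' file."
    # )
    return (out)
}

# Mock `checks` object in which the check function has already been run
checks <- list (checks = list (has_citation = TRUE))
out <- output_pkgchk_has_citation (checks)
```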
The first lines are common to all `output_pkgchk_...()` functions, and define
the generic return object. This object must be a list with the following three
items:

- `check_pass` as binary flag indicating whether or not a check was passed;
- `summary` containing text used to generate the `summary` output; and
- `print` containing information used to generate the `print` output, itself a
  `list` of the following items:
    - `msg_pre` to display at the start of the `print` result;
    - `obj` to be printed, such as a vector of values, or a `data.frame`.
    - `msg_post` to display at the end of the `print` result following the
      `obj`.

The `summary` and `print` methods may be suppressed by assigning values of
`""`.

The above example of `output_pkgchk_has_citation()` has `print = ""`, and so no
information from this check will appear as output of the `print` method. The
`summary` field is commented-out in the current version, but left to illustrate
here that it has a value that is specified for both `TRUE` and `FALSE` values
of `check_pass`, via an `ifelse` statement. The value is determined by the
result of the main `pkgchk_has_citation()` call, and is converted into a green
tick if `TRUE`, or a red cross if `FALSE`.

Checks for which `print` information is desired require a non-empty `print`
item, as in the `output_pkgchk_has_scrap()` function:
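
The function's source is not reproduced in this text. A sketch consistent with the surrounding description might look like the following; the summary wording and the empty `msg_post` value are assumptions, and the code should not be read as the package's actual implementation.

```r
# Sketch of an output function with a non-empty `print` item, reconstructed
# from the vignette's description.
output_pkgchk_has_scrap <- function (checks) {
    out <- list (
        check_pass = length (checks$checks$has_scrap) == 0L,
        summary = "",
        print = ""
    )
    if (!out$check_pass) {
        out$summary <- "Package contains unexpected files." # assumed wording
        out$print <- list (
            msg_pre = "Package contains the following unexpected files:",
            obj = checks$checks$has_scrap,
            msg_post = character (0) # assumed: nothing to show afterwards
        )
    }
    return (out)
}

# Mock results for a failing and a passing package
checks <- list (checks = list (has_scrap = c ("a", "b")))
out_fail <- output_pkgchk_has_scrap (checks)
checks$checks$has_scrap <- character (0)
out_pass <- output_pkgchk_has_scrap (checks)
```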
In this case, both `summary` and `print` methods are only triggered if
`(!out$check_pass)` -- so only if the check fails. The `print` method generates
the heading specified in `out$print$msg_pre`, with any vector-valued objects
stored in the corresponding `obj` list item displayed as formatted lists.
A package with "scrap" files, `"a"` and `"b"`, would thus have
`out$print$obj <- c ("a", "b")`, and when printed would look like this:

cli::cli_alert_danger ("Package contains the following unexpected files:")
cli::cli_ul ()
cli::cli_li (c ("a", "b"))
cli::cli_end ()

This formatting is also translated into corresponding markdown and HTML
formatting in the `checks_to_markdown()` function.

The design of these `pkgchk_` and `output_pkgchk_` functions aims to make the
package readily extensible, and we welcome discussions about developing new
checks. The primary criterion for new package-internal checks is that they must
be of very general applicability, in that they should check for a condition
that almost every package should or should not meet.
The package also has a mechanism to easily incorporate more specific, locally-defined checks, as explored in the following section.
The main `pkgcheck()` function has an additional parameter, `extra_env`, which
specifies,

> Additional environments from which to collate checks. Other package names
> may be appended using `c`, as in `c(.GlobalEnv, "mypkg")`.

This allows specific checks to be defined locally, and run by passing the name
of the environment in which those checks are defined in this parameter. This
section illustrates the process using the bundled "tarball" (that is, `.tar.gz`
file) of one version of the `pkgstats` package included with that package.

f <- system.file ("extdata", "pkgstats_9.9.tar.gz", package = "pkgstats")
path <- pkgstats::extract_tarball (f)
checks <- pkgcheck (path)
summary (checks)

cli::cli_h1 ("pkgstats 9.9")
message ("")
s <- c (
    "- :heavy_check_mark: Package name is available",
    "- :heavy_multiplication_x: does not have a 'codemeta.json' file.",
    "- :heavy_multiplication_x: does not have a 'contributing' file.",
    "- :heavy_check_mark: uses 'roxygen2'.",
    "- :heavy_check_mark: 'DESCRIPTION' has a URL field.",
    "- :heavy_check_mark: 'DESCRIPTION' has a BugReports field.",
    "- :heavy_multiplication_x: Package has no HTML vignettes",
    "- :heavy_multiplication_x: These functions do not have examples: [pkgstats_from_archive].",
    "- :heavy_check_mark: Package has continuous integration checks.",
    "- :heavy_multiplication_x: Package coverage failed",
    "- :heavy_multiplication_x: R CMD check found 1 error.",
    "- :heavy_check_mark: R CMD check found no warnings."
)
for (i in s) {
    msg <- strsplit (i, "(mark|\\_x):\\s+") [[1]] [2]
    if (grepl ("heavy_check_mark", i)) {
        cli::cli_alert_success (msg)
    } else {
        cli::cli_alert_danger (msg)
    }
}
message ("")
cli::cli_alert_info ("Current status:")
cli::cli_alert_danger ("This package is not ready to be submitted.")

Let's now presume I have a reputation in the R community for all of my packages
starting with "aa", to ensure they are always listed first. This section
demonstrates how to implement a check that only passes if the first two letters
of the package name are "aa". The first step described above is to define the
check itself via a function prefixed with `pkgchk_`. The easiest approach would
be for the `pkgchk_` function to directly check the name, and return a
logical flag indicating whether or not the name starts with "aa". The resultant
`summary` and `print` methods can, however, only use the information provided
by the initial `pkgchk_` function. That means if we want to print the actual
name in the result of either of those functions, to show that it indeed does
not form the desired pattern, we need to return that information. The check
function is then simply:

pkgchk_starts_with_aa <- function (checks) {
    checks$pkg$name
}

We then need to define the output functions:
output_pkgchk_starts_with_aa <- function (checks) {
    out <- list (
        check_pass = grepl ("^aa", checks$checks$starts_with_aa, ignore.case = TRUE),
        summary = "",
        print = ""
    )
    # "NOT " carries its own trailing space, so that a passing result does not
    # contain a double space
    out$summary <- paste0 (
        "Package name [", checks$checks$starts_with_aa, "] does ",
        ifelse (out$check_pass, "", "NOT "), "start with 'aa'"
    )
    return (out)
}

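
Both functions can be tried out without running the full `pkgcheck()` pipeline, by applying them to a mock list that imitates the relevant parts of a `pkgcheck` object. The sketch below restates the two functions so that the block runs on its own; the `run_aa_check()` helper and the mock list are hypothetical, and only the `pkg$name` and `checks` items are imitated.

```r
pkgchk_starts_with_aa <- function (checks) {
    checks$pkg$name
}

output_pkgchk_starts_with_aa <- function (checks) {
    out <- list (
        check_pass = grepl ("^aa", checks$checks$starts_with_aa, ignore.case = TRUE),
        summary = "",
        print = ""
    )
    out$summary <- paste0 (
        "Package name [", checks$checks$starts_with_aa, "] does ",
        ifelse (out$check_pass, "", "NOT "), "start with 'aa'"
    )
    return (out)
}

# Hypothetical helper: build a mock `checks` object, run the check function,
# store its result under the un-prefixed name, then run the output function.
run_aa_check <- function (pkg_name) {
    checks <- list (pkg = list (name = pkg_name), checks = list ())
    checks$checks$starts_with_aa <- pkgchk_starts_with_aa (checks)
    output_pkgchk_starts_with_aa (checks)
}

res_fail <- run_aa_check ("pkgstats")
res_pass <- run_aa_check ("aardvark")
res_fail$summary # "Package name [pkgstats] does NOT start with 'aa'"
res_pass$summary # "Package name [aardvark] does start with 'aa'"
```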
If we simply define those functions in the global workspace of our current R
session, calling `pkgcheck()` again will automatically detect those checks and
include them in our output:

cli::cli_h1 ("pkgstats 9.9")
message ("")
s <- c (
    "- :heavy_check_mark: Package name is available",
    "- :heavy_multiplication_x: does not have a 'codemeta.json' file.",
    "- :heavy_multiplication_x: does not have a 'contributing' file.",
    "- :heavy_check_mark: uses 'roxygen2'.",
    "- :heavy_check_mark: 'DESCRIPTION' has a URL field.",
    "- :heavy_check_mark: 'DESCRIPTION' has a BugReports field.",
    "- :heavy_multiplication_x: Package has no HTML vignettes",
    "- :heavy_multiplication_x: These functions do not have examples: [pkgstats_from_archive].",
    "- :heavy_check_mark: Package has continuous integration checks.",
    "- :heavy_multiplication_x: Package coverage failed",
    "- :heavy_multiplication_x: Package name [pkgstats] does NOT start with 'aa'",
    "- :heavy_multiplication_x: R CMD check found 1 error.",
    "- :heavy_check_mark: R CMD check found no warnings."
)
for (i in s) {
    msg <- strsplit (i, "(mark|\\_x):\\s+") [[1]] [2]
    if (grepl ("heavy_check_mark", i)) {
        cli::cli_alert_success (msg)
    } else {
        cli::cli_alert_danger (msg)
    }
}
message ("")
cli::cli_alert_info ("Current status:")
cli::cli_alert_danger ("This package is not ready to be submitted.")

Customised personal checks can be incorporated by defining them in a local
package, loading that into the workspace, and passing the name of the package
to the `extra_env` parameter.
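
The idea of collating checks from an environment by their prefix can be illustrated in base R. The `collate_check_fns()` helper below is hypothetical and is not how `pkgcheck` itself is implemented; it only shows that functions named with a `pkgchk_` prefix can be found in any environment and their results stored under the un-prefixed name.

```r
# Hypothetical helper, NOT pkgcheck's implementation: find all functions in an
# environment whose names start with "pkgchk_".
collate_check_fns <- function (env) {
    nms <- ls (envir = env, pattern = "^pkgchk_")
    is_fn <- vapply (
        nms,
        function (n) is.function (get (n, envir = env)),
        logical (1)
    )
    mget (nms [is_fn], envir = env)
}

# An environment holding one check function and two objects to be ignored
my_env <- new.env ()
assign ("pkgchk_starts_with_aa", function (checks) checks$pkg$name, envir = my_env)
assign ("not_a_check", function (x) x, envir = my_env)
assign ("pkgchk_not_a_function", 1L, envir = my_env)

fns <- collate_check_fns (my_env)

# Run each collated check on a mock `checks` object, storing each result under
# the function name with the prefix removed
checks <- list (pkg = list (name = "pkgstats"), checks = list ())
for (f in names (fns)) {
    checks$checks [[sub ("^pkgchk_", "", f)]] <- fns [[f]] (checks)
}
```

Only `pkgchk_starts_with_aa` is collated here, and its result ends up in `checks$checks$starts_with_aa`.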

`pkgcheck` Checks (for `pkgcheck` developers)

New checks can be added to this package by creating new files in the `/R`
directory prefixed with `check-`, and including the two functions described
above (a check and an output function). The check name will then need to be
included in the `order_checks()` function in the `R/summarise-checks.R` file,
which determines the order of checks in the `summary` output. Checks which are
not defined in this ordering, including any defined via `extra_env` parameters,
appear after all of the standard checks, and prior to the `R CMD check`
results which always appear last. This order may only be modified by editing
the list in that function. The order of check results in the `print` method is
also hard-coded, defined in the main `print.pkgcheck` method.

As explicitly stated in that function, any new checks should also be included
in the `print` method just after the first reference to `"misc_checks"`, via an
additional line:

print_check_screen (x, "<name-of-new-check>", pkg_env)

The `print_check_screen()` function will then automatically activate the
`print` method of any new checks. This line should be added even if a new check
has no `print` method (as in the `starts_with_aa` example above), to provide an
explicit record of all internally-defined miscellaneous checks.

Finally, any new checks also need to be included in tests. The test suites run
on generic, mostly empty packages constructed with the
`srr::srr_stats_pkg_skeleton()` function, as in the main `test-pkgcheck.R` test
functions. Additional tests are also performed on the `pkgstats` tarball
illustrated above. The default results of any new checks will be automatically
tested by the existing test suite, but it is important to test all potential
results. The `test-extra-checks.R` file is the main location for these
additional tests, with lines in that file demonstrating how the main results
can be readily modified to reflect alternative outputs of check functions (such
as `pkgchk_has_scrap` and `pkgchk_obsolete_pkg_deps`). The output functions
defined as part of checks, including any new checks, do not need to be
explicitly tested, as the entire output is tested via `testthat` snapshots.
Snapshot results need to be updated to reflect any additional tests. Finally,
the `test-list-checks.R` file tests the total number of internally-defined
checks as `expect_length (ncks, ..)`. The number tested there also needs to be
incremented by one for each new check.