BiocCheck
BiocCheck encapsulates Bioconductor package guidelines and best practices, analyzing packages and reporting three categories of issues: ERROR, WARNING, and NOTE.
BiocCheck will continue past an ERROR, so a single run can report more than one, but it will exit with an error code if run from the OS command line.
Using BiocCheck
BiocCheck is meant to be run within R on a directory containing an R package, or on a source tarball (.tar.gz file):
library(BiocCheck)
BiocCheck("packageDirOrTarball")
BiocCheck takes options which can be seen by:
suppressPackageStartupMessages(library(BiocCheck))
usage()
Note that the --new-package option is turned on in the package builder attached to the Bioconductor package tracker, since this is almost always used to build new packages that have been submitted.
When should BiocCheck be run
Run BiocCheck after running R CMD check.
Note that BiocCheck is not a replacement for R CMD check; it is complementary. It should be run after R CMD check completes successfully.
BiocCheck can also be run via the Travis-CI (continuous integration) system. This service allows automatic testing of R packages in a controlled build environment.
Simply add the following line to your package's .travis.yml file:
bioc_check: true
Installing BiocCheck
BiocCheck should be installed as follows:
if (!"BiocManager" %in% rownames(installed.packages()))
    install.packages("BiocManager")
BiocManager::install("BiocCheck")
The package loading process attempts to install a script called BiocCheck (BiocCheck.bat on Windows) into the bin directory of your R installation. If it fails to do that (most likely due to insufficient permissions), it will tell you, saying something like:
Failed to copy the "script/BiocCheck" script to /Library/Frameworks/R.framework/Resources/bin. If you want to be able to run 'R CMD BiocCheck' you'll need to copy it yourself to a directory on your PATH, making sure it is executable. See the BiocCheck vignette for more information.
You can fix the problem by following these instructions (noting that R may live in a different directory on your system than what is shown above).
If you don't have permission to copy this file to the bin directory of your R installation, you can, as noted, copy it to any directory that's in your PATH. For assistance modifying your PATH, see this link (Windows) or this one (Mac/Unix).
If you manually copy this file to a directory in your PATH that is not your R bin directory, you'll continue to see the above message when (re-)installing BiocCheck, but you can safely ignore it.
Interpreting BiocCheck output
Actual BiocCheck output is shown below in bold.
Checking Package Dependencies...
Can be disabled with --no-check-dependencies.
Checking if other packages can import this one...
Import problems found by this check are reported as ERROR.
Checking to see if we understand object initialization....
Problems found by this check are reported as NOTE.
Checking for deprecated package usage...
Can be disabled with --no-check-deprecated.
At present, this looks to see whether your package has a dependency on the multicore package (ERROR).
Our recommendation is to use BiocParallel. Note that 'fork' clusters do not provide any gain from parallelizing code on Windows. Socket clusters work on all operating systems.
It also checks for dependencies on packages currently listed as deprecated in the release and devel versions of Bioconductor (ERROR).
Checking for remote package usage...
Can be disabled with --no-check-remotes.
Bioconductor only allows dependencies that are hosted on CRAN or Bioconductor. The use of Remotes: in the DESCRIPTION to specify a unique remote location is not allowed.
Can be disabled with --no-check-version-num and --no-check-R-ver.
Checking version number...
When checking a tarball, the version in the tarball name must match the Version: field in your DESCRIPTION file. If it doesn't, it usually means you did not build the tarball with R CMD build (ERROR).
New packages must have 99 as the 'y' version in the x.y.z versioning scheme (ERROR). Package versions starting with a non-zero value will get flagged with a warning: typical new package submissions start with a zero 'x' version (e.g., 0.99.*; WARNING). This is only done if the --new-package option is supplied. A non-zero 'x' will only be accepted if the package was pre-released or published with such a version.
The version number is also checked for validity (ERROR).
If you specify an R version in the Depends: field of your DESCRIPTION file, BiocCheck checks to make sure that the R version specified matches the version currently used by Bioconductor. This prevents the package from being used in earlier versions of R, which is not recommended and is a frequent cause of user confusion (WARNING).
For more information on package versions, see the Version Numbering HOWTO.
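The new-package version rules described above can be sketched with base R's package_version(). This is only an illustration under stated assumptions: the helper name check_new_package_version() is hypothetical and not part of BiocCheck, and the real check may apply further rules.

```r
# Hypothetical helper (not part of BiocCheck): classify a version string
# against the new-package rules described above.
check_new_package_version <- function(version) {
    # package_version() parses "x.y.z" into integer components
    parts <- unlist(package_version(version))
    if (length(parts) != 3L)
        return("ERROR: version is not of the form x.y.z")
    x <- parts[1L]
    y <- parts[2L]
    if (y != 99L)
        return("ERROR: 'y' version is not 99")
    if (x != 0L)
        return("WARNING: 'x' version is non-zero")
    "OK"
}

check_new_package_version("0.99.0")
check_new_package_version("1.99.0")
check_new_package_version("0.1.0")
```

The three calls exercise the accepted case, the non-zero 'x' warning, and the wrong 'y' error in turn.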
Can be disabled with --no-check-pkg-size and --no-check-file-size.
Checking package size
Checks that the package size meets Bioconductor requirements. The current package size limit is 5 MB for Software packages. Experiment Data and Annotation packages are excluded from this check. This check is only run if checking a source tarball (ERROR).
Checking individual file sizes
The current size limit for all individual files is 5 MB (WARNING).
It may be necessary to remove large files from your git history; see Remove Large Data Files and Clean Git Tree.
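To look for oversized files before running the check, something like the following base R sketch may help. The helper find_large_files() is hypothetical and not part of BiocCheck, and treating 5 MB as 5 * 1024^2 bytes is an assumption.

```r
# Hypothetical helper (not part of BiocCheck): list files larger than a limit.
# The default of 5 * 1024^2 bytes is an assumed reading of "5 MB".
find_large_files <- function(pkg_dir, limit_bytes = 5 * 1024^2) {
    files <- list.files(pkg_dir, recursive = TRUE, full.names = TRUE,
                        all.files = TRUE)
    sizes <- file.size(files)
    files[!is.na(sizes) & sizes > limit_bytes]
}

# Demonstration on a temporary directory, using a small limit
demo_dir <- tempfile("pkg")
dir.create(demo_dir)
writeLines("small", file.path(demo_dir, "small.txt"))
writeLines(strrep("x", 2000), file.path(demo_dir, "big.txt"))
large <- basename(find_large_files(demo_dir, limit_bytes = 1000))
large
```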
These checks can be disabled with the --no-check-bioc-views option, which might be useful when checking non-Bioconductor packages (since biocViews is a concept unique to Bioconductor).
Checking biocViews...
Checks that a biocViews field is present in the DESCRIPTION file (ERROR). Further checks on the listed terms are reported as ERROR or WARNING.
You can use the recommendBiocViews() function from biocViews to automatically suggest some biocViews for your package.
More information about biocViews is available in the Using biocViews HOWTO.
The Bioconductor Build System (BBS) is our nightly build system and it has certain requirements. Packages which don't meet these requirements can be silently skipped by BBS, so it's important to make sure that every package meets the requirements.
Can be disabled with --no-check-bbs.
Checking build system compatibility...
Several checks are made on the DESCRIPTION file. Most are reported as ERROR; three are threshold-based (WARNING if less than 50, WARNING if less than 20, NOTE if less than 3). Among them:
* The Package field of the DESCRIPTION file matches the directory or tarball name (ERROR).
* A Version field is present in the DESCRIPTION file (ERROR).
* There is an Authors@R field which resolves to a valid Maintainer (ERROR).
A valid Authors@R field consists of:
* A valid R object of class person.
* Only one person with the cre (creator) role.
* That person must have a syntactically valid email address.
* That person must have either family or given name defined.
* (optional) A syntactically valid ORCID ID; otherwise a NOTE is issued.
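The Authors@R rules listed above can be approximated in base R with utils::person objects. This is a rough sketch, not BiocCheck's implementation: validate_authors_at_r() is a hypothetical name, the email pattern is a deliberately simple assumption, and the ORCID rule is left out.

```r
# Hypothetical helper (not part of BiocCheck): evaluate an Authors@R string
# and apply the validity rules listed above. Only use on trusted input,
# since the string is evaluated as R code.
validate_authors_at_r <- function(authors_field) {
    people <- tryCatch(eval(parse(text = authors_field)),
                       error = function(e) NULL)
    if (!inherits(people, "person"))
        return("not a valid person object")
    is_cre <- vapply(seq_along(people),
                     function(i) "cre" %in% people[i]$role, logical(1))
    if (sum(is_cre) != 1L)
        return("need exactly one person with the 'cre' role")
    maintainer <- people[is_cre]
    email <- maintainer$email
    # Simple pattern chosen for illustration only
    pattern <- "^[^@[:space:]]+@[^@[:space:]]+\\.[^@[:space:]]+$"
    if (length(email) != 1L || !grepl(pattern, email))
        return("maintainer email is missing or malformed")
    if (is.null(maintainer$given) && is.null(maintainer$family))
        return("maintainer needs a given or family name")
    "valid"
}

ok <- 'person("Jane", "Doe", email = "jane@example.org", role = c("aut", "cre"))'
no_cre <- 'person("Jane", "Doe", email = "jane@example.org", role = "aut")'
validate_authors_at_r(ok)
validate_authors_at_r(no_cre)
```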
Can be disabled with --no-check-namespace
Checking DESCRIPTION/NAMESPACE consistency...
BiocCheck detects packages that are imported in NAMESPACE but not DESCRIPTION, or vice versa, and provides an explanation of how to fix this (ERROR).
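A comparison of this kind can be sketched with base R's read.dcf() and parseNamespaceFile(). The helper compare_imports() is hypothetical, and looking only at the Imports field of DESCRIPTION is a simplifying assumption; BiocCheck's own logic may differ.

```r
# Hypothetical helper (not part of BiocCheck): compare packages imported in
# NAMESPACE with those listed in the Imports field of DESCRIPTION.
compare_imports <- function(pkg_dir) {
    desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = "Imports")
    declared <- unlist(strsplit(desc[!is.na(desc)], ","))
    # Drop version constraints such as "(>= 3.5)" and surrounding whitespace
    declared <- trimws(sub("\\(.*\\)", "", declared))
    declared <- declared[nzchar(declared)]

    ns <- parseNamespaceFile(basename(pkg_dir), dirname(pkg_dir))
    # Each import is either a package name or list(package, symbols)
    imported <- unique(vapply(ns$imports, function(x) x[[1L]], character(1)))

    list(missing_from_description = setdiff(imported, declared),
         missing_from_namespace = setdiff(declared, imported))
}

# Demonstration on a throwaway package skeleton
pkg_dir <- file.path(tempfile("lib"), "demoPkg")
dir.create(pkg_dir, recursive = TRUE)
writeLines(c("Package: demoPkg", "Version: 0.99.0",
             "Imports: methods, stats (>= 3.5)"),
           file.path(pkg_dir, "DESCRIPTION"))
writeLines(c("import(methods)", "importFrom(utils, head)"),
           file.path(pkg_dir, "NAMESPACE"))
mismatch <- compare_imports(pkg_dir)
mismatch
```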
Checking for namespace import suggestions...
If the package codetoolsBioC is installed, BiocCheck will run it to see if it has suggestions for the "Imports" section of your package NAMESPACE.
codetoolsBioC is an experimental package that is not presently available via BiocManager::install(). It is available from our Subversion repository with the credentials readonly/readonly. Output of codetoolsBioC is printed to the screen but BiocCheck does not label it ERROR, WARNING, or NOTE.
Can be disabled with --no-check-vignettes.
Checking vignette directory...
Only run if your package is a software package (as determined by your biocViews), or if package type cannot be determined.
* Checks that the vignettes directory exists (ERROR).
* Checks that the vignettes directory only contains vignette sources (.Rmd, .Rnw, .Rrst, .Rhtml, .Rtex) (ERROR).
* Further checks in this group are reported as ERROR or WARNING.
* Checks whether the number of eval=FALSE chunks is more than 50% of the total (WARNING).
* Checks whether evaluation is switched off with eval=FALSE. The majority of vignette code is expected to be evaluated (WARNING).
Checking whether vignette is built with 'R CMD build'...
Only run when --build-output-file is specified.
Analyzes the output of R CMD build to see if vignettes are built.
It simply looks for a line that starts:
* creating vignettes ...
If this line is not present, it means R has not detected that a vignette needs to be built (ERROR).
If you have vignette sources yet still get this message, there could be several causes:
* A missing or invalid VignetteBuilder line in the DESCRIPTION file.
* A missing or invalid VignetteEngine line in the vignette source.
See knitr's package vignette page, or the Non-Sweave vignettes section of "Writing R Extensions" for more information.
Can be disabled with --no-check-library-calls and --no-check-install-self.
Checks for use of functions that install or update packages (NOTE). This list currently includes install, install.packages, update.packages and biocLite.
Checks for library() or require() calls on your own package (ERROR). It is not necessary to call library() or require() on your own package within code in the R directory or in man page examples. In these contexts, your package is already loaded.
Can be disabled with --no-check-coding-practices.
Checking coding practices...
Checks to see whether certain programming practices are found in the R directory.
We recommend that vapply() be used instead of sapply(). Problems arise when the X argument to sapply() has length 0; the return type is then a list() rather than a vector or array (NOTE).
We recommend that seq_len() or seq_along() be used instead of 1:... expressions. This is because the case 1:0 creates the sequence c(1, 0), which may be an unexpected or unwanted result (NOTE).
Single colon typos are checked for when a user inputs 'package:function' instead of using double colons ('::') to import a function (ERROR).
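Both zero-length pitfalls described above are easy to reproduce in base R. The variable names in this short illustration are arbitrary.

```r
x <- integer(0)

# sapply() falls back to an empty list when its input has length 0
sapply_result <- sapply(x, function(i) i * 2)
# vapply() always returns the declared type, here an empty numeric vector
vapply_result <- vapply(x, function(i) i * 2, numeric(1))
class(sapply_result)
class(vapply_result)

# 1:length(x) is 1:0, i.e. c(1, 0), so the loop body runs twice
n_colon <- 0
for (i in 1:length(x)) n_colon <- n_colon + 1
# seq_along(x) is empty, so the loop body never runs
n_seq <- 0
for (i in seq_along(x)) n_seq <- n_seq + 1
c(colon = n_colon, seq_along = n_seq)
```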
Checking for T...
Checking for F...
It is bad practice to use T and F for TRUE and FALSE. This is because T and F are ordinary variables whose value can be altered, leading to unexpected results, whereas the value of TRUE and FALSE cannot be changed (WARNING).
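The following small illustration shows how reassigning T silently changes the behaviour of code that relies on it. The function is_positive() is a made-up example.

```r
# A function that (unwisely) uses T as a default value
is_positive <- function(x, strict = T) {
    if (strict) x > 0 else x >= 0
}
before <- is_positive(0)   # strict comparison while T still means TRUE

T <- FALSE                 # legal: T is an ordinary variable
after <- is_positive(0)    # the default now resolves to FALSE
rm(T)                      # remove the override again

# TRUE itself is a reserved word and cannot be assigned to
assign_fails <- tryCatch({
    eval(parse(text = "TRUE <- FALSE"))
    FALSE
}, error = function(e) TRUE)

c(before = before, after = after, assign_fails = assign_fails)
```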
Avoid class() == and class() !=; use is() instead (WARNING).
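One reason for this advice is that class() can return more than one value, so an equality test yields one result per class. A brief illustration with a made-up S3 object:

```r
library(methods)

# An S3 object with two classes
obj <- structure(list(), class = c("child", "parent"))

# The comparison is element-wise: one logical per class
comparison <- class(obj) == "parent"
comparison

# is() and inherits() each give a single answer
is(obj, "parent")
inherits(obj, "parent")
```

Passing the two-element result to if() is a bug: it is an error in current versions of R, and older versions silently used only the first element.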
Use system2() over system(). system2() is a more portable and flexible interface than system() (NOTE).
Use of set.seed() in R code. The seed should not be set directly inside R functions. The user should always have the option to set the seed, and should know when set.seed() is being invoked (WARNING).
Checking parsed R code in R directory, examples, vignettes...
BiocCheck parses the code in your package's R directory, and in evaluated man page and vignette examples to look for various symbols, which result in issues of varying severity.
* browser() causes the command-line R debugger to be invoked, and should not be used in production code (though it's OK to wrap such calls in a conditional that evaluates to TRUE if some debugging option is set) (WARNING).
* Use of <<- is bad practice. It can over-write user-defined symbols, and introduces non-linear paths of evaluation that are difficult to debug (NOTE).
* BiocCheck checks for direct slot access (via @ or slot()) to S4 objects in vignette and example code. This code should always use accessors to interact with S4 classes. Since you may be using S4 classes (which don't provide accessors) from another package, the severity is only NOTE. But if the S4 object is defined in your package, it's mandatory to write accessors for it and to use them (instead of direct slot access) in all vignette and example code (NOTE).
Can be disabled with --no-check-function-len.
Checking function lengths...
BiocCheck prints an informative message about the length (in lines) of your five longest functions (this includes functions in your R directory and in evaluated man page and vignette examples).
If there are functions longer than 50 lines, BiocCheck outputs a NOTE.
You may want to consider breaking up long functions into smaller ones. This is a basic refactoring technique that results in code that's easier to read, debug, test, reuse, and maintain.
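For a rough idea of which functions are longest, their deparsed source can be measured in base R. The helper function_lengths() is hypothetical, and counting deparsed lines is an assumption that may not match how BiocCheck counts.

```r
# Hypothetical helper (not part of BiocCheck): approximate each function's
# length by the number of lines in its deparsed source, longest first.
function_lengths <- function(env) {
    objs <- mget(ls(env), envir = env)
    funs <- Filter(is.function, objs)
    lens <- vapply(funs, function(f) length(deparse(f)), integer(1))
    sort(lens, decreasing = TRUE)
}

demo_env <- new.env()
demo_env$short_fn <- function(x) x + 1
demo_env$long_fn <- function(x) {
    x <- x + 1
    x <- x * 2
    x
}
lens <- function_lengths(demo_env)
head(lens, 5)   # the five longest functions
```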
Can be disabled with --no-check-man-doc.
Checking man page documentation...
It can be handy to generate man page skeletons with prompt() and/or RStudio. These skeletons contain comments that look like this:
%% ~~ A concise (1-5 lines) description of the dataset. ~~
BiocCheck asks you to remove such comments (NOTE).
Every man page must have a non-empty \value section (ERROR).
Checking exported objects have runnable examples...
BiocCheck looks at all man pages which document exported objects and lists the ones that don't contain runnable examples (either because there is no examples section or because its examples are tagged with dontrun or donttest). Runnable examples are a key part of literate programming and help ensure that your code does what you say it does.
* If too few of these man pages have runnable examples, the result is an ERROR.
* Otherwise, BiocCheck lists the missing ones and asks you to add runnable examples to them (NOTE).
* BiocCheck also reports use of dontrun or donttest. Their use is not recommended and should be justified (NOTE). If an exception is made, donttest is preferred over dontrun (NOTE), as donttest requires valid R code.
Can be disabled with --no-check-news.
Checking package NEWS...
BiocCheck looks to see if there is a valid NEWS file either in the 'inst' directory or in the top-level directory of your package, and checks whether it is properly formatted (NOTE).
The location and format of the NEWS file must be consistent with ?news, meaning the file can be one of the following four options:
inst/NEWS.Rd
./NEWS.md
./NEWS
inst/NEWS
NEWS files are a good way to keep users up-to-date on changes to your package. Excerpts from properly formatted NEWS files will be included in Bioconductor release announcements to tell users what has changed in your package in the last release. In order for this to happen, your NEWS file must be formatted in a specific way; you may want to consider using an inst/NEWS.Rd file instead, as the format is more well-defined. A malformatted NEWS file results in a WARNING.
More information on NEWS files is available in the help topic ?news.
Can be disabled with --no-check-unit-tests.
Checking unit tests...
We strongly recommend unit tests, though we do not at present require them. For more on what unit tests are, why they are helpful, and how to implement them, read our Unit Testing HOWTO.
At present we just check to see whether unit tests are present, and if not, urge you to add them (NOTE).
Checking skip_on_bioc() in tests...
Can be disabled with --no-check-skip-bioc-tests.
Looks for use of skip_on_bioc(), the flag for skipping tests in the Bioconductor environment (NOTE).
Can be disabled with --no-check-formatting.
Checking formatting of DESCRIPTION, NAMESPACE, man pages, R source, and vignette source...
There is no 100% correct way to format code. These checks adhere to the Bioconductor Style Guide (NOTE).
We think it's important to avoid very long lines in code. Note that some text editors do not wrap text automatically, requiring horizontal scrolling in order to read it. Also note that R syntax is very flexible and whitespace can be inserted almost anywhere in an expression, making it easy to break up long lines.
These checks are run against not just R code, but the DESCRIPTION and NAMESPACE files as well as man pages and vignette source files. All of these files allow long lines to be broken up.
The output of this check includes the first 6 offending lines of code; see more with BiocCheck:::checkFormatting("path/to/YourPackage", nlines=Inf).
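A quick way to locate long lines yourself is sketched below in base R. The helper find_long_lines() is hypothetical, and the 80-character width is an assumption, since the text above does not state the limit.

```r
# Hypothetical helper (not part of BiocCheck): report lines wider than
# max_width. The default of 80 characters is an assumed limit.
find_long_lines <- function(path, max_width = 80L) {
    lines <- readLines(path, warn = FALSE)
    widths <- nchar(lines, type = "chars", allowNA = TRUE)
    too_long <- which(widths > max_width)
    data.frame(line = too_long, width = widths[too_long])
}

# Demonstration on a temporary file
demo_file <- tempfile(fileext = ".R")
writeLines(c("x <- 1", strrep("y", 100), "z <- 3"), demo_file)
long_lines <- find_long_lines(demo_file)
head(long_lines, 6)   # first 6 offending lines
```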
There are several helpful packages that can be used for formatting of R code to particular coding standards, such as formatR and styler, as well as the "Reformat code" button in RStudio Desktop. Each solution has its advantages, though styler works with roxygen2 examples and is actively maintained. You can re-format your code using styler as shown below:
## Install styler if necessary
if (!requireNamespace("styler", quietly = TRUE)) {
    install.packages("styler")
}
## Automatically re-format the R code in your package
styler::style_pkg(transformers = styler::tidyverse_style(indent_by = 4))
If you are working with RStudio Desktop, you can also use the "Reformat code" button, which will help you break long lines of code. Alternatively, use formatR, though beware that it can break valid R code involving both types of quotation marks (" and ') and does not support re-formatting roxygen2 examples. In general, it is best to version control your code before applying any automatic re-formatting solution, and to implement unit tests to verify that your code runs as intended after you re-format it.
Checking if package already exists in CRAN...
This can be disabled with the --no-check-CRAN option. A package with the same name (case differences are ignored) cannot exist in CRAN (ERROR).
Checking if new package already exists in Bioconductor...
Only run if the --new-package flag is turned on. A package with the same name (case differences are ignored) cannot exist in Bioconductor (ERROR).
Checking for bioc-devel mailing list subscription...
This only applies if BiocCheck is run on the Bioconductor build machines, because this step requires special authorization. This can be disabled with the --no-check-bioc-help option.
All maintainers must subscribe to the bioc-devel mailing list, with the email address used in the DESCRIPTION file; otherwise an ERROR is reported. You can subscribe here.
Checking for support site registration...
Maintainers must be registered at the support site with the email address given in the Maintainer field of their package DESCRIPTION file (ERROR).
This can be disabled with the --no-check-bioc-help option.
The main place people ask questions about Bioconductor packages is the support site. Please register and then optionally include your (lowercased) package name in the list of watched tags. When a question is asked and tagged with your package name, you'll get an email. (If you don't add your package to the list of watched tags, this will be automatically done for you.)
BiocCheckGitClone
BiocCheckGitClone provides a few additional Bioconductor package checks that should only be run on a source directory (raw git clone), NOT a tarball. Results are reported in the same three categories discussed above:
ERROR.
WARNING.
NOTE.
Using BiocCheckGitClone
BiocCheckGitClone is meant to be run within R on a directory containing an R package:
library(BiocCheck)
BiocCheckGitClone("packageDir")
Installing BiocCheckGitClone
Please see the previous Installing BiocCheck section.
Interpreting BiocCheckGitClone output
Actual BiocCheckGitClone output is shown below in bold.
Checking valid files
There are a number of files that should not be git tracked. This check reports any of these files that are present (ERROR).
The current list of files checked is as follows:
hidden_file_ext = c(".renviron", ".rprofile", ".rproj", ".rproj.user", ".rhistory", ".rapp.history", ".o", ".sl", ".so", ".dylib", ".a", ".dll", ".def", ".ds_store", "unsrturl.bst", ".log", ".aux", ".backups", ".cproject", ".directory", ".dropbox", ".exrc", ".gdb.history", ".gitattributes", ".gitmodules", ".hgtags", ".project", ".seed", ".settings", ".tm_properties")
These files may be included in your personal directories but should be added to a .gitignore file so they are not git tracked.
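To spot such files in a working directory before committing, a base R sketch along these lines may help. The helper find_unwanted_files() is hypothetical, it uses only a subset of the list above, and it inspects files on disk rather than what git actually tracks, which is a simplifying assumption.

```r
# Subset of the patterns listed above, for illustration only
hidden_file_ext <- c(".rhistory", ".rproj", ".ds_store", ".o", ".so", ".log")

# Hypothetical helper (not part of BiocCheck): find files whose lower-cased
# path ends with one of the patterns. It looks at the directory contents,
# not at the git index.
find_unwanted_files <- function(pkg_dir, patterns = hidden_file_ext) {
    files <- list.files(pkg_dir, recursive = TRUE, all.files = TRUE)
    hit <- vapply(tolower(files),
                  function(f) any(endsWith(f, patterns)),
                  logical(1), USE.NAMES = FALSE)
    files[hit]
}

# Demonstration on a throwaway directory
demo_pkg <- tempfile("pkg")
dir.create(file.path(demo_pkg, "R"), recursive = TRUE)
writeLines("hello <- function() 'hi'", file.path(demo_pkg, "R", "hello.R"))
writeLines("ls()", file.path(demo_pkg, ".Rhistory"))
unwanted <- find_unwanted_files(demo_pkg)
unwanted
```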
Checking DESCRIPTION
Default R CMD build behavior will format the DESCRIPTION file; after this occurs, it is hard to determine certain aspects of the original DESCRIPTION file. An example would be how the Authors and Maintainers are specified. The DESCRIPTION file is therefore checked in its raw original form.
Checking if DESCRIPTION is well formatted
The DESCRIPTION file must be properly formatted and able to be read in with read.dcf() in order to function properly on the Bioconductor build machines. This check attempts read.dcf("DESCRIPTION") and reports an ERROR if the file is malformed.
Checking for valid maintainer
While in the past using the Author and Maintainer fields was acceptable, R has moved towards using the Authors@R standard for listing package contributors. This checks that Authors@R is used and that there are no instances of Author or Maintainer in the DESCRIPTION (ERROR).
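The two DESCRIPTION checks described above can be approximated with read.dcf() alone. This is a sketch under stated assumptions: check_raw_description() is a hypothetical name and the messages are illustrative, not BiocCheck's own output.

```r
# Hypothetical helper (not part of BiocCheck): read a raw DESCRIPTION file
# and apply the two checks described above.
check_raw_description <- function(desc_file) {
    dcf <- tryCatch(read.dcf(desc_file), error = function(e) NULL)
    if (is.null(dcf))
        return("ERROR: DESCRIPTION could not be read with read.dcf()")
    fields <- colnames(dcf)
    if (!"Authors@R" %in% fields)
        return("ERROR: no Authors@R field")
    if (any(c("Author", "Maintainer") %in% fields))
        return("ERROR: remove Author/Maintainer and keep only Authors@R")
    "OK"
}

good <- tempfile("DESCRIPTION")
writeLines(c("Package: demoPkg", "Version: 0.99.0",
             'Authors@R: person("Jane", "Doe", role = "cre",',
             '    email = "jane@example.org")'), good)

old_style <- tempfile("DESCRIPTION")
writeLines(c("Package: demoPkg", "Version: 0.99.0",
             "Maintainer: Jane Doe <jane@example.org>"), old_style)

malformed <- tempfile("DESCRIPTION")
writeLines(c("Package: demoPkg", "this line has no field name"), malformed)

check_raw_description(good)
check_raw_description(old_style)
check_raw_description(malformed)
```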
Checking that CITATION file is correctly formatted
BiocCheck tries to read the provided CITATION file (i.e. not the one automatically generated by each package) with readCitationFile(); this file is expected to be in the inst folder (NOTE). readCitationFile() needs to work properly without the package being installed. The most common causes of failure occur when trying to use helper functions like packageVersion() or packageDate() instead of using meta$Version or meta$Date. See the R documentation for more information.
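As an illustration of the meta-based approach, the sketch below writes a small CITATION file that uses meta$Version and meta$Date, then reads it with utils::readCitationFile() without any package installed. The package name, author, and metadata values are made up for the example.

```r
# Write a throwaway CITATION file that relies on meta rather than on
# packageVersion() or packageDate()
citation_file <- tempfile("CITATION")
writeLines(c(
    'bibentry(',
    '    bibtype = "Manual",',
    '    title = "demoPkg: a hypothetical package",',
    '    author = person("Jane", "Doe"),',
    '    year = sub("-.*", "", meta$Date),',
    '    note = paste("R package version", meta$Version)',
    ')'
), citation_file)

# Metadata normally taken from DESCRIPTION; the values here are invented
meta <- list(Package = "demoPkg", Version = "0.99.0",
             Date = "2020-01-01", Encoding = "UTF-8")

cit <- utils::readCitationFile(citation_file, meta = meta)
cit$note
cit$year
```

Because the entry only refers to meta, the file can be read from a raw source tree, which is the situation this check is concerned with.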
BiocCheck
Contributions to BiocCheck are welcome and encouraged through pull requests.
Please adhere to the Pull Request template when submitting your contributions.
sessionInfo()