#' Get Number of Available Cores on The Current Machine
#'
#' The current/main \R session counts as one, meaning the minimum
#' number of cores available is always at least one.
#'
#' @param constraints An optional character specifying under what
#' constraints ("purposes") we are requesting the values.
#' For instance, on systems where multicore processing is not supported
#' (i.e. Windows), using `constrains = "multicore"` will force a
#' single core to be reported.
#'
#' @param methods A character vector specifying how to infer the number
#' of available cores.
#'
#' @param na.rm If TRUE, only non-missing settings are considered/returned.
#'
#' @param logical Passed to
#' \code{\link[parallel]{detectCores}(logical = logical)}, which, if supported,
#' returns the number of logical CPUs (TRUE) or physical CPUs/cores (FALSE).
#' This argument is only if argument `methods` includes `"system"`.
#'
#' @param default The default number of cores to return if no non-missing
#' settings are available.
#'
#' @param which A character specifying which settings to return.
#' If `"min"` (default), the minimum value is returned.
#' If `"max"`, the maximum value is returned (be careful!)
#' If `"all"`, all values are returned.
#'
#' @param omit (integer; non-negative) Number of cores to not include.
#'
#' @return Return a positive (>= 1) integer.
#' If `which = "all"`, then more than one value may be returned.
#' Together with `na.rm = FALSE` missing values may also be returned.
#'
#' @details
#' The following settings ("methods") for inferring the number of cores
#' are supported:
#' \itemize{
#' \item `"system"` -
#' Query \code{\link[parallel]{detectCores}(logical = logical)}.
#' \item `"nproc"` -
#' On Unix, query system command \code{nproc}.
#' \item `"mc.cores"` -
#' If available, returns the value of option
#' \code{\link[base:options]{mc.cores}}.
#' Note that \option{mc.cores} is defined as the number of
#' _additional_ \R processes that can be used in addition to the
#' main \R process. This means that with `mc.cores = 0` all
#' calculations should be done in the main \R process, i.e. we have
#' exactly one core available for our calculations.
#' The \option{mc.cores} option defaults to environment variable
#' \env{MC_CORES} (and is set accordingly when the \pkg{parallel}
#' package is loaded). The \option{mc.cores} option is used by for
#' instance \code{\link[=mclapply]{mclapply}()} of the \pkg{parallel}
#' package.
#' \item `"BiocParallel"` -
#' Query environment variables \env{BIOCPARALLEL_WORKER_NUMBER} (integer),
#' which is defined by **BiocParallel** (>= 1.27.2), and \env{BBS_HOME}
#' (logical). If the former is set, this is the number of cores considered.
#' If the latter is set, then a maximum of 4 cores is considered.
#' \item `"PBS"` -
#' Query TORQUE/PBS environment variables \env{PBS_NUM_PPN} and \env{NCPUS}.
#' Depending on PBS system configuration, these _resource_
#' parameters may or may not default to one.
#' An example of a job submission that results in this is
#' `qsub -l nodes=1:ppn=2`, which requests one node with two cores.
#' \item `"SGE"` -
#' Query Sun/Oracle Grid Engine (SGE) environment variable
#' \env{NSLOTS}.
#' An example of a job submission that results in this is
#' `qsub -pe smp 2` (or `qsub -pe by_node 2`), which
#' requests two cores on a single machine.
#' \item `"Slurm"` -
#' Query Simple Linux Utility for Resource Management (Slurm)
#' environment variable \env{SLURM_CPUS_PER_TASK}.
#' This may or may not be set. It can be set when submitting a job,
#' e.g. `sbatch --cpus-per-task=2 hello.sh` or by adding
#' `#SBATCH --cpus-per-task=2` to the \file{hello.sh} script.
#' If \env{SLURM_CPUS_PER_TASK} is not set, then it will fall back to
#' use \env{SLURM_CPUS_ON_NODE} if the job is a single-node job
#' (\env{SLURM_JOB_NUM_NODES} is 1), e.g. `sbatch --ntasks=2 hello.sh`.
#' \item `"LSF"` -
#' Query Platform Load Sharing Facility (LSF) environment variable
#' \env{LSB_DJOB_NUMPROC}.
#' Jobs with multiple (CPU) slots can be submitted on LSF using
#' `bsub -n 2 -R "span[hosts=1]" < hello.sh`.
#' \item `"custom"` -
#' If option \option{parallelly.availableCores.custom} is set and a function,
#' then this function will be called (without arguments) and it's value
#' will be coerced to an integer, which will be interpreted as a number
#' of available cores. If the value is NA, then it will be ignored.
#' }
#' For any other value of a `methods` element, the \R option with the
#' same name is queried. If that is not set, the system environment
#' variable is queried. If neither is set, a missing value is returned.
#'
#' @section Avoid ending up with zero cores:
#' Note that some machines might have a limited number of cores, or the R
#' process runs in a container or a cgroup that only provides a small number
#' of cores. In such cases:
#'
#' ```r
#' ncores <- availableCores() - 1
#' ```
#'
#' may return zero, which is often not intended and is likely to give an
#' error downstream. Instead, use:
#'
#' ```r
#' ncores <- availableCores(omit = 1)
#' ```
#'
#' to put aside one of the cores from being used. Regardless how many cores
#' you put aside, this function is guaranteed to return at least one core.
#'
#' @section Advanced usage:
#' It is possible to override the maximum number of cores on the machine
#' as reported by `availableCores(methods = "system")`. This can be
#' done by first specifying
#' `options(parallelly.availableCores.methods = "mc.cores")` and
#' then the number of cores to use, e.g. `options(mc.cores = 8)`.
#'
#' @examples
#' message(paste("Number of cores available:", availableCores()))
#'
#' \dontrun{
#' options(mc.cores = 2L)
#' message(paste("Number of cores available:", availableCores()))
#' }
#'
#' \dontrun{
#' ## IMPORTANT: availableCores() may return 1L
#' options(mc.cores = 1L)
#' ncores <- availableCores() - 1 ## ncores = 0
#' ncores <- availableCores(omit = 1) ## ncores = 1
#' message(paste("Number of cores to use:", ncores))
#' }
#'
#' \dontrun{
#' ## Use 75% of the cores on the system but never more than four
#' options(parallelly.availableCores.custom = function() {
#' ncores <- max(parallel::detectCores(), 1L, na.rm = TRUE)
#' ncores <- min(as.integer(0.75 * ncores), 4L)
#' max(1L, ncores)
#' })
#' message(paste("Number of cores available:", availableCores()))
#'
#' ## What is available minus one core but at least one
#' options(parallelly.availableCores.custom = function() {
#' max(1L, parallelly::availableCores() - 1L)
#' })
#' message(paste("Number of cores available:", availableCores()))
#' }
#'
#' @seealso
#' To get the set of available workers regardless of machine,
#' see [availableWorkers()].
#'
#' @importFrom parallel detectCores
#' @export
availableCores <- function(constraints = NULL, methods = getOption2("parallelly.availableCores.methods", c("system", "nproc", "mc.cores", "BiocParallel", "_R_CHECK_LIMIT_CORES_", "PBS", "SGE", "Slurm", "LSF", "fallback", "custom")), na.rm = TRUE, logical = getOption2("parallelly.availableCores.logical", TRUE), default = c(current = 1L), which = c("min", "max", "all"), omit = getOption2("parallelly.availableCores.omit", 0L)) {
## Local functions
getenv <- function(name, mode = "integer") {
value <- trim(getEnvVar2(name, default = NA_character_))
storage.mode(value) <- mode
value
} # getenv()
getopt <- function(name, mode = "integer") {
value <- getOption2(name, default = NA_integer_)
storage.mode(value) <- mode
value
} # getopt()
which <- match.arg(which, choices = c("min", "max", "all"))
stop_if_not(length(default) == 1, is.finite(default), default >= 1L)
stop_if_not(length(omit) == 1L, is.numeric(omit),
is.finite(omit), omit >= 0L)
omit <- as.integer(omit)
ncores <- rep(NA_integer_, times = length(methods))
names(ncores) <- methods
for (kk in seq_along(methods)) {
method <- methods[kk]
if (method == "Slurm") {
## Number of cores assigned by Slurm
## The assumption is that the following works regardless of
## number of nodes requested /HB 2020-09-18
## Example: --cpus-per-task={n}
n <- getenv("SLURM_CPUS_PER_TASK")
if (is.na(n)) {
## Example: --nodes={nnodes} (defaults to 1, short: -N {nnodes})
## From 'man sbatch':
## SLURM_JOB_NUM_NODES (and SLURM_NNODES for backwards compatibility)
## Total number of nodes in the job's resource allocation.
nnodes <- getenv("SLURM_JOB_NUM_NODES")
if (is.na(nnodes)) nnodes <- getenv("SLURM_NNODES")
if (is.na(nnodes)) nnodes <- 1L ## Can this happen? /HB 2020-09-18
if (nnodes == 1L) {
## Example: --nodes=1 --ntasks={n} (short: -n {n})
## IMPORTANT: 'SLURM_CPUS_ON_NODE' appears to be rounded up when nodes > 1.
## Example 1: With --nodes=2 --cpus-per-task=3 we see SLURM_CPUS_ON_NODE=4
## although SLURM_CPUS_PER_TASK=3.
## Example 2: With --nodes=2 --ntasks=7, we see SLURM_CPUS_ON_NODE=6,
## SLURM_JOB_CPUS_PER_NODE=6,2, no SLURM_CPUS_PER_TASK, and
## SLURM_TASKS_PER_NODE=5,2.
## Conclusions: We can only use 'SLURM_CPUS_ON_NODE' for nnodes = 1.
n <- getenv("SLURM_CPUS_ON_NODE")
} else {
## Parse `SLURM_TASKS_PER_NODE`
nodecounts <- getenv("SLURM_TASKS_PER_NODE", mode = "character")
if (!is.na(nodecounts)) {
## Examples:
## SLURM_TASKS_PER_NODE=5,2
## SLURM_TASKS_PER_NODE=2(x2),1(x3) # Source: 'man sbatch'
n <- slurm_expand_nodecounts(nodecounts)
if (any(is.na(n))) next
## ASSUMPTION: We assume that it is the first component on the list that
## corresponds to the current machine. /HB 2021-03-05
n <- n[1]
}
}
}
## TODO?: Can we validate above assumptions/results? /HB 2020-09-18
if (FALSE && !is.na(n)) {
## Is any of the following useful?
## Example: --ntasks={ntasks} (no default, short: -n {ntasks})
## From 'man sbatch':
## SLURM_NTASKS (and SLURM_NPROCS for backwards compatibility)
## Same as -n, --ntasks
ntasks <- getenv("SLURM_NTASKS")
if (is.na(ntasks)) ntasks <- getenv("SLURM_NPROCS")
}
} else if (method == "PBS") {
## Number of cores assigned by TORQUE/PBS
n <- getenv("PBS_NUM_PPN")
if (is.na(n)) {
## PBSPro sets 'NCPUS' but not 'PBS_NUM_PPN'
n <- getenv("NCPUS")
}
} else if (method == "SGE") {
## Number of cores assigned by Sun/Oracle Grid Engine (SGE)
n <- getenv("NSLOTS")
} else if (method == "LSF") {
## Number of slots assigned by LSF
n <- getenv("LSB_DJOB_NUMPROC")
} else if (method == "mc.cores") {
## Number of cores by option defined by 'parallel' package
n <- getopt("mc.cores")
if (!is.na(n) && n == 0) n <- 1L ## Because options(mc.cores = 0) may be set
} else if (method == "mc.cores+1") {
## Number of cores by option defined by 'parallel' package
n <- getopt("mc.cores") + 1L
} else if (method == "BiocParallel") {
n <- getenv("BIOCPARALLEL_WORKER_NUMBER")
if (nzchar(Sys.getenv("BBS_HOME"))) n <- min(n, 4L, na.rm = TRUE)
} else if (method == "_R_CHECK_LIMIT_CORES_") {
## A flag set by R CMD check for constraining number of
## cores allowed to be use in package tests. Here we
## acknowledge this and sets number of cores to the
## maximum two allowed. This way we don't have to explicitly
## use options(mc.cores = 2L) in example code, which may be
## misleading to the reader.
chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
chk <- (nzchar(chk) && (chk != "false"))
n <- if (chk) 2L else NA_integer_
} else if (method == "system") {
## Number of cores available according to parallel::detectCores()
n <- detectCores(logical = logical)
} else if (method == "nproc") {
## Number of cores according Unix 'nproc'
n <- getNproc()
} else if (method == "fallback") {
## Number of cores available according to parallelly.availableCores.fallback
n <- getOption2("parallelly.availableCores.fallback", NA_integer_)
n <- as.integer(n)
} else if (method == "custom") {
fcn <- getOption2("parallelly.availableCores.custom", NULL)
if (!is.function(fcn)) next
n <- local({
## Avoid calling the custom function recursively
oopts <- options(parallelly.availableCores.custom = NULL)
on.exit(options(oopts))
fcn()
})
n <- as.integer(n)
if (length(n) != 1L) {
stop("Function specified by option 'parallelly.availableCores.custom' does not a single value")
}
} else {
## covr: skip=3
## Fall back to querying option and system environment variable
## with the given name
n <- getopt(method)
if (is.na(n)) n <- getenv(method)
}
ncores[kk] <- n
}
## Validate settings
ncoresT <- ncores[!is.na(ncores)]
ncoresT <- ncoresT[ncoresT <= 0]
if (length(ncoresT) > 0) {
msg <- sprintf("Detected invalid (zero or less) core settings: %s",
paste(paste0(sQuote(names(ncoresT)), " = ", ncoresT), collapse = ", "))
mdebug(msg)
stop(msg)
}
## Drop missing values?
if (na.rm) {
ncores <- ncores[!is.na(ncores)]
}
## Fall back to the default?
if (length(ncores) == 0) ncores <- default
## Keep only one
if (length(ncores) >= 2 && (which %in% c("min", "max"))) {
## SPECIAL: The 'fallback' should only be used as a fallback if no other
## options are explicitly set / available.
idx_fallback <- which(names(ncores) == "fallback")
if (length(idx_fallback) == 1) {
## Use only if 'system' and 'nproc' are the only other options
ignore <- c("system", "nproc")
if (length(setdiff(names(ncores), c("fallback", ignore))) == 0) {
ncores <- ncores[idx_fallback]
} else {
## ... otherwise, ignore 'fallback'.
ncores <- ncores[-idx_fallback]
}
}
if (which == "min") {
## which.min() to preserve name
ncores <- ncores[which.min(ncores)]
} else if (which == "max") {
## which.max() to preserve name
ncores <- ncores[which.max(ncores)]
}
}
if (!is.null(constraints)) {
if (constraints == "multicore") {
## SPECIAL: On some OSes such as Windows, multicore processing
## is not supported. If so, we should override all values to
## to reflect that only a single core is available
if (!supportsMulticore()) ncores[] <- 1L
}
}
## Omit some of the cores?
if (omit > 0L) {
ncores <- ncores - omit
ncores[ncores < 1L] <- 1L
}
## Sanity check
stop_if_not(all(ncores >= 1L, na.rm = TRUE))
ncores
} # availableCores()
getNproc <- function(ignore = c("OMP_NUM_THREADS", "OMP_THREAD_LIMIT")) {
## 'nproc' is limited by 'OMP_NUM_THREADS' and 'OMP_THREAD_LIMIT', if set.
## However, that is not what we want for availableCores(). Because of
## this, we unset those while querying 'nproc'.
if (length(ignore) > 0) {
ignore <- intersect(ignore, names(Sys.getenv()))
if (length(ignore) > 0) {
oignore <- Sys.getenv(ignore, names = TRUE)
oignore <- as.list(oignore)
on.exit(do.call(Sys.setenv, args = oignore), add = TRUE)
Sys.unsetenv(ignore)
}
}
systems <- list(linux = "nproc 2>/dev/null")
os <- names(systems)
m <- pmatch(os, table = R.version$os, nomatch = NA_integer_)
m <- os[!is.na(m)]
if (length(m) == 0L) return(NA_integer_)
for (cmd in systems[[m]]) {
tryCatch({
res <- suppressWarnings(system(cmd, intern=TRUE))
res <- gsub("(^[[:space:]]+|[[:space:]]+$)", "", res[1])
if (grepl("^[1-9]$", res)) return(as.integer(res))
}, error = identity)
}
NA_integer_
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.