sum2: Fast sum over subset of vector elements

sum2    R Documentation

Description

Computes the sum of all or a subset of values.

Usage

sum2(x, idxs = NULL, na.rm = FALSE, mode = typeof(x), ...)

Arguments

x

A numeric or logical vector of length N.

idxs

A vector indicating the subset of elements to operate over. If NULL, no subsetting is done.

na.rm

If TRUE, missing values are excluded.

mode

A character string specifying the data type of the return value. The default is the type of argument x, unless x is logical, in which case it defaults to "integer".

...

Not used.
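As a minimal sketch of how the arguments above interact, the following assumes the matrixStats package is installed. The input vectors are made up for illustration.

```r
library(matrixStats)  # assumption: matrixStats is installed

x <- c(1.5, NA, 2.5, 4)

s_all  <- sum2(x)                    # NA: the missing value propagates
s_narm <- sum2(x, na.rm = TRUE)      # 8: the NA is excluded
s_sub  <- sum2(x, idxs = c(1, 3))    # 4: only elements 1 and 3 are summed

# For logical input, mode defaults to "integer"
b <- c(TRUE, FALSE, TRUE)
s_lgl <- sum2(b)                     # 2L
typeof(s_lgl)                        # "integer"
```

Note that idxs selects elements before na.rm applies, so a missing value outside the subset has no effect on the result.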

Details

sum2(x, idxs) gives the same result as sum(x[idxs]), but is faster and more memory efficient because it avoids the actual subsetting, which requires copying the selected elements and later garbage collecting the copy.

Furthermore, sum2(x, mode = "double") is equivalent to sum(as.numeric(x)) and may therefore be used to avoid integer overflow(*). When x is an integer vector, it is also much more memory efficient than sum(as.numeric(x)), because no double copy of x is created.

(*) In R (>= 3.5.0), sum(x) no longer overflows to NA_integer_. Instead, it returns the correct sum as a double value.
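The overflow case described above can be sketched as follows, assuming matrixStats is installed. The call in integer mode is wrapped in suppressWarnings() on the assumption that an overflow warning may be signalled.

```r
library(matrixStats)  # assumption: matrixStats is installed

# Integer vector whose true sum, 2^31, does not fit in an R integer
x <- c(.Machine$integer.max, 1L)

# Default mode is typeof(x) == "integer", so the result overflows
s_int <- suppressWarnings(sum2(x))   # NA_integer_

# Requesting a double result avoids the overflow without as.numeric(x)
s_dbl <- sum2(x, mode = "double")    # 2147483648

is.na(s_int)                                # TRUE
identical(s_dbl, sum(as.numeric(x)))        # TRUE
```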

Value

Returns a scalar of the data type specified by argument mode. If mode = "integer", integer overflow occurs if the sum is outside the range of representable integer values. Note that the intermediate sum (sum(x[1:n])) is represented internally as a floating-point value and therefore never overflows; only the final total is checked against the integer range. If mode = "integer" and typeof(x) == "double", a warning is generated.
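The return-value rules above can be checked with a short sketch, assuming matrixStats is installed. The handler around the second call only records whether a warning was signalled; it does not assume what the warning says or what value is returned.

```r
library(matrixStats)  # assumption: matrixStats is installed

# The running sum leaves the integer range after two elements,
# but the final total (1) is representable, so no overflow occurs.
x <- c(.Machine$integer.max, 1L, -.Machine$integer.max)
total <- sum2(x)                     # 1L
typeof(total)                        # "integer"

# Double input combined with mode = "integer": per the text above,
# a warning is expected. Record it instead of letting it print.
y <- c(1, 2, 3)
warned <- FALSE
res <- withCallingHandlers(
  sum2(y, mode = "integer"),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)
warned        # whether a warning was signalled
typeof(res)   # type actually returned
```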

Author(s)

Henrik Bengtsson

See Also

sum(). To efficiently average over a subset, see mean2().

Examples

x <- 1:10
n <- length(x)

idxs <- seq(from = 1, to = n, by = 2)
s1 <- sum(x[idxs])                     # 25
s2 <- sum2(x, idxs = idxs)             # 25
stopifnot(identical(s1, s2))

idxs <- seq(from = n, to = 1, by = -2)
s1 <- sum(x[idxs])                     # 25
s2 <- sum2(x, idxs = idxs)             # 25
stopifnot(identical(s1, s2))

s1 <- sum(x)                           # 55
s2 <- sum2(x)                          # 55
stopifnot(identical(s1, s2))


# Total gives integer overflow
x <- c(.Machine$integer.max, 1L, -.Machine$integer.max)
s1 <- sum(x[1:2])                      # NA_integer_ in R (< 3.5.0)
s2 <- sum2(x[1:2])                     # NA_integer_

# Total gives integer overflow (coerce to numeric)
s1 <- sum(as.numeric(x[1:2]))          # 2147483648
s2 <- sum2(as.numeric(x[1:2]))         # 2147483648
s3 <- sum2(x[1:2], mode = "double")    # 2147483648 w/out copy
stopifnot(identical(s1, s2))
stopifnot(identical(s1, s3))

# Cumulative sum would give integer overflow but not the total
s1 <- sum(x)                           # 1L
s2 <- sum2(x)                          # 1L
stopifnot(identical(s1, s2))

matrixStats documentation built on Sept. 11, 2024, 5:24 p.m.