Vector-valued (statistical) functions operating in parallel over vectors passed as arguments, or a single list of vectors (such as a data frame). Similar to
pmax, except that these functions do not recycle vectors.
psum(..., na.rm = FALSE) pprod(..., na.rm = FALSE) pmean(..., na.rm = FALSE) pfirst(...) # (na.rm = TRUE) plast(...) # (na.rm = TRUE) pall(..., na.rm = FALSE) pallNA(...) pallv(..., value) pany(..., na.rm = FALSE) panyNA(...) panyv(..., value) pcount(..., value) pcountNA(...)
suitable (atomic) vectors of the same length, or a single list of vectors (such as a
A logical indicating whether missing values should be removed. Default value is
pprod work for integer, logical, double and complex types.
pmean only supports integer, logical and double types. All 3 functions will error if used with factors.
plast select the first/last non-missing value (or non-empty or
NULL value for list-vectors). They accept all vector types with defined missing values + lists, but can only jointly handle integer and double types (not numeric and complex or character and factor). If factors are passed, they all need to have identical levels.
pall are derived from base functions
any and only allow logical inputs.
pcount counts the occurrence of
value, and expects arguments of the same data type (except for
value = NA).
pcountNA is equivalent to
value = NA, and they both allow
NA counting in mixed-type data.
pcountNA additionally supports list vectors and counts empty or
NULL elements as
panyv/pallv are wrappers around
panyNA/pallNA are wrappers around
pcountNA. They return a logical vector instead of the integer count.
None of these functions recycle vectors i.e. all input vectors need to have the same length. All functions support long vectors with up to
psum/pprod/pmean return the sum, product or mean of all arguments. The value returned will be of the highest argument type (integer < double < complex).
pprod only returns double or complex.
pany[v/NA] return a logical vector.
pcount[NA] returns an integer vector.
pfirst/plast return a vector of the same type as the inputs.
Morgan Jacob and Sebastian Krantz
Package 'collapse' provides column-wise and scalar-valued analogues to many of these functions.
x = c(1, 3, NA, 5) y = c(2, NA, 4, 1) z = c(3, 4, 4, 1) # Example 1: psum psum(x, y, z, na.rm = FALSE) psum(x, y, z, na.rm = TRUE) # Example 2: pprod pprod(x, y, z, na.rm = FALSE) pprod(x, y, z, na.rm = TRUE) # Example 3: pmean pmean(x, y, z, na.rm = FALSE) pmean(x, y, z, na.rm = TRUE) # Example 4: pfirst and plast pfirst(x, y, z) plast(x, y, z) # Adjust x, y, and z to use in pall and pany x = c(TRUE, FALSE, NA, FALSE) y = c(TRUE, NA, TRUE, TRUE) z = c(TRUE, TRUE, FALSE, NA) # Example 5: pall pall(x, y, z, na.rm = FALSE) pall(x, y, z, na.rm = TRUE) # Example 6: pany pany(x, y, z, na.rm = FALSE) pany(x, y, z, na.rm = TRUE) # Example 7: pcount pcount(x, y, z, value = TRUE) pcountNA(x, y, z) # Example 8: list/data.frame as an input pprod(iris[,1:2]) psum(iris[,1:2]) pmean(iris[,1:2]) # Benchmarks # ---------- # n = 1e8L # x = rnorm(n) # 763 Mb # y = rnorm(n) # z = rnorm(n) # # microbenchmark::microbenchmark( # kit=psum(x, y, z, na.rm = TRUE), # base=rowSums(do.call(cbind,list(x, y, z)), na.rm=TRUE), # times = 5L, unit = "s" # ) # Unit: Second # expr min lq mean median uq max neval # kit 0.52 0.52 0.65 0.55 0.83 0.84 5 # base 2.16 2.27 2.34 2.35 2.43 2.49 5 # # x = sample(c(TRUE, FALSE, NA), n, TRUE) # 382 Mb # y = sample(c(TRUE, FALSE, NA), n, TRUE) # z = sample(c(TRUE, FALSE, NA), n, TRUE) # # microbenchmark::microbenchmark( # kit=pany(x, y, z, na.rm = TRUE), # base=sapply(1:n, function(i) any(x[i],y[i],z[i],na.rm=TRUE)), # times = 5L # ) # Unit: Second # expr min lq mean median uq max neval # kit 1.07 1.09 1.15 1.10 1.23 1.23 5 # base 111.31 112.02 112.78 112.97 113.55 114.03 5
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.