rowSums_if: Row Sums Conditional on Frequency of Observed Values


View source: R/quest_functions.R

Description

rowSums_if calculates the sum of every row in a numeric or logical matrix conditional on the frequency of observed data. If the frequency of observed values in a row is less than ov.min (or equal to it when inclusive = FALSE), then NA is returned for that row. It also has the option to return a value other than 0 (e.g., NA) for rows in which every value is NA, which differs from rowSums(x, na.rm = TRUE).

Usage

rowSums_if(
  x,
  ov.min = 1,
  prop = TRUE,
  inclusive = TRUE,
  impute = TRUE,
  allNA = NA_real_
)

Arguments

x

numeric or logical matrix. If not a matrix, it will be coerced to one.

ov.min

minimum frequency of observed values required per row. If prop = TRUE, then this is a decimal between 0 and 1. If prop = FALSE, then this is an integer between 0 and ncol(x).

prop

logical vector of length 1 specifying whether ov.min should refer to the proportion of observed values (TRUE) or the count of observed values (FALSE).

inclusive

logical vector of length 1 specifying whether the sum should be calculated if the frequency of observed values in a row is exactly equal to ov.min.
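
As a rough illustration of how ov.min, prop, and inclusive interact, the base-R sketch below computes the observed-value frequency per row and the resulting keep/drop decision. The helper name meets_ov_min is hypothetical and is not part of quest; it only mirrors the argument descriptions above.

```r
# Hypothetical helper (not part of quest): which rows have enough observed values?
meets_ov_min <- function(x, ov.min = 1, prop = TRUE, inclusive = TRUE) {
  x <- as.matrix(x)
  n_obs <- rowSums(!is.na(x))                       # count of observed values per row
  freq <- if (prop) n_obs / ncol(x) else n_obs      # proportion or raw count
  if (inclusive) freq >= ov.min else freq > ov.min  # is exact equality allowed?
}

m <- rbind(a = c(1, 2, NA), b = c(1, NA, NA), c = c(NA, NA, NA))
meets_ov_min(m, ov.min = 2, prop = FALSE)                     # a TRUE, b FALSE, c FALSE
meets_ov_min(m, ov.min = 2, prop = FALSE, inclusive = FALSE)  # all FALSE
meets_ov_min(m, ov.min = 1/3)                                 # a TRUE, b TRUE, c FALSE
```

Rows flagged FALSE here correspond to the rows for which NA would be returned.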

impute

logical vector of length 1 specifying if missing values should be imputed with the mean of observed values of x[i, ]. If TRUE (default), this will make sums over the same columns with different amounts of observed data comparable.
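
The effect of mean imputation on a row sum can be checked with a small base-R calculation. This is a sketch of the idea described above, not the package's code: filling each missing value with the mean of the row's observed values gives the same total as multiplying that mean by the number of columns.

```r
row <- c(2, 4, NA, NA)                  # two observed values, two missing

obs_mean <- mean(row, na.rm = TRUE)     # mean of the observed values: 3
filled <- ifelse(is.na(row), obs_mean, row)

sum(filled)                             # 12: sum after mean imputation
obs_mean * length(row)                  # 12: same total, computed directly
sum(row, na.rm = TRUE)                  # 6: sum without imputation
```

Without imputation, a row with fewer observed values tends to have a smaller sum simply because fewer terms are added.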

allNA

numeric vector of length 1 specifying what value should be returned for rows that are all NA. This is most applicable when ov.min = 0 and inclusive = TRUE. The default is NA, which differs from rowSums with na.rm = TRUE, where 0 is returned. Note that this value is overwritten by NA if the frequency of observed values in that row is less than ov.min (or equal to it when inclusive = FALSE).
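
The difference from base rowSums can be seen directly. The snippet below is a base-R sketch, assuming a small made-up matrix; the manual replacement step only stands in for what the allNA argument is described as doing.

```r
m <- rbind(c(1, NA), c(NA, NA))

s <- rowSums(m, na.rm = TRUE)       # c(1, 0): the all-NA row sums to 0
all_na <- rowSums(!is.na(m)) == 0   # c(FALSE, TRUE): rows with no observed values
s[all_na] <- NA_real_               # stand-in for allNA = NA_real_
s                                   # c(1, NA)
```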

Details

Conceptually, this function does the following: apply(X = x, MARGIN = 1, FUN = sum_if, ov.min = ov.min, prop = prop, inclusive = inclusive). For computational efficiency it does not actually do so, because the conditioning on observed values would then not be vectorized. Instead, it uses rowSums and then inserts NAs for rows that have too few observed values.
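
The vectorized strategy can be sketched in base R as follows. This is a minimal approximation written from the description on this page, not the quest source code; the name row_sums_if_sketch is hypothetical, and the handling of impute via rowMeans is an assumption.

```r
# Hypothetical sketch (not quest's source): vectorized conditional row sums.
row_sums_if_sketch <- function(x, ov.min = 1, prop = TRUE, inclusive = TRUE,
                               impute = TRUE, allNA = NA_real_) {
  x <- as.matrix(x)
  n_obs <- rowSums(!is.na(x))                   # observed values per row
  freq <- if (prop) n_obs / ncol(x) else n_obs  # proportion or count
  keep <- if (inclusive) freq >= ov.min else freq > ov.min

  # Assumption: mean imputation is equivalent to row mean times ncol(x).
  out <- if (impute) rowMeans(x, na.rm = TRUE) * ncol(x) else rowSums(x, na.rm = TRUE)
  out[n_obs == 0] <- allNA   # rows with no observed values at all
  out[!keep] <- NA_real_     # too few observed values overrides allNA
  out
}

m <- rbind(a = c(1, 2, NA), b = c(1, NA, NA), c = c(NA, NA, NA))
row_sums_if_sketch(m, ov.min = 0.5)  # a = 4.5, b = NA, c = NA
identical(row_sums_if_sketch(m, ov.min = 0, impute = FALSE, allNA = 0),
          rowSums(m, na.rm = TRUE))  # TRUE
```

All conditioning is done with whole-vector operations on the rows, so no per-row function call is needed.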

Value

numeric vector of length = nrow(x) with names = rownames(x) providing the sum of each row or NA (or allNA) depending on the frequency of observed values.

See Also

rowMeans_if colSums_if colMeans_if rowSums

Examples

rowSums_if(airquality)
rowSums_if(x = airquality, ov.min = 5, prop = FALSE)
x <- data.frame("x" = c(1, 1, NA), "y" = c(2, NA, NA), "z" = c(NA, NA, NA))
rowSums_if(x)
rowSums_if(x, ov.min = 0)
rowSums_if(x, ov.min = 0, allNA = 0)
# identical to rowSums(x, na.rm = TRUE)
identical(x = rowSums(x, na.rm = TRUE),
   y = unname(rowSums_if(x, impute = FALSE, ov.min = 0, allNA = 0)))

quest documentation built on Sept. 10, 2021, 5:07 p.m.
