filter_x: Flexible filtering of matrix/data.frames

View source: R/filter_x.R

filter_x    R Documentation

Flexible filtering of matrix/data.frames

Description

Filter the rows of a matrix or data.frame based on the sum of values, the number of zeros, or the number of NAs in each row. Filtering can optionally be restricted to certain columns. Multiple filtering steps can also be performed on different sets of columns in a single call, because the function is vectorised over op, value, and pattern (see Examples).
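As a rough illustration of what the three methods measure, the per-row quantities can be computed in base R. This is only a sketch, not the package's implementation: the variable names (n_na, n_zero, total, compare, keep) are hypothetical, and ignoring NAs when summing (na.rm = TRUE) is an assumption that may not match how filter_x treats them.

```r
mat <- matrix(c(NA, 1:10, 0), nrow = 4, ncol = 3,
              dimnames = list(NULL, c("sample1", "sample2", "sample3")))

# Per-row quantities corresponding to the three 'x' methods
n_na   <- rowSums(is.na(mat))              # number of NAs in each row
n_zero <- rowSums(mat == 0, na.rm = TRUE)  # number of zeros in each row
total  <- rowSums(mat, na.rm = TRUE)       # row sum (assumption: NAs ignored)

# An operator supplied as a string can be looked up with match.fun()
compare <- match.fun(">=")
keep <- compare(n_na, 1)                   # rows with at least one NA

# drop = FALSE keeps the result a matrix even if only one row remains
mat[keep, , drop = FALSE]
```

With this data only the first row contains an NA, so the final subset holds that single row.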

Usage

filter_x(
  data,
  x = c("na", "zero", "sum"),
  op = c("==", "!=", "<=", ">=", "<", ">"),
  value,
  pattern,
  ...,
  setop
)

Arguments

data

data.frame or matrix.

x

string giving the filtering method: one of "na", "zero", or "sum".

op

string or character vector giving the operator(s) used to compare the row sums to the desired value. Possible values: "==", "!=", "<=", ">=", "<", ">".

value

numeric value or vector to compare the row sums against.

pattern

Optional string or character vector containing regular expression(s) to match column names via grep.

...

Other arguments to be passed into grep.

setop

Optional string indicating how to combine the rows returned by multiple filtering steps: "&", "|", or "xor" (see Examples). This must be specified when op, value, or pattern is given as a vector of several values.
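The interplay of pattern, value, and setop can be sketched in base R as follows. This is an illustration under assumptions rather than the function's actual code: masks and combine are hypothetical names, and NAs are assumed to be ignored when summing. Each pattern selects columns via grep, each step yields one logical value per row, and the steps are then combined with the chosen operator.

```r
mat <- matrix(c(NA, 1:10, 0), nrow = 4, ncol = 3,
              dimnames = list(NULL, c("sample1", "sample2", "sample3")))

patterns <- c("sample[1-2]", "sample[2-3]")
values   <- c(4, 12)

# One logical row mask per filtering step
masks <- Map(function(pattern, value) {
  cols <- grep(pattern, colnames(mat))  # columns whose names match the regex
  rowSums(mat[, cols, drop = FALSE], na.rm = TRUE) > value  # assumption: NAs ignored
}, patterns, values)

# Combine the per-step masks with the operator named by 'setop'
combine <- function(masks, setop) Reduce(match.fun(setop), masks)

which(combine(masks, "&"))    # rows passing both steps: 2 3
which(combine(masks, "|"))    # rows passing at least one step: 2 3 4
which(combine(masks, "xor"))  # rows passing exactly one step: 4
```

Under these assumptions, "&" keeps rows that satisfy every step, "|" keeps rows that satisfy any step, and "xor" keeps rows that satisfy exactly one of the two steps.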

Value

Returns data containing only the rows matching the specified condition(s).

Examples

mat <- matrix(c(NA, 1:10, 0), nrow = 4, ncol = 3,
              dimnames = list(NULL, c("sample1", "sample2", "sample3")))

df <- data.frame(mat)

# works with data.frame or matrix
filter_x(
  data = mat,
  x = "na",
  op = "==",
  value = 0
)
filter_x(
  data = df,
  x = "na",
  op = "==",
  value = 0
)

# filter based on the number of NAs, the row sum, or the number of zeros
filter_x(mat, "na", ">=", 1)
filter_x(mat, "sum", ">", 5)
filter_x(mat, "zero", "==", 1)

# perform multiple filtering steps in one call using column name pattern
# matching; the results are combined with 'setop'
# ('setop' can be "&", "|", or "xor", corresponding to AND, OR, and
# SYMMETRIC DIFFERENCE)
# (multiple 'op', 'value', and 'pattern' values can be supplied as vectors)
filter_x(mat, "sum", ">", c(4, 12), c("sample[1-2]", "sample[2-3]"), setop = "&")
filter_x(mat, "sum", ">", c(4, 12), c("sample[1-2]", "sample[2-3]"), setop = "|")
filter_x(mat, "sum", ">", c(4, 12), c("sample[1-2]", "sample[2-3]"), setop = "xor")


csdaw/csdmisc documentation built on April 26, 2022, 5:39 a.m.