filter_table: Table rows filtering


View source: R/filter_table.R

Description

Return a logical vector for filtering the table's rows following these steps:

  1. Calculate summary values for the rows, either for the whole row or for each group;

  2. Apply 'test' to the summarized values, comparing them to reference value(s) or using a custom function;

  3. If test was applied by group, reduce the logical result by row-wise all, any or count.
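
The three steps above can be sketched in base R on a small matrix. This is only an illustration with made-up toy data (the matrix x and the factor group below are assumptions), not the package's actual implementation.

```r
# Toy matrix: 3 rows, 4 columns; columns 1-2 belong to group "a", 3-4 to "b".
x <- matrix(c( 1,  2,  3,  4,
              -1, -2,  5,  6,
              -1, -2, -3, -4),
            nrow = 3, byrow = TRUE)
group <- factor(c("a", "a", "b", "b"))

# Step 1: summarize each row within each group (here: the mean).
group_means <- sapply(levels(group), function(g) {
  rowMeans(x[, group == g, drop = FALSE])
})

# Step 2: apply the test to the summarized values (here: > 0).
passed <- group_means > 0

# Step 3: reduce the per-group results row-wise.
keep_any <- apply(passed, 1, any)  # passed in at least one group
keep_all <- apply(passed, 1, all)  # passed in every group
n_passed <- rowSums(passed)        # number of groups that passed
```

Here keep_any and keep_all are logical vectors with one element per row, while n_passed is a count per row. How filter_table turns the count into a logical result is not described on this page.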

Usage

filter_table(x, summary = c("mean", "median", "sd", "max", "min", "sum",
  "prod", "all", "any", "count"), test = c("==", "!=", ">", "<", ">=", "<="),
  ..., MoreArgs = NULL, method = c("row", "all", "any", "count"),
  group = NULL, na.rm = FALSE, drop = NULL)

Arguments

x

A matrix, data frame or data table.

summary

An expression with the summary method and the logical comparison, or the name of a summary method. See Details. Defaults to "sum".

test

A single character string representing a logical comparison, or a function that will be called with the summary vector as the first argument. Defaults to "==".

...

A value to be compared to the summarized variables, or further arguments passed to the test function. If testing by group, it should be a single value or a vector with one element per group. In that case, test is applied using mapply, so extra arguments to a custom function may be passed through MoreArgs.

MoreArgs

A list of arguments passed to test by mapply.

method

Method to obtain logical test results by row. Defaults to "row".

group

A ‘factor’ in the sense that as.factor(group) defines the grouping, or a list of such factors, in which case their interaction is used for the grouping.

na.rm

If TRUE, NAs are excluded first, otherwise not.

drop

Character vector of column names to be ignored.
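
The descriptions of ... and MoreArgs above mention that a grouped test is applied with mapply. A minimal base R sketch of that calling pattern follows; the per-group summaries, the reference values and the test function above_by are hypothetical and only show how mapply pairs each group with its own reference value while MoreArgs supplies arguments shared by all groups.

```r
# Hypothetical per-group summaries: one value per row, one vector per group.
summaries <- list(a = c(1.5, -1.5, 0.2),
                  b = c(3.5,  5.5, -3.5))

# One reference value per group, matched to the groups by position.
reference <- c(1, 4)

# Hypothetical custom test with an extra argument shared by all groups.
above_by <- function(values, ref, margin) values > ref + margin

# mapply calls above_by once per group; MoreArgs is passed unchanged each time.
result <- mapply(above_by, summaries, reference,
                 MoreArgs = list(margin = 0.25))
```

The result is a logical matrix with one row per table row and one column per group, which is the shape that a row-wise all, any or count reduction expects.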

Details

This function can be called in two different ways. The first and simpler one is using an expression:

filter_table(x, mean > 0) # TRUE if the row mean is greater than 0

The second, more flexible form, is using the individual parameters:

filter_table(x, summary='mean', test='>', 0) # argument names can be omitted
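
To illustrate how the two forms relate, the sketch below shows one way an expression such as mean > 0 could be split into a summary name, a test and a reference value using substitute. The helper split_comparison is hypothetical and is not claimed to be how filter_table works internally.

```r
# Hypothetical helper: split an unevaluated comparison into its parts.
split_comparison <- function(expr) {
  cl <- substitute(expr)                       # e.g. the call `mean > 0`
  list(test    = as.character(cl[[1]]),        # ">"
       summary = as.character(cl[[2]]),        # "mean"
       value   = eval(cl[[3]], parent.frame()))
}

parts <- split_comparison(mean > 0)

# Apply the recovered parts to a toy matrix, row by row.
x <- matrix(c(1, 2, -3, -4), nrow = 2, byrow = TRUE)
row_summary <- apply(x, 1, match.fun(parts$summary))
keep <- match.fun(parts$test)(row_summary, parts$value)
```

For this toy matrix, keep gives the same result as rowMeans(x) > 0.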

Value

A logical vector indicating if each row passed the test.
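
A logical vector of this kind is typically used to index rows. The short sketch below uses a hand-written vector keep as a stand-in for a filter_table result (the matrix and its names are made up) and shows why drop = FALSE can matter when only one row passes.

```r
x <- matrix(1:6, nrow = 3,
            dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Stand-in for the logical vector returned by the filter.
keep <- c(FALSE, TRUE, FALSE)

sum(keep)                         # number of rows that passed

kept    <- x[keep, , drop = FALSE]  # stays a 1 x 2 matrix
dropped <- x[keep, ]                # collapses to a named vector
```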

Examples

data(geneData, geneCovariate, package='Biobase')
gData <- head(geneData, n=100)

# Select if standard deviation of log expression level is greater than 0.5.
variant <- filter_table(log2(gData), sd > 0.5, na.rm=TRUE)

# Consider expressed if the median value is positive in at least one group.
expressed <- filter_table(gData, median > 0, method='any',
                          group=geneCovariate[c('sex', 'type')])

filtered <- gData[variant & expressed, ]
nrow(filtered)

csbl-usp/collapseR documentation built on May 6, 2019, 8:32 p.m.