Collection of Filter Rules

Description

A FilterRules object is a collection of filter rules, which can be either expression or function objects. Rules can be disabled/enabled individually, facilitating experimenting with different combinations of filters.

Details

It is common to split a dataset into subsets during data analysis. When data is large, however, representing subsets (e.g. by logical vectors) and storing them as copies might become too costly in terms of space. The FilterRules class represents subsets as lightweight expression and/or function objects. Subsets can then be calculated when needed (on the fly). This avoids copying and storing a large number of subsets. Although it might take longer to frequently recalculate a subset, it often is a relatively fast operation and the space savings tend to be more than worth it when data is large.

Rules may be either expressions or functions. Evaluating an expression or invoking a function should result in a logical vector. Expressions are often more convenient, but functions (i.e. closures) are generally safer and more powerful, because the user can specify the enclosing environment. If a rule is an expression, it is evaluated inside the envir argument to the eval method (see below). If a function, it is invoked with envir as its only argument. See examples.

Accessor methods

In the code snippets below, x is a FilterRules object.

active(x): Get the logical vector of length length(x), where TRUE for an element indicates that the corresponding rule in x is active (and inactive otherwise). Note that names(active(x)) is equal to names(x).

active(x) <- value: Replace the active state of the filter rules. If value is a logical vector, it should be of length length(x) and indicate which rules are active. Otherwise, it can be either numeric or character vector, in which case it sets the indicated rules (after dropping NA's) to active and all others to inactive. See examples.

Constructor

FilterRules(exprs = list(), ..., active = TRUE): Constructs a FilterRules with the rules given in the list exprs or in .... The initial active state of the rules is given by active, which is recycled as necessary. Elements in exprs may be either character (parsed into an expression), a language object (coerced to an expression), an expression, or a function that takes at least one argument. IMPORTANTLY, all arguments in ... are quote()'d and then coerced to an expression. So, for example, character data is only parsed if it is a literal. The names of the filters are taken from the names of exprs and ..., if given. Otherwise, the character vectors take themselves as their name and the others are deparsed (before any coercion). Thus, it is recommended to always specify meaningful names. In any case, the names are made valid and unique.

Subsetting and Replacement

In the code snippets below, x is a FilterRules object.

x[i]: Subsets the filter rules using the same interface as for Vector.

x[[i]]: Extracts an expression or function via the same interface as for List.

x[[i]] <- value: The same interface as for List. The default active state for new rules is TRUE.

Combining

In the code snippets below, x is a FilterRules object.

append(x, values, after = length(x)): Appends the values FilterRules instance onto x at the index given by after.

c(x, ..., recursive = FALSE): Concatenates the FilterRule instances in ... onto the end of x.

Evaluating

eval(expr, envir = parent.frame(), enclos = if (is.list(envir) || is.pairlist(envir)) parent.frame() else baseenv()): Evaluates a FilterRules instance (passed as the expr argument). Expression rules are evaluated in envir, while function rules are invoked with envir as their only argument. The evaluation of a rule should yield a logical vector. The results from the rule evaluations are combined via the AND operation (i.e. &) so that a single logical vector is returned from eval.

evalSeparately(expr, envir = parent.frame(), enclos = if (is.list(envir) || is.pairlist(envir)) parent.frame() else baseenv()): Evaluates separately each rule in a FilterRules instance (passed as the expr argument). Expression rules are evaluated in envir, while function rules are invoked with envir as their only argument. The evaluation of a rule should yield a logical vector. The results from the rule evaluations are combined into a logical matrix, with a column for each rule. This is essentially the parallel evaluator, while eval is the serial evaluator.

subsetByFilter(x, filter): Evaluates filter on x and uses the result to subset x. The result contains only the elements in x for which filter evaluates to TRUE.

summary(object, subject): Returns an integer vector with the number of elements in subject that pass each rule in object, along with a count of the elements that pass all filters.

Filter Closures

When a closure (function) is included as a filter in a FilterRules object, it is converted to a FilterClosure, which is currently nothing more than a marker class that extends function. When a FilterClosure filter is extracted, there are some accessors and utilities for manipulating it:

params: Gets a named list of the objects that are present in the enclosing environment (without inheritance). This assumes that a filter is constructed via a constructor function, and the objects in the frame of the constructor (typically, the formal arguments) are the parameters of the filter.

Author(s)

Michael Lawrence

See Also

FilterMatrix objects for storing the logical output of a set of FilterRules objects.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
## constructing a FilterRules instance

## an empty set of filters
filters <- FilterRules()
  
## as a simple character vector
filts <- c("peaks", "promoters")
filters <- FilterRules(filts)
active(filters) # all TRUE

## with functions and expressions
filts <- list(peaks = expression(peaks), promoters = expression(promoters),
              find_eboxes = function(rd) rep(FALSE, nrow(rd)))
filters <- FilterRules(filts, active = FALSE)
active(filters) # all FALSE

## direct, quoted args (character literal parsed)
filters <- FilterRules(under_peaks = peaks, in_promoters = "promoters")
filts <- list(under_peaks = expression(peaks),
              in_promoters = expression(promoters))

## specify both exprs and additional args
filters <- FilterRules(filts, diffexp = de)

filts <- c("promoters", "peaks", "introns")
filters <- FilterRules(filts)

## evaluation
df <- DataFrame(peaks = c(TRUE, TRUE, FALSE, FALSE),
                promoters = c(TRUE, FALSE, FALSE, TRUE),
                introns = c(TRUE, FALSE, FALSE, FALSE))
eval(filters, df)
fm <- evalSeparately(filters, df)
identical(filterRules(fm), filters)
summary(fm)
summary(fm, percent = TRUE)
fm <- evalSeparately(filters, df, serial = TRUE)

## set the active state directly
  
active(filters) <- FALSE # all FALSE
active(filters) <- TRUE # all TRUE
active(filters) <- c(FALSE, FALSE, TRUE)
active(filters)["promoters"] <- TRUE # use a filter name
  
## toggle the active state by name or index
  
active(filters) <- c(NA, 2) # NA's are dropped
active(filters) <- c("peaks", NA)