chainsubset: Chain subset conditions

chainsubset R Documentation

Chain subset conditions

Description

Chain subset conditions

Usage

chainsubset(..., out.vars)

Arguments

...

Logical conditions to be chained.

out.vars

character. Names of variables that are used in the conditions but are not columns of the data frame. Only needed if the conditions refer to such variables. If out.vars is not specified, it is assumed to match all variables starting with a dot ('.').

Details

A set of logical conditions is chained, not and'ed. That is, each argument to chainsubset is used as a filter to create a smaller dataset, and each subsequent argument filters that dataset further. For independent conditions this is the same as and'ing them; e.g., chainsubset(x < 0, y < 0) yields the same subset as (x < 0) & (y < 0). However, for aggregate filters like chainsubset(x < mean(y), x > mean(y)), we first find all observations with x < mean(y), then among these we find the ones with x > mean(y). The second mean(y) is computed only over the observations that passed the first filter, so it is conditional on x < mean(y).
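The difference between chaining and and'ing can be sketched in base R without lfe. The helper chain_mask() below is hypothetical and written only for illustration; it is not how chainsubset is implemented. It assumes each condition evaluates to a logical vector over the rows that are still kept.

```r
# Hypothetical helper (not part of lfe): apply each condition in turn,
# evaluating it only on the rows that survived the earlier conditions.
chain_mask <- function(data, ...) {
  conds <- as.list(substitute(list(...)))[-1]  # unevaluated conditions
  env <- parent.frame()                        # where non-column names are looked up
  keep <- rep(TRUE, nrow(data))
  for (cond in conds) {
    # Aggregates such as mean(y) see only the surviving rows.
    ok <- eval(cond, data[keep, , drop = FALSE], env)
    keep[keep] <- ok & !is.na(ok)
  }
  keep
}

dat <- data.frame(x = c(1, 2, 3, 4), y = c(1, 2, 3, 10))

# Chained: x < mean(y) keeps rows 1-3; mean(y) over those rows is 2,
# so x > mean(y) then keeps row 3 only.
chained <- chain_mask(dat, x < mean(y), x > mean(y))

# And'ed: both conditions use the overall mean(y), so no row can pass.
anded <- with(dat, (x < mean(y)) & (x > mean(y)))

# Independent conditions give the same mask either way.
indep_chained <- chain_mask(dat, x < 3, y < 3)
indep_anded <- with(dat, (x < 3) & (y < 3))
```

In this toy data the chained mask selects one row while the and'ed mask selects none, and the two independent-condition masks are identical.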

Value

Expression that can be eval'ed to yield a logical subset mask.

Note

Some trickery is done to make this work directly in the subset argument of functions like felm() and lm(). It may fail with an error message in some situations. If this happens, do it in two steps: ss <- eval(chainsubset(...), data); lm(..., data = data, subset = ss). In particular, the arguments are taken literally: constructions like function(...) {chainsubset(...)} or a <- quote(x < y); chainsubset(a) do not work, but do.call(chainsubset, list(a)) does.
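The two-step fallback described in this note might look like the following sketch. The data frame and the conditions are made up for illustration, and the code is guarded so that it does nothing if lfe is not installed.

```r
set.seed(1)
N <- 10000
dat <- data.frame(y = rnorm(N), x = rnorm(N))

if (requireNamespace("lfe", quietly = TRUE)) {
  library(lfe)
  # Step 1: evaluate the chained conditions in the data frame.
  ss <- eval(chainsubset(x < mean(y), y < 2 * mean(x)), dat)
  # Step 2: pass the result as an ordinary subset argument.
  fit <- lm(y ~ x, data = dat, subset = ss)
  print(coef(fit))
}
```

Precomputing ss this way keeps chainsubset out of the model call itself, which is the situation the note identifies as fragile.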

Examples

library(lfe)

set.seed(48)
N <- 10000
dat <- data.frame(y = rnorm(N), x = rnorm(N))
# Chaining is not the same as and'ing the conditions, and the order matters:
felm(y ~ x, data = dat, subset = chainsubset(x < mean(y), y < 2 * mean(x)))
felm(y ~ x, data = dat, subset = chainsubset(y < 2 * mean(x), x < mean(y)))
# The same conditions and'ed, for comparison:
felm(y ~ x, data = dat, subset = (x < mean(y)) & (y < 2 * mean(x)))
# Chained aggregate filters: the second mean(y) is conditional on x < mean(y).
lm(y ~ x, data = dat, subset = chainsubset(x < mean(y), x > mean(y)))

lfe documentation built on Feb. 16, 2023, 7:32 p.m.