tidy_cf: Gather causal forest outputs into a data frame


View source: R/utils.R

Description

Gathers the out-of-bag prediction results from a causal forest into a tidy, tabular format, which makes visualization much easier.

Usage

tidy_cf(fit, preds = NULL)

Arguments

fit

A trained causal forest object from causal_forest.

preds

Out-of-bag training predictions from fit. If omitted, they will be generated, but this will slow down the function significantly.

Details

debiased.error and excess.error partition the overall prediction error into two parts. debiased.error is "irreducible" in the sense that it cannot be made smaller by increasing the number of trees in the forest; excess.error, by contrast, can. The grf authors recommend growing enough trees that excess.error becomes negligible.
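As a rough illustration of this partition, the sketch below computes the share of the total estimated error that comes from excess.error. It assumes a data frame with debiased.error and excess.error columns like the one tidy_cf returns. The helper name excess_share and the toy values are hypothetical and are not part of cfeval or grf.

```r
# Hypothetical helper (not part of cfeval or grf): share of the total
# estimated error that is attributable to excess.error.
excess_share <- function(df) {
  total <- mean(df$debiased.error) + mean(df$excess.error)
  mean(df$excess.error) / total
}

# Toy values standing in for real tidy_cf() output.
toy <- data.frame(
  debiased.error = c(0.9, 1.1, 1.0),
  excess.error   = c(0.01, 0.03, 0.02)
)

share <- excess_share(toy)
```

A share that stays large after refitting is one possible sign that the forest would benefit from more trees. What counts as "negligible" is a judgment call that this sketch does not settle.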

Value

A tibble containing the following columns:

W

The original treatment assignments.

W.hat

The estimated treatment propensities: W.hat = E[W | X].

Y

The original outcome variable.

Y.hat

The expected response estimates, marginalized over treatment: Y.hat = E[Y | X].

treatment

The treatment assignments as a factor, "Control" or "Treated". This looks better in plots than W does.

cate

The conditional average treatment effect (CATE) estimates.

cate.se

The standard errors of the CATEs.

debiased.error

An estimate of the error obtained if the forest had an infinite number of trees.

excess.error

A jackknife estimate of how unstable the estimates would be if forests of the same size were repeatedly grown on the same data set.

IPW

The inverse propensity weights: 1 / W.hat if W = 1, 1 / (1 - W.hat) otherwise.

bias

A measure of each observation's contribution to the overall bias of the model, relative to a simple difference in means.
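The treatment and IPW columns follow directly from W and W.hat as defined above. The sketch below reproduces those two definitions in base R on made-up inputs. It illustrates the formulas and is not necessarily how tidy_cf computes them internally; the factor level order is an assumption.

```r
# Toy inputs standing in for a forest's W and W.hat (values are made up).
W     <- c(1, 0, 1, 0)
W.hat <- c(0.5, 0.25, 0.8, 0.6)

# Factor version of W with the labels described for the treatment column.
# The level order (Control first) is an assumption.
treatment <- factor(W, levels = c(0, 1), labels = c("Control", "Treated"))

# Inverse propensity weights: 1 / W.hat if W = 1, 1 / (1 - W.hat) otherwise.
IPW <- ifelse(W == 1, 1 / W.hat, 1 / (1 - W.hat))
```

Observations whose W.hat is close to 0 or 1 receive very large weights under this formula, so the IPW column can be a quick way to spot them.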

See Also

https://grf-labs.github.io/grf/articles/diagnostics.html#assessing-fit for a discussion of the bias measure and how it is calculated.

Examples

## Not run: 
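 library(grf)

 # Simulate n observations with p covariates
 n <- 2000; p <- 10

 X <- matrix(rnorm(n * p), n, p)
 # Treatment propensity is 0.6 when X[, 1] > 0 and 0.4 otherwise
 W <- rbinom(n, 1, 0.4 + 0.2 * (X[, 1] > 0))
 # The true treatment effect is pmax(X[, 1], 0)
 Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)
 cf <- causal_forest(X, Y, W)
 # Compute out-of-bag predictions once so tidy_cf does not regenerate them
 preds <- predict(cf, estimate.variance = TRUE)

 tidy_cf(cf, preds)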

## End(Not run)
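One possible follow-up, sketched under assumptions: because the result carries both cate and cate.se, approximate pointwise 95% intervals can be added with a normal approximation. The data frame res below is a made-up stand-in for tidy_cf output, and the lower and upper columns are hypothetical names that tidy_cf does not return.

```r
# Toy stand-in for tidy_cf() output; values are made up.
res <- data.frame(
  cate    = c(0.2, 0.5, -0.1),
  cate.se = c(0.1, 0.2, 0.05)
)

# Two-sided 95% quantile of the standard normal distribution.
z <- qnorm(0.975)

# Hypothetical interval columns (not returned by tidy_cf).
res$lower <- res$cate - z * res$cate.se
res$upper <- res$cate + z * res$cate.se
```

These intervals are only as reliable as the normal approximation and the standard error estimates behind them.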

ensley-nexant/cfeval documentation built on May 20, 2020, 12:34 a.m.