tidy_cf: Gather causal forest outputs into a data frame
In ensley-nexant/cfeval: Causal Forest Evaluation and Visualization


Having the out-of-bag prediction results in a tidy, tabular format makes visualization much easier.

tidy_cf(fit, preds = NULL)

`fit`	A trained causal forest object from `causal_forest`.
`preds`	Out-of-bag training predictions from `fit`. If omitted, they will be generated, but this slows the function down significantly.

debiased.error and excess.error partition the overall prediction error into two parts. debiased.error is "irreducible" in the sense that it cannot be made smaller by increasing the number of trees in the forest; excess.error can. The grf authors recommend growing enough trees that excess.error becomes negligible.
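One possible way to check whether excess.error is negligible is to compare its average against the average of debiased.error. The sketch below uses a hypothetical data frame `res` with made-up values as a stand-in for the output of tidy_cf, and the summary `excess_share` is an illustrative choice, not something the package computes.

```r
# Hypothetical stand-in for tidy_cf() output; the values are made up for illustration.
res <- data.frame(
  debiased.error = c(1.20, 0.85, 2.10, 0.40),
  excess.error   = c(0.02, 0.01, 0.05, 0.01)
)

# Average of each error component across observations.
mean_debiased <- mean(res$debiased.error)
mean_excess   <- mean(res$excess.error)

# Share of the total estimated error that comes from the finite number of trees.
excess_share <- mean_excess / (mean_debiased + mean_excess)
excess_share
```

If this share is not small, refitting with a larger `num.trees` in `causal_forest` should reduce it, while debiased.error is expected to stay roughly the same.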

A tibble containing the following columns:

W: The original treatment assignments.
W.hat: The estimated treatment propensities: W.hat = E[W | X].
Y: The original outcome variable.
Y.hat: The expected response estimates, marginalized over treatment: Y.hat = E[Y | X].
treatment: The treatment assignments as a factor, "Control" or "Treated". This looks better in plots than W does.
cate: The conditional average treatment effect (CATE) estimates.
cate.se: The standard errors of the CATEs.
debiased.error: An estimate of the error obtained if the forest had an infinite number of trees.
excess.error: A jackknife estimate of how unstable the estimates are if forests of the same size were repeatedly grown on the same data set.
IPW: The inverse propensity weights: 1 / W.hat if W = 1, 1 / (1 - W.hat) otherwise.
bias: A measure of each observation's contribution to the overall bias of the model, relative to a simple difference in means.
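The treatment and IPW columns follow directly from W and W.hat as defined above. A minimal base R sketch, using toy vectors rather than real forest output (how tidy_cf builds these columns internally is not shown here and may differ):

```r
# Toy values for illustration only; not output from a real forest.
W     <- c(1, 0, 1, 0)
W.hat <- c(0.8, 0.25, 0.5, 0.6)

# Inverse propensity weights: 1 / W.hat for treated units,
# 1 / (1 - W.hat) for control units.
IPW <- ifelse(W == 1, 1 / W.hat, 1 / (1 - W.hat))

# Treatment assignment as a labelled factor, with "Control" as the first level.
treatment <- factor(W, levels = c(0, 1), labels = c("Control", "Treated"))

data.frame(W, W.hat, treatment, IPW)
```

Observations whose propensity is close to 0 or 1 get very large weights, so a quick look at the range of IPW can flag overlap problems.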

See https://grf-labs.github.io/grf/articles/diagnostics.html#assessing-fit for a discussion of the bias measure and how it is calculated.
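As a rough sketch of that calculation, the snippet below follows the formula as the linked grf article appears to describe it. Whether tidy_cf computes its bias column in exactly this way, including any scaling, is an assumption. The input vectors are toy values, and `Y.hat.0` and `Y.hat.1` are illustrative names for the implied control and treated outcome estimates.

```r
# Toy values for illustration only; not output from a real forest.
W     <- c(1, 0, 1, 0, 1)            # treatment assignments
W.hat <- c(0.7, 0.3, 0.5, 0.4, 0.6)  # estimated propensities
Y.hat <- c(1.2, 0.4, 0.9, 0.1, 1.5)  # estimated E[Y | X]
cate  <- c(0.5, 0.2, 0.4, 0.1, 0.6)  # estimated treatment effects

# Overall share of treated units.
p <- mean(W)

# Implied outcome estimates under control and under treatment.
Y.hat.0 <- Y.hat - W.hat * cate
Y.hat.1 <- Y.hat + (1 - W.hat) * cate

# Per-observation contribution to the bias of a simple difference in means
# (assumed form, taken from the grf diagnostics article).
bias <- (W.hat - p) *
  (p * (Y.hat.0 - mean(Y.hat.0)) + (1 - p) * (Y.hat.1 - mean(Y.hat.1)))
bias
```

Under this form, an observation whose propensity equals the overall treated share contributes no bias, and the article scales the values by the standard deviation of Y before plotting them.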

## Not run: 
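 library(grf)

 # Simulate n observations with p covariates
 n <- 2000; p <- 10

 X <- matrix(rnorm(n * p), n, p)
 # Treatment propensity depends on the sign of the first covariate
 W <- rbinom(n, 1, 0.4 + 0.2 * (X[, 1] > 0))
 # The treatment effect is pmax(X[, 1], 0)
 Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)
 cf <- causal_forest(X, Y, W)
 # Out-of-bag predictions, including variance estimates
 preds <- predict(cf, estimate.variance = TRUE)

 tidy_cf(cf, preds)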

## End(Not run)