In the course of an analysis, many causal forest models may need to be
created, evaluated, and compared to one another. The cfeval
package
aims to streamline that process by providing utility and plotting
functions to perform repetitive tasks common to all of these models.
cfeval
requires the grf
package
to function, and is heavily dependent on its API.
You can install the latest version of cfeval from Github with
devtools::install_github('ensley-nexant/cfeval')
or, if you have located the path to the source tarball, with
install.packages('E:\Path\To\Package\cfeval_x.x.x.tar.gz', repos = NULL, type = 'source')
First create an example dataset and fit a causal forest.
library(cfeval)
n <- 2000; p <- 10
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.4 + 0.2 * (X[, 1] > 0))
Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)
cf <- grf::causal_forest(X, Y, W)
Next, create a causal forest evaluation object, passing the trained forest and the original training dataset.
cfe <- cf_eval(cf, X)
The cf_eval
object contains three elements:
dat
, the original training dataset;res
, summarized results from out-of-bag training predictions;varimp
, all covariates used in the forest along with their
relative importances.Note that the trained forest cf
, which can be several gigabytes in
size even for moderately sized problems, is not copied or carried around
by cf_eval
. This improves performance. For actual work, I recommend
compressing the forest and writing it to disk with saveRDS()
, then
loading it for evaluation when required.
One of the many plotting convenience functions offered by cfeval
is to
view the distribution of estimated conditional average treatment
effects.
plot(cfe, kind = 'cate')
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.