EnsemForest: Ensemble Forest (EF)
In ellenxtan/ifedtree: Tree-based Federated Learning Approach for Personalized Effect Estimation and Prediction

EnsemForest

R Documentation

Ensemble Forest (EF)

Description

Build an ensemble regression forest and/or predict with it. Two implementations are provided for fitting the forest: one treats each site as a distinct factor (implemented with ranger); the other uses mean encoding for the site index (implemented with grf).
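The two ways of representing the site index can be illustrated with a minimal base R sketch. The toy data frame and the column names `site_factor` and `site_enc` are assumptions made for illustration; they are not taken from the package, and the package's own encoding step may differ in detail.

```r
# Toy augmented data: three sites, one outcome column (illustrative names only).
toy <- data.frame(
  site = c(1, 1, 2, 2, 3, 3),
  Y    = c(1.0, 3.0, 4.0, 6.0, 10.0, 12.0)
)

# Option 1: treat each site as a distinct factor level.
toy$site_factor <- factor(toy$site)

# Option 2: mean encoding, i.e. replace the site index by the site's mean outcome.
site_means <- tapply(toy$Y, toy$site, mean)   # named array: site -> mean(Y)
toy$site_enc <- as.numeric(site_means[as.character(toy$site)])

print(toy)
```

With mean encoding the site enters the forest as a single numeric column, so sites with similar mean outcomes end up close to each other on that axis, whereas the factor representation keeps every site as its own category.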

Usage

EnsemForest(
  coord_id,
  aug_df,
  site,
  covars,
  honest = FALSE,
  is_pred = FALSE,
  is_encode = FALSE,
  myfit = NULL,
  est_leaves = NULL,
  honest_y = NULL,
  site_enc_tab = NULL,
  ...
)

Arguments

`coord_id`	Site index for coordinating site.
`aug_df`	The augmented data frame used to fit an ensemble forest ('data.table').
`site`	Variable name for site indicator.
`covars`	A vector of covariate names used.
`honest`	Whether to use honest splitting (i.e., subsample splitting). Default is FALSE.
`is_pred`	Whether to make predictions with a fitted forest (TRUE) or to build an ensemble forest (FALSE). Default is FALSE.
`is_encode`	Whether to use mean encoding as a surrogate for the site index (TRUE) or to treat each site as a distinct factor (FALSE). Useful when the number of underlying groups is known. Default is FALSE.
`myfit`	A fitted ensemble forest (for prediction purpose). Default is NULL.
`est_leaves`	A matrix with one row per observation in the augmented data and one column per tree, giving the terminal-node assignment in each tree for the honest sample/estimation set (used for honest prediction). Default is NULL. If "honest" is FALSE, "est_leaves" is NULL; if both "is_pred" and "honest" are TRUE, "est_leaves" must not be NULL.
`honest_y`	A vector of honest estimates for the honest sample/estimation set (used for honest prediction). Default is NULL. If "honest" is FALSE, "honest_y" is NULL; if both "is_pred" and "honest" are TRUE, "honest_y" must not be NULL.
`site_enc_tab`	A data.table of the mean outcome for each site. Default is NULL. If both "is_pred" and "is_encode" are TRUE, "site_enc_tab" must not be NULL.
`\dots`	Additional arguments for building the forest.
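To make the roles of "est_leaves" and "honest_y" more concrete, here is a minimal base R sketch of how an honest prediction can be formed from leaf assignments. It assumes that "honest_y" holds one value per estimation-set observation and that the terminal nodes of the test points are available as a matrix `new_leaves`; both `new_leaves` and the helper `honest_predict` are hypothetical names, and this is not presented as the package's internal implementation.

```r
# Hypothetical inputs shaped like the documented arguments.
# est_leaves: (estimation-set observations) x (trees) matrix of terminal-node ids.
est_leaves <- matrix(c(1, 1, 2, 2,    # tree 1
                       1, 2, 2, 1),   # tree 2
                     ncol = 2)
honest_y <- c(2, 4, 6, 8)             # one value per estimation-set observation

# new_leaves: terminal-node ids of the test points in the same trees (assumed given).
new_leaves <- matrix(c(1, 2,    # tree 1
                       2, 1),   # tree 2
                     ncol = 2)

honest_predict <- function(new_leaves, est_leaves, honest_y) {
  n_trees <- ncol(est_leaves)
  per_tree <- sapply(seq_len(n_trees), function(b) {
    # Mean of honest_y within each terminal node of tree b.
    leaf_means <- tapply(honest_y, est_leaves[, b], mean)
    # Look up the leaf mean for each test point; unseen leaves give NA.
    as.numeric(leaf_means[as.character(new_leaves[, b])])
  })
  # Keep a matrix shape even when there is a single test point.
  per_tree <- matrix(per_tree, ncol = n_trees)
  # Average the per-tree predictions, skipping trees with an empty leaf.
  rowMeans(per_tree, na.rm = TRUE)
}

preds <- honest_predict(new_leaves, est_leaves, honest_y)
print(preds)
```

The point of the sketch is that the tree structure and the values averaged within leaves come from different samples, which is why prediction with "honest" set to TRUE needs both "est_leaves" and "honest_y" from the training call.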

Value

Training ("is_pred" is FALSE): returns a fitted ensemble forest and out-of-bag (OOB) predictions for the input data. Prediction ("is_pred" is TRUE): returns predictions for the input data.

Examples

data(SimDataLst)
K <- length(SimDataLst)
covars <- grep("^X", names(SimDataLst[[1]]), value=TRUE)
fit_lst <- list()
for (k in seq_len(K)) {
    tmpdf <- SimDataLst[[k]]
    # use your estimator of interest
    fit_lst[[k]] <- grf::causal_forest(X=as.matrix(tmpdf[, covars, with=FALSE]),
                                       Y=tmpdf$Y, W=tmpdf$Z)
}

coord_id <- 1
coord_test <- GenSimData(coord_id)

coord_df <- SimDataLst[[coord_id]]
aug_df <- GenAugData(coord_id, coord_df, fit_lst, covars)

## Treat each site as a distinct factor
res_ef <- EnsemForest(coord_id, aug_df, "site", covars)
ef_hat <- EnsemForest(coord_id, coord_test, "site", covars, is_pred=TRUE,
            myfit=res_ef$myfit, est_leaves=res_ef$est_leaves, honest_y=res_ef$honest_y)

res_ef <- EnsemForest(coord_id, aug_df, "site", covars, honest=TRUE)
ef_hat <- EnsemForest(coord_id, coord_test, "site", covars, honest=TRUE, is_pred=TRUE,
            myfit=res_ef$myfit, est_leaves=res_ef$est_leaves, honest_y=res_ef$honest_y)

## Mean encoding as surrogate for site index
res_ef <- EnsemForest(coord_id, aug_df, "site", covars, is_encode=TRUE)
ef_hat <- EnsemForest(coord_id, coord_test, "site", covars, is_pred=TRUE, is_encode=TRUE,
            myfit=res_ef$myfit, site_enc_tab=res_ef$site_enc_tab)

res_ef <- EnsemForest(coord_id, aug_df, "site", covars, honest=TRUE, is_encode=TRUE)
ef_hat <- EnsemForest(coord_id, coord_test, "site", covars, honest=TRUE,
            is_pred=TRUE, is_encode=TRUE,
            myfit=res_ef$myfit, site_enc_tab=res_ef$site_enc_tab)
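The prediction calls above must pass different training outputs depending on "honest" and "is_encode". The following sketch encodes only the rules stated under Arguments in a small checker; `check_pred_args` is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper: returns a character vector of problems (empty if none).
# It checks only the constraints stated in the Arguments section.
check_pred_args <- function(is_pred = FALSE, honest = FALSE, is_encode = FALSE,
                            est_leaves = NULL, honest_y = NULL,
                            site_enc_tab = NULL) {
  problems <- character(0)
  if (is_pred && honest) {
    if (is.null(est_leaves)) {
      problems <- c(problems, "est_leaves must not be NULL when is_pred and honest are TRUE")
    }
    if (is.null(honest_y)) {
      problems <- c(problems, "honest_y must not be NULL when is_pred and honest are TRUE")
    }
  }
  if (is_pred && is_encode && is.null(site_enc_tab)) {
    problems <- c(problems, "site_enc_tab must not be NULL when is_pred and is_encode are TRUE")
  }
  problems
}

# Honest prediction without the honest training outputs: two problems reported.
print(check_pred_args(is_pred = TRUE, honest = TRUE))

# Training call: nothing to check.
print(check_pred_args(is_pred = FALSE, honest = TRUE))
```

Running such a check before a prediction call can surface a missing training output early instead of failing inside the forest code.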

ellenxtan/ifedtree documentation built on March 28, 2023, 9:09 a.m.
