het_vimp | R Documentation |
cjbart Object
Estimates random forest variable importance scores for multiple attribute-levels of a conjoint experiment.
het_vimp(imces, levels = NULL, covars = NULL, cores = 1, ...)
imces |
Object of class |
levels |
An optional vector of attribute-levels to generate importance metrics for. By default, all attribute-levels are analyzed. |
covars |
An optional vector of covariates to include in the importance metric check. By default, all covariates are included in each importance model. |
cores |
Number of CPU cores used during VIMP estimation. Each extra core will result in greater memory consumption. Assigning more cores than outcomes will not further boost performance. |
... |
Extra arguments (used to check for deprecated argument names) |
Having generated a schedule of individual-level marginal component effect estimates, this function fits a random forest model for each attribute-level using the supplied covariates as predictors. It then calculates a variable importance measure (VIMP) for each covariate. The VIMP method assesses how important each covariate is in terms of partitioning the predicted individual-level effects distribution, and can thus be used as an indicator of which variables drive heterogeneity in the IMCEs.
To recover a VIMP measure, this function uses permutation-based importance metrics from random forest models estimated using randomForestSRC::rfsrc(). To permute the data, it uses random node assignment, whereby cases are randomly assigned to a daughter node whenever a tree splits on the target variable (see Ishwaran et al. 2008). Importance is defined in terms of how much random node assignment degrades the performance of the forest. Higher degradation indicates that a variable is more important to prediction.
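The idea of fitting one forest per attribute-level and reading off random-node-assignment importance can be sketched as follows. This is a minimal illustration on simulated data, not the internals of het_vimp(): the covariate names, the IMCE columns `level_a2` and `level_b2`, and the object `vimp_long` are all hypothetical, and the only library call assumed is randomForestSRC::rfsrc() with `importance = "random"`.

```r
# Illustrative sketch only: simulated data; all column names are hypothetical.
if (requireNamespace("randomForestSRC", quietly = TRUE)) {
  set.seed(42)
  n <- 300

  # Hypothetical respondent covariates
  covars <- data.frame(age = rnorm(n), income = rnorm(n), noise = rnorm(n))

  # Hypothetical IMCEs for two attribute-levels, each driven by one covariate
  imces <- data.frame(
    level_a2 = 0.8 * covars$age + rnorm(n, sd = 0.1),
    level_b2 = 0.8 * covars$income + rnorm(n, sd = 0.1)
  )

  # One forest per attribute-level, using random node assignment for VIMP
  vimp_long <- do.call(rbind, lapply(names(imces), function(lvl) {
    dat <- cbind(covars, imce = imces[[lvl]])
    fit <- randomForestSRC::rfsrc(imce ~ ., data = dat, ntree = 200,
                                  importance = "random")
    data.frame(level = lvl,
               covar = fit$xvar.names,
               importance = as.numeric(fit$importance),
               row.names = NULL)
  }))

  # "Long" layout: one row per covariate and attribute-level combination
  print(vimp_long)
}
```

In this simulated setup, `age` would be expected to score highest for `level_a2` and `income` for `level_b2`, since those are the only covariates that enter each simulated effect.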
Variance estimates of each variable's importance are subsequently recovered using the delete-d jackknife estimator developed by Ishwaran and Lu (2019). The jackknife method has inherent bias correction properties, making it particularly effective for variable selection exercises such as identifying drivers of heterogeneity.
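To make the delete-d jackknife concrete, the sketch below applies one common form of the estimator to a generic statistic in base R. It is an assumption-laden illustration rather than the code used by randomForestSRC::subsample(): the function `delete_d_jackknife_var()`, its arguments, and the choice of the sample mean as the statistic are all hypothetical. Subsamples of size b = n - d are drawn without replacement, squared deviations are centred on the full-sample estimate, and the average is rescaled by b / (n - b).

```r
# Delete-d jackknife variance sketch for a generic statistic (base R only).
# `stat` is any function mapping a numeric vector to a single number.
delete_d_jackknife_var <- function(x, stat, b, K = 2000) {
  n <- length(x)
  stopifnot(b >= 1, b < n)
  theta_full <- stat(x)  # estimate on the full sample
  # K random subsamples of size b = n - d, drawn without replacement
  theta_sub <- replicate(K, stat(x[sample.int(n, b)]))
  # Centre on the full-sample estimate and rescale by b / (n - b)
  (b / (n - b)) * mean((theta_sub - theta_full)^2)
}

set.seed(1)
x <- rnorm(200)
v_jk <- delete_d_jackknife_var(x, mean, b = 150)
se_jk <- sqrt(v_jk)

# Normal-approximation 95% interval around the full-sample estimate
ci <- mean(x) + c(-1, 1) * qnorm(0.975) * se_jk
```

For the sample mean, this estimate should land close to the textbook value `var(x) / length(x)`, which gives a quick sanity check on the scaling factor.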
A "long" data.frame of variable importance scores for each combination of covariates and attribute-levels, as well as the estimated 95% confidence intervals for each metric.
randomForestSRC::rfsrc() and randomForestSRC::subsample()