View source: R/optimal_ngroups.R
optimal_ngroups | R Documentation |
Determine the optimal number of groups for a feature.
optimal_ngroups(
  pd,
  lambda,
  max_ngrps = 15,
  search_grid = seq_len(min(length(unique(pd$y)), max_ngrps))
)
pd | Data frame containing the partial dependence effect, for example the output of get_pd as in the example below. |
lambda | The complexity parameter in the penalized loss function (see the accompanying research paper or R vignette for details). |
max_ngrps | Integer specifying the maximum number of groups into which each feature's values/levels may be grouped. |
search_grid | Integer vector containing the candidate numbers of groups to evaluate. By default, this runs from 1 up to the number of unique values in pd$y, capped at max_ngrps. |
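To make the default for search_grid concrete, the following minimal sketch evaluates the default expression from the usage on a toy data frame. The toy values and the column name x are assumptions made for illustration; only the y column is taken from the usage above.

```r
# Toy partial dependence data frame (made-up values, for illustration only).
# Only the y column is used by the default search_grid expression.
pd <- data.frame(x = 1:6, y = c(0.10, 0.10, 0.12, 0.15, 0.15, 0.20))

# Default grid: one candidate per possible group count, from 1 up to the
# number of distinct effect values, capped at max_ngrps.
max_ngrps <- 15
search_grid <- seq_len(min(length(unique(pd$y)), max_ngrps))
print(search_grid)

# A smaller cap shortens the grid.
capped_grid <- seq_len(min(length(unique(pd$y)), 3))
print(capped_grid)
```

With four distinct values in y, the uncapped grid stops at 4, while a cap of 3 stops it at 3.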
Integer specifying the optimal number of groups. When several group counts attain the same lowest loss, the smallest of them is returned.
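The tie-breaking rule can be sketched as follows. This is not the package's internal code: the loss values are hypothetical, and the selection step is only one plausible way to pick the smallest group count among those with the lowest loss.

```r
# Candidate group counts and a hypothetical loss for each of them.
search_grid <- 1:5
loss <- c(0.9, 0.4, 0.4, 0.6, 0.7)  # made-up values; 2 and 3 groups tie

# Among all candidates attaining the minimum loss, keep the smallest count.
optimal <- min(search_grid[loss == min(loss)])
print(optimal)
```

Here two and three groups share the lowest loss, so the sketch selects two.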
## Not run:
library(magrittr)  # provides the %>% pipe used below
data('mtpl_be')
features <- setdiff(names(mtpl_be), c('id', 'nclaims', 'expo', 'long', 'lat'))
set.seed(12345)
gbm_fit <- gbm::gbm(as.formula(paste('nclaims ~',
                                     paste(features, collapse = ' + '))),
                    distribution = 'poisson',
                    data = mtpl_be,
                    n.trees = 50,
                    interaction.depth = 3,
                    shrinkage = 0.1)
# Prediction function: the mean predicted response over newdata
gbm_fun <- function(object, newdata) {
  mean(predict(object, newdata, n.trees = object$n.trees, type = 'response'))
}
# Compute the partial dependence of 'ageph', then search for the optimal number of groups
gbm_fit %>% get_pd(var = 'ageph',
                   grid = 'ageph' %>% get_grid(data = mtpl_be),
                   data = mtpl_be,
                   subsample = 10000,
                   fun = gbm_fun) %>%
  optimal_ngroups(lambda = 0.00001)
## End(Not run)