Version 3: cross-validation (CV) with recursive feature elimination (RFE) based on SHAP values.
Hyperparameters are tuned by random search using xgboost.dart.cvtune.
run.k.fold.cv.rfe.wrap(modeldt1, sat, y_var, features0, k_fold = 5,
  stn_var = NULL, day_var = NULL, run_param_cv = T, run_rfe = T,
  predict_whole_model = T)
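To make the signature above concrete, here is a minimal sketch of a call. The data frame and its column names (`stn`, `dayint`, `x1`, `x2`) are hypothetical, and passing `stn_var` as a column name string is an assumption based on how `y_var` is documented. The function may also expect a data.table rather than a data.frame. The call is guarded so the block still runs when the package is not loaded.

```r
# Hypothetical example data; every column name here is illustrative only.
set.seed(123)
n <- 100
modeldt1 <- data.frame(
  stn    = rep(paste0("S", 1:10), each = 10),  # 10 stations
  dayint = rep(1:10, times = 10),              # 10 days per station
  x1     = rnorm(n),
  x2     = rnorm(n),
  stringsAsFactors = FALSE
)
modeldt1$AOD_diff <- 0.3 * modeldt1$x1 + rnorm(n, sd = 0.1)
features0 <- c("x1", "x2")

# Only attempt the call if the function is available in this session.
if (exists("run.k.fold.cv.rfe.wrap")) {
  cv_results <- run.k.fold.cv.rfe.wrap(
    modeldt1, sat = "terra", y_var = "AOD_diff", features0 = features0,
    k_fold = 5,
    stn_var = "stn",   # assumption: given as a column name, like y_var
    day_var = NULL,
    run_param_cv = TRUE, run_rfe = TRUE, predict_whole_model = TRUE
  )
  print(cv_results$rmse_all_folds)    # overall RMSE from CV
  print(cv_results$var_selected_rfe)  # features kept by RFE
}
```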
modeldt1: the dataset
sat: name of the satellite, used for labelling only, for example "terra"
y_var: the name of the response variable, for example y_var = "AOD_diff"
features0: the features to use
k_fold: number of folds for k-fold cross-validation; defaults to 5
stn_var: the variable representing stations, if CV is by station
day_var: the variable representing the day (dayint), if CV is by day. If both stn_var and day_var are provided, CV is run by station. day_var should be a date or an integer.
run_param_cv: whether to run the hyperparameter search; defaults to TRUE. The search is set to FALSE during the RFE process; this can be changed to TRUE in the code to allow tuning during RFE, although it may not be necessary.
run_rfe: whether to run RFE; defaults to TRUE
predict_whole_model: whether to fit the whole model at the end using the features selected by RFE; defaults to TRUE
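The stn_var and day_var arguments describe grouped cross-validation, in which all rows from one station (or one day) fall in the same fold. The sketch below shows one common way to assign such folds in base R. It is an illustration under assumed names (`obs`, `stn_fold`), not the function's actual implementation.

```r
# Hypothetical data: 10 stations with 6 observations each.
set.seed(42)
obs <- data.frame(
  stn    = rep(paste0("S", 1:10), each = 6),
  dayint = rep(1:6, times = 10),
  stringsAsFactors = FALSE
)

k_fold <- 5
stations <- unique(obs$stn)

# Assign each station (not each row) to a fold, keeping folds balanced.
stn_fold <- sample(rep(seq_len(k_fold), length.out = length(stations)))
names(stn_fold) <- stations

# Every row inherits the fold of its station.
obs$fold <- unname(stn_fold[as.character(obs$stn)])

# Rows per fold; no station is split across folds.
print(table(obs$fold))
```

Grouping by station keeps observations from a held-out station entirely out of the training folds, so the CV error reflects prediction at unseen locations.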
A list with the following elements:
sat and by: record of the satellite and of what the CV was grouped by
time_spent: time spent running the CV
bin_list: how the bins were divided
modeldt1_wPred: the data table (dt) with the predicted values from CV
rmse_all_folds: overall RMSE from CV
features_rank_rfe: the features as ranked by RFE
features_rank_rfe_record: the features selected in each round of RFE
shap_score: the dataset of SHAP values from CV
xgb_param_list1: list of each fold's hyperparameters
xgb_param_list2: the result of unlist(xgb_param_list1)
rmse_rfe: RMSE from each round of RFE, used to select features
var_selected_rfe: the features selected by RFE, used to fit the whole model
xgb_param_dart_whole_model
features_rank_full_model: SHAP-based ranking of the features when the whole model is fit without RFE
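The RFE-related elements above (a per-round RMSE, a per-round record of features, and a final selection) can be illustrated with a small self-contained sketch. This is not the package's code: it swaps in `lm` for xgboost and uses scaled absolute coefficients as a stand-in for SHAP importance, and all object names are hypothetical.

```r
# Simulated data: x1 and x2 drive y, x3 and x4 are pure noise.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- 2 * dat$x1 - 1.5 * dat$x2 + rnorm(n, sd = 0.5)

features <- c("x1", "x2", "x3", "x4")
fold_id <- sample(rep(1:5, length.out = n))  # plain 5-fold assignment

# Cross-validated RMSE for a given feature set.
cv_rmse <- function(feats) {
  pred <- numeric(n)
  for (k in 1:5) {
    test <- fold_id == k
    fit <- lm(reformulate(feats, response = "y"), data = dat[!test, ])
    pred[test] <- predict(fit, newdata = dat[test, ])
  }
  sqrt(mean((dat$y - pred)^2))
}

# Stand-in importance: |coefficient| scaled by the feature's sd (not SHAP).
importance <- function(feats) {
  fit <- lm(reformulate(feats, response = "y"), data = dat)
  abs(coef(fit)[feats]) * sapply(dat[feats], sd)
}

rmse_rfe <- numeric(0)  # RMSE of each RFE round
record <- list()        # features used in each round
current <- features
while (length(current) >= 1) {
  record[[length(record) + 1]] <- current
  rmse_rfe <- c(rmse_rfe, cv_rmse(current))
  if (length(current) == 1) break
  imp <- importance(current)
  current <- setdiff(current, names(which.min(imp)))  # drop the weakest
}

# Keep the feature set from the round with the lowest CV RMSE.
var_selected <- record[[which.min(rmse_rfe)]]
print(round(rmse_rfe, 3))
print(var_selected)
```

Choosing the round with the lowest RMSE is one simple selection rule; the rule the function itself applies to rmse_rfe is not stated in this page.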