gcforest() is based on a Python Deep Forest application programming interface (API). Reference: https://github.com/pylablanche/gcForest.
shape_1X
int, tuple, list or np.array (default=None) Shape of a single sample element [n_lines, n_cols]. Required when calling mg_scanning! For sequence data a single int can be given.
n_mgsRFtree
int (default=30) Number of trees in a Random Forest during Multi Grain Scanning.
window
int (default=None) List of window sizes to use during Multi Grain Scanning. If 'None', no slicing will be done.
stride
int (default=1) Step used when slicing the data.
cascade_test_size
float or int (default=0.2) Split fraction or absolute number for cascade training set splitting.
n_cascadeRF
int (default=2) Number of Random Forests in a cascade layer. For each pseudo Random Forest a complete Random Forest is created, hence the total number of Random Forests in a layer will be 2*n_cascadeRF.
n_cascadeRFtree
int (default=101) Number of trees in a single Random Forest in a cascade layer.
cascade_layer
int (default=np.inf) Maximum number of cascade layers allowed. Useful to limit the construction of the cascade.
min_samples_mgs
float or int (default=0.1) Minimum number of samples in a node to perform a split during the training of the Multi Grain Scanning Random Forest. If an int, it is used as the absolute number of samples. If a float, it is the fraction of the initial n_samples to consider.
min_samples_cascade
float or int (default=0.1) Minimum number of samples in a node to perform a split during the training of the Cascade Random Forest. If an int, it is used as the absolute number of samples. If a float, it is the fraction of the initial n_samples to consider.
tolerance
float (default=0.0) Accuracy tolerance for the cascade growth. If the improvement in accuracy is not better than the tolerance, the construction is stopped.
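The interaction between tolerance and cascade_layer can be illustrated with a small base R sketch. This is only one reading of the descriptions above, not the package's implementation: the helper layers_kept() and the accuracy values are hypothetical and exist purely for illustration.

```r
# Hypothetical helper, NOT part of the gcForest package: given a made-up
# sequence of per-layer accuracies, count how many cascade layers would be
# built before growth stops.
layers_kept <- function(layer_accuracy, tolerance = 0.0, cascade_layer = Inf) {
  kept <- 1L
  while (kept < length(layer_accuracy) && kept < cascade_layer) {
    improvement <- layer_accuracy[kept + 1L] - layer_accuracy[kept]
    # Stop as soon as the gain in accuracy does not exceed the tolerance.
    if (improvement <= tolerance) break
    kept <- kept + 1L
  }
  kept
}

acc <- c(0.90, 0.93, 0.94, 0.94, 0.95)  # illustrative values only

n_default   <- layers_kept(acc)                     # tolerance = 0.0
n_tolerant  <- layers_kept(acc, tolerance = 0.02)   # stricter stopping rule
n_limited   <- layers_kept(acc, cascade_layer = 2)  # hard cap on layers
print(c(n_default, n_tolerant, n_limited))
```

A larger tolerance stops the cascade earlier, while cascade_layer caps the depth regardless of how accuracy evolves.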
gcForest exposes several methods in the style of Python's scikit-learn (sklearn):
fit(X, y) Trains the gcForest on input data X and associated target y;
predict(X) Predicts the class of unknown samples X;
predict_proba(X) Predicts the class probabilities of unknown samples X;
mg_scanning(X, y=None) Performs a Multi Grain Scanning on input data;
window_slicing_pred_prob(X, window, shape_1X, y=None) Performs a window slicing of the input data and sends the slices through Random Forests. If target values 'y' are provided, the sliced data are then used to train the Random Forests;
cascade_forest(X, y=None) Performs (or trains, if 'y' is not None) a cascade forest estimator.
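To make the slicing step concrete, here is a minimal base R sketch of how a single sequence sample could be cut into windows for a given window and stride. It covers only the slicing, not the Random Forest step, and it assumes a plain sliding window over sequence data; slice_sequence() is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper, NOT part of the gcForest package: slide a window of
# length `window` over a numeric vector, moving `stride` positions each time.
slice_sequence <- function(x, window, stride = 1L) {
  stopifnot(window >= 1L, window <= length(x), stride >= 1L)
  starts <- seq(1L, length(x) - window + 1L, by = stride)
  # One row per window position, one column per element inside the window.
  do.call(rbind, lapply(starts, function(s) x[s:(s + window - 1L)]))
}

# One sample with 4 features, matching shape_1X = 4L and window = 2L.
x <- c(5.1, 3.5, 1.4, 0.2)
slices <- slice_sequence(x, window = 2L)
slices_stride2 <- slice_sequence(x, window = 2L, stride = 2L)
print(slices)
```

Under these assumptions a sample of length n yields floor((n - window) / stride) + 1 slices, so a larger stride reduces the amount of data passed on to the forests.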
Xu Jing
have_numpy <- reticulate::py_module_available("numpy")
have_sklearn <- reticulate::py_module_available("sklearn")
if (have_numpy && have_sklearn) {
  library(gcForest)
  # Check the Python requirements of the package
  req_py()

  # Import scikit-learn through reticulate
  sk <- reticulate::import("sklearn", delay_load = TRUE)
  train_test_split <- sk$model_selection$train_test_split

  # Load the iris data set: 150 samples, 4 features, 3 classes
  iris <- sk$datasets$load_iris()
  X <- iris$data
  y <- iris$target

  # Hold out one third of the samples for testing
  data_split <- train_test_split(X, y, test_size = 0.33)
  X_tr <- data_split[[1]]
  X_te <- data_split[[2]]
  y_tr <- data_split[[3]]
  y_te <- data_split[[4]]

  # Each sample is a sequence of 4 values, scanned with a window of size 2
  gcforest_m <- gcforest(shape_1X = 4L, window = 2L, tolerance = 0.0)
  gcforest_m$fit(X_tr, y_tr)

  # Predict the classes of the held-out samples
  pred_X <- gcforest_m$predict(X_te)
  print(pred_X)
} else {
  print('NumPy and scikit-learn must be available in Python to run this example!')
}
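After predicting, a natural next step is to compare the predictions with the held-out labels. The sketch below uses stand-in integer vectors so that it runs without Python; in the example above they would be replaced by pred_X and y_te, assuming reticulate has converted both to plain R vectors (wrapping them in as.vector() may be needed otherwise).

```r
# Stand-in values for illustration only; substitute pred_X and y_te
# from the example when a Python environment is available.
pred  <- c(0L, 1L, 2L, 2L, 1L)
truth <- c(0L, 1L, 1L, 2L, 1L)

# Fraction of samples whose predicted class matches the true class
accuracy <- mean(pred == truth)

# Cross-tabulation of predicted versus actual classes
confusion <- table(predicted = pred, actual = truth)

print(accuracy)
print(confusion)
```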