Train a random forest model for classification or regression tasks.
cuda_ml_rand_forest(x, ...)
## Default S3 method:
cuda_ml_rand_forest(x, ...)
## S3 method for class 'data.frame'
cuda_ml_rand_forest(
  x,
  y,
  mtry = NULL,
  trees = NULL,
  min_n = 2L,
  bootstrap = TRUE,
  max_depth = 16L,
  max_leaves = Inf,
  max_predictors_per_note_split = NULL,
  n_bins = 128L,
  min_samples_leaf = 1L,
  split_criterion = NULL,
  min_impurity_decrease = 0,
  max_batch_size = 128L,
  n_streams = 8L,
  cuML_log_level = c("off", "critical", "error", "warn", "info", "debug", "trace"),
  ...
)
## S3 method for class 'matrix'
cuda_ml_rand_forest(
  x,
  y,
  mtry = NULL,
  trees = NULL,
  min_n = 2L,
  bootstrap = TRUE,
  max_depth = 16L,
  max_leaves = Inf,
  max_predictors_per_note_split = NULL,
  n_bins = 128L,
  min_samples_leaf = 1L,
  split_criterion = NULL,
  min_impurity_decrease = 0,
  max_batch_size = 128L,
  n_streams = 8L,
  cuML_log_level = c("off", "critical", "error", "warn", "info", "debug", "trace"),
  ...
)
## S3 method for class 'formula'
cuda_ml_rand_forest(
  formula,
  data,
  mtry = NULL,
  trees = NULL,
  min_n = 2L,
  bootstrap = TRUE,
  max_depth = 16L,
  max_leaves = Inf,
  max_predictors_per_note_split = NULL,
  n_bins = 128L,
  min_samples_leaf = 1L,
  split_criterion = NULL,
  min_impurity_decrease = 0,
  max_batch_size = 128L,
  n_streams = 8L,
  cuML_log_level = c("off", "critical", "error", "warn", "info", "debug", "trace"),
  ...
)
## S3 method for class 'recipe'
cuda_ml_rand_forest(
  x,
  data,
  mtry = NULL,
  trees = NULL,
  min_n = 2L,
  bootstrap = TRUE,
  max_depth = 16L,
  max_leaves = Inf,
  max_predictors_per_note_split = NULL,
  n_bins = 128L,
  min_samples_leaf = 1L,
  split_criterion = NULL,
  min_impurity_decrease = 0,
  max_batch_size = 128L,
  n_streams = 8L,
  cuML_log_level = c("off", "critical", "error", "warn", "info", "debug", "trace"),
  ...
)
| x | Depending on the context: * A __data frame__ of predictors. * A __matrix__ of predictors. * A __recipe__ specifying a set of preprocessing steps created from [recipes::recipe()]. * A __formula__ specifying the predictors and the outcome. | 
| ... | Optional arguments; currently unused. | 
| y | A numeric vector (for regression) or factor (for classification) of desired responses. | 
| mtry | The number of predictors that will be randomly sampled at each split when creating the tree models. Default: the square root of the total number of predictors. | 
| trees | An integer for the number of trees contained in the ensemble. Default: 100L. | 
| min_n | An integer for the minimum number of data points in a node that are required for the node to be split further. Default: 2L. | 
| bootstrap | Whether to use bootstrap sampling. If TRUE, each tree in the forest is built on a sample drawn with replacement from the training data. If FALSE, the whole dataset is used to build each tree. Default: TRUE. | 
| max_depth | Maximum tree depth. Default: 16L. | 
| max_leaves | Maximum leaf nodes per tree. Soft constraint. Default: Inf (unlimited). | 
| max_predictors_per_note_split | Number of predictors to consider per node split. Default: the square root of the total number of predictors. | 
| n_bins | Number of bins used by the split algorithm. Default: 128L. | 
| min_samples_leaf | The minimum number of data points in each leaf node. Default: 1L. | 
| split_criterion | The criterion used to split nodes. Can be "gini" or "entropy" for classification, and "mse" or "mae" for regression. Default: "gini" for classification; "mse" for regression. | 
| min_impurity_decrease | Minimum decrease in impurity required for a node to be split. Default: 0. | 
| max_batch_size | Maximum number of nodes that can be processed in a given batch. Default: 128L. | 
| n_streams | Number of CUDA streams to use for building trees. Default: 8L. | 
| cuML_log_level | Log level within cuML library functions. Must be one of "off", "critical", "error", "warn", "info", "debug", "trace". Default: "off". | 
| formula | A formula specifying the outcome terms on the left-hand side, and the predictor terms on the right-hand side. | 
| data | When a __recipe__ or __formula__ is used, `data` is a __data frame__ containing both the predictors and the outcome. | 
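The `data` argument pairs with either a formula or a recipe. The following is a minimal sketch of the recipe interface, assuming the recipes package is installed alongside cuda.ml. The `step_zv()` step and the choice of `trees = 50L` are illustrative assumptions, not recommendations from the package, and the fit is guarded so the snippet does nothing when either package is missing.

```r
# New data for prediction: every column except the outcome.
new_data <- iris[names(iris) != "Species"]

# Run only when both packages are installed
# (cuda.ml is assumed to also need a CUDA-capable GPU at run time).
if (requireNamespace("cuda.ml", quietly = TRUE) &&
    requireNamespace("recipes", quietly = TRUE)) {
  # The recipe names the outcome and the predictors; step_zv() is an
  # illustrative preprocessing step that drops zero-variance predictors.
  rec <- recipes::recipe(Species ~ ., data = iris)
  rec <- recipes::step_zv(rec, recipes::all_predictors())

  # With a recipe as `x`, `data` is the data frame the recipe is applied to.
  model <- cuda.ml::cuda_ml_rand_forest(rec, data = iris, trees = 50L)
  predictions <- predict(model, new_data)
}
```

The formula interface takes the same `data` argument, with the formula itself taking the place of the recipe.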
A random forest classifier / regressor object that can be used with the 'predict' S3 generic to make predictions on new data points.
library(cuda.ml)
# Classification
model <- cuda_ml_rand_forest(
  formula = Species ~ .,
  data = iris,
  trees = 100
)
predictions <- predict(model, iris[names(iris) != "Species"])
# Regression
model <- cuda_ml_rand_forest(
  formula = mpg ~ .,
  data = mtcars,
  trees = 100
)
predictions <- predict(model, mtcars[names(mtcars) != "mpg"])
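The examples above use the formula interface only. As a hedged sketch of the `x`/`y` interface listed under Usage, the snippet below passes a data frame of predictors and a factor outcome, holds out some rows for prediction, and sets a few of the documented tuning arguments. The hold-out split and the specific argument values are illustrative assumptions, and the fit is guarded so the snippet is a no-op when cuda.ml is not installed.

```r
# Hold out 30 rows of iris for prediction (the split itself is illustrative).
set.seed(42)
test_idx <- sample(nrow(iris), 30)

# Predictors as a data frame, outcome as a factor (classification).
x <- iris[names(iris) != "Species"]
y <- iris$Species
x_train <- x[-test_idx, , drop = FALSE]
y_train <- y[-test_idx]
x_test <- x[test_idx, , drop = FALSE]

# Fit and predict only when cuda.ml is installed
# (it is assumed to also need a CUDA-capable GPU at run time).
if (requireNamespace("cuda.ml", quietly = TRUE)) {
  model <- cuda.ml::cuda_ml_rand_forest(
    x = x_train,
    y = y_train,
    trees = 100L,
    max_depth = 8L,
    split_criterion = "entropy"
  )
  predictions <- predict(model, x_test)
}
```

A numeric `y` in place of the factor would request a regression forest, in which case `split_criterion` would need to be "mse" or "mae".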