s_LightRF | R Documentation |
Random Forest using LightGBM
s_LightRF(
x,
y = NULL,
x.test = NULL,
y.test = NULL,
x.name = NULL,
y.name = NULL,
weights = NULL,
ifw = TRUE,
ifw.type = 2,
upsample = FALSE,
downsample = FALSE,
resample.seed = NULL,
objective = NULL,
nrounds = 500L,
early_stopping_rounds = -1L,
num_leaves = 4096L,
max_depth = -1L,
learning_rate = 1,
feature_fraction = 1,
subsample = 0.623,
subsample_freq = 1L,
lambda_l1 = 0,
lambda_l2 = 0,
max_cat_threshold = 32L,
min_data_per_group = 32L,
linear_tree = FALSE,
tree_learner = "data_parallel",
grid.resample.params = setup.resample("kfold", 5),
gridsearch.type = "exhaustive",
metric = NULL,
maximize = NULL,
importance = TRUE,
print.plot = FALSE,
plot.fitted = NULL,
plot.predicted = NULL,
plot.theme = rtTheme,
question = NULL,
verbose = TRUE,
grid.verbose = FALSE,
lightgbm_verbose = -1,
save.gridrun = FALSE,
n.cores = 1,
n_threads = rtCores,
force_col_wise = FALSE,
force_row_wise = FALSE,
outdir = NULL,
save.mod = ifelse(!is.null(outdir), TRUE, FALSE),
.gs = FALSE,
...
)
x |
Numeric vector or matrix / data frame of features, i.e. independent variables |
y |
Numeric vector of outcome, i.e. dependent variable |
x.test |
Numeric vector or matrix / data frame of testing set features.
Columns must correspond to columns in x |
y.test |
Numeric vector of testing set outcome |
x.name |
Character: Name for feature set |
y.name |
Character: Name for outcome |
weights |
Numeric vector: Weights for cases. For classification, |
ifw |
Logical: If TRUE, apply inverse frequency weighting
(for Classification only).
Note: If |
ifw.type |
Integer: 0, 1, or 2.
1: class.weights as in 0, divided by min(class.weights).
2: class.weights as in 0, divided by max(class.weights) |
upsample |
Logical: If TRUE, upsample cases to balance outcome classes (for Classification only).
Note: upsample will randomly sample with replacement if the length of the majority class is more than double the length of the class you are upsampling, thereby introducing randomness |
downsample |
Logical: If TRUE, downsample majority class to match size of minority class |
resample.seed |
Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed) |
objective |
(Default = NULL) |
nrounds |
Integer: Number of trees to grow |
early_stopping_rounds |
Integer: Training on resamples of |
num_leaves |
Integer: [gS] Maximum tree leaves for base learners. |
max_depth |
Integer: [gS] Maximum tree depth for base learners, <=0 means no limit. |
learning_rate |
Numeric: [gS] Boosting learning rate |
feature_fraction |
Numeric (0, 1): [gS] Fraction of features to consider at each iteration (i.e. tree) |
subsample |
Numeric: [gS] Subsample ratio of the training set. |
subsample_freq |
Integer: Subsample every this many iterations |
lambda_l1 |
Numeric: [gS] L1 regularization term |
lambda_l2 |
Numeric: [gS] L2 regularization term |
max_cat_threshold |
Integer: Max number of splits to consider for categorical variable |
min_data_per_group |
Integer: Minimum number of observations per categorical group |
linear_tree |
Logical: [gS] If |
tree_learner |
Character: [gS] One of "serial", "feature", "data", or "voting". The default, "data_parallel", is a LightGBM alias for "data" |
grid.resample.params |
List: Output of setup.resample defining grid search parameters. |
gridsearch.type |
Character: Type of grid search to perform: "exhaustive" or "randomized". |
metric |
Character: Metric to minimize, or maximize if
|
maximize |
Logical: If TRUE, |
importance |
Logical: If |
print.plot |
Logical: If TRUE, produce plot using |
plot.fitted |
Logical: If TRUE, plot True (y) vs Fitted |
plot.predicted |
Logical: If TRUE, plot True (y.test) vs Predicted.
Requires |
plot.theme |
Character: "zero", "dark", "box", "darkbox" |
question |
Character: The question you are attempting to answer with this model, in plain language. |
verbose |
Logical: If TRUE, print summary to screen. |
grid.verbose |
Logical: Passed to |
lightgbm_verbose |
Integer: Passed to |
save.gridrun |
Logical: If |
n.cores |
Integer: Number of cores to use. |
n_threads |
Integer: Number of threads for lightgbm using OpenMP. Only
parallelize resamples using |
force_col_wise |
Logical: If |
force_row_wise |
Logical: If |
outdir |
Path to output directory.
If defined, will save Predicted vs. True plot, if available,
as well as full model output, if |
save.mod |
Logical: If TRUE, save all output to an RDS file in |
.gs |
(Internal use only) |
... |
Extra arguments appended to |
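The three ifw.type scalings described above can be illustrated with a small base R sketch. This is not the rtemis implementation: `ifw_sketch` is a hypothetical helper, and the definition of type 0 (inverse of each class's relative frequency) is an assumption, since this page does not spell it out. Types 1 and 2 follow the descriptions above, dividing by the minimum and maximum class weight respectively.

```r
# Hypothetical helper, not part of rtemis: illustrates the ifw.type scalings.
# ASSUMPTION: type 0 weights are the inverse of each class's relative frequency.
ifw_sketch <- function(y, type = 2) {
  y <- as.factor(y)
  freq <- table(y) / length(y)
  class.weights <- setNames(as.numeric(1 / freq), names(freq)) # assumed type 0
  if (type == 1) class.weights <- class.weights / min(class.weights)
  if (type == 2) class.weights <- class.weights / max(class.weights)
  class.weights
}

# Imbalanced two-class outcome: 80 cases of "a", 20 cases of "b"
y <- factor(c(rep("a", 80), rep("b", 20)))
w0 <- ifw_sketch(y, type = 0)
w1 <- ifw_sketch(y, type = 1)
w2 <- ifw_sketch(y, type = 2)

# Expand class weights to one weight per case
case.weights <- unname(w2[as.character(y)])
```

Under this assumption, type 1 gives the most frequent class a weight of 1 and scales rarer classes up, while type 2 gives the rarest class a weight of 1 and scales more frequent classes down.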
ED Gennatas
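## Not run:
library(rtemis)
# Simulate 500 cases with 10 features
x <- rnormmat(500, 10)
# Outcome depends on features 3 and 5, plus Gaussian noise
y <- x[, 3] + .5 * x[, 5]^2 + rnorm(500)
# Pass features and outcome together in one data frame (y is left NULL)
dat <- data.frame(x, y)
mod <- s_LightRF(dat)
## End(Not run)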
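A further sketch, assuming the same simulated data, shows how the x.test and y.test arguments from the usage above might be supplied with a held-out set. The 80/20 split, the seed, and nrounds = 200L are illustrative choices rather than recommendations, and base R's matrix(rnorm()) stands in for rnormmat so the data preparation runs without rtemis. The model call is guarded and only runs if both rtemis and lightgbm are installed.

```r
set.seed(2024)
n <- 500
# Base R stand-in for rnormmat(500, 10)
x <- matrix(rnorm(n * 10), nrow = n)
y <- x[, 3] + .5 * x[, 5]^2 + rnorm(n)

# Hold out 20% of cases for testing
test.idx <- sample(n, size = n * 0.2)
x.train <- x[-test.idx, ]
y.train <- y[-test.idx]
x.test <- x[test.idx, ]
y.test <- y[test.idx]

# Fit only if both packages are available
if (requireNamespace("rtemis", quietly = TRUE) &&
  requireNamespace("lightgbm", quietly = TRUE)) {
  mod <- rtemis::s_LightRF(
    x = x.train, y = y.train,
    x.test = x.test, y.test = y.test,
    nrounds = 200L
  )
}
```

Supplying both x.test and y.test is what makes a True vs Predicted comparison possible, as noted under plot.predicted.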