nano_scoring | R Documentation |
Predicts on data using the fitted models and compares the mean prediction with the mean response within the supplied percentiles.
nano_scoring(
nano,
data = NA,
model_no = NA,
percentiles,
train_test = "data_id",
save = TRUE
)
nano |
a nano object containing the fitted models. |
data |
a list of datasets. If the underlying dataset is the same for each model, a list with a single element can be supplied. |
model_no |
the positions, in the nano object's list of models, of the models to be scored. If not entered, the last model is taken by default. |
train_test |
a character. Variable in |
save |
a logical specifying whether to save the output to the nano object (if |
The function checks whether the data contains the train_test column. If it does, scoring is done separately for each split specified in that column. Otherwise, scoring is done on the total data.
To score on a subset of the data (e.g. to see how the model performs on a specific part of the data), supply the data, subsetted in the desired manner, via the data argument. If the data argument is not used, the function defaults to the data used to train the model.
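The behaviour described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helpers score_by_percentile() and score_data(), the pred column, and the toy data are all hypothetical, and the banding rule (quantiles of the predictions) is an assumption made for illustration only.

```r
# Hypothetical helper (not part of nano): mean prediction vs mean response
# within percentile bands of the predictions.
score_by_percentile <- function(pred, actual, percentiles) {
  # Band edges are quantiles of the predictions; unique() guards against ties
  breaks <- unique(quantile(pred, probs = percentiles, names = FALSE))
  band <- cut(pred, breaks = breaks, include.lowest = TRUE)
  data.frame(
    band        = levels(band),
    n           = as.vector(table(band)),
    mean_pred   = as.vector(tapply(pred, band, mean)),
    mean_actual = as.vector(tapply(actual, band, mean))
  )
}

# Hypothetical helper: score per split if the train_test column exists,
# otherwise score the data as a whole.
score_data <- function(data, pred_col, response, percentiles,
                       train_test = "data_id") {
  if (train_test %in% names(data)) {
    lapply(split(data, data[[train_test]]), function(d) {
      score_by_percentile(d[[pred_col]], d[[response]], percentiles)
    })
  } else {
    list(all = score_by_percentile(data[[pred_col]], data[[response]],
                                   percentiles))
  }
}

# Toy data standing in for a scored dataset
set.seed(1)
toy <- data.frame(
  data_id = rep(c("train", "test"), times = c(80, 20)),
  actual  = rnorm(100, mean = 50, sd = 10)
)
toy$pred <- toy$actual + rnorm(100, sd = 5)  # stand-in for model predictions

# With the split column present: one table per split
by_split <- score_data(toy, "pred", "actual", percentiles = seq(0, 1, 0.25))

# Without the split column: a single table for the total data
overall <- score_data(toy[, c("actual", "pred")], "pred", "actual",
                      percentiles = seq(0, 1, 0.25))
```

In each resulting table, bands where mean_pred and mean_actual diverge indicate ranges of the prediction where the model over- or under-predicts.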
If save = TRUE, returns the nano object with the specified models scored. If save = FALSE, returns a list with the specified models scored.
## Not run:
if(interactive()){
library(h2o)
library(nano)
h2o.init()
# import dataset
data(property_prices)
train <- as.h2o(property_prices)
# set the response and predictors
response <- "sale_price"
var <- setdiff(colnames(property_prices), response)
# build grids
grid_1 <- h2o.grid(x = var,
y = response,
training_frame = train,
algorithm = "randomForest",
hyper_params = list(ntrees = 1:2),
nfolds = 3,
seed = 628)
grid_2 <- h2o.grid(x = var,
y = response,
training_frame = train,
algorithm = "randomForest",
hyper_params = list(ntrees = 3:4),
nfolds = 3,
seed = 628)
# since the underlying dataset is the same, a single-element list is enough;
# since model is not entered, will take best model from grids
obj <- create_nano(grid = list(grid_1, grid_2),
data = list(property_prices))
# score on both models
obj <- nano_scoring(nano = obj, model_no = 1:2, percentiles = seq(0, 1, 0.02), save = TRUE)
}
## End(Not run)