nano_residuals | R Documentation |
Predicts on data from a fitted model and plots the actual vs predicted values.
nano_residuals(
nano,
data = NA,
model_no = NA,
train_test = "data_id",
group = NA,
size = NA,
save = TRUE
)
nano |
a nano object containing the fitted models. |
data |
a list of datasets. If the underlying dataset is the same for each model, a list with a single element can be supplied. |
model_no |
the positions of each model in the list of models in the nano object for which the actual vs predicted should be calculated. If not entered, the last model is taken by default. |
train_test |
a character. Variable in |
group |
a character variable in |
size |
a character variable in |
save |
a logical specifying whether to save the output to the nano object (if |
The function checks whether the data contains the train_test column.
If it does, the actual vs predicted is calculated separately for each split
specified in that column. Otherwise, the actual vs predicted is calculated
on the full data.
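The split logic described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper `actual_vs_predicted`, the column `fit`, and the toy values are all hypothetical; only the default column name `"data_id"` and the response name `sale_price` come from this page.

```r
# Hypothetical helper, not part of the nano package: pairs actual and
# predicted values, grouped by a train/test column when that column exists.
actual_vs_predicted <- function(data, actual, predicted, train_test = "data_id") {
  stopifnot(is.data.frame(data),
            actual %in% names(data),
            predicted %in% names(data))
  pairs <- data.frame(actual = data[[actual]], predicted = data[[predicted]])
  if (train_test %in% names(data)) {
    # one data frame of pairs per split value (e.g. "train", "test")
    split(pairs, data[[train_test]])
  } else {
    # no split column: use all rows together
    list(all = pairs)
  }
}

# toy data with made-up values
toy <- data.frame(
  sale_price = c(100, 150, 200, 250),
  fit        = c(110, 140, 210, 240),
  data_id    = c("train", "train", "test", "test")
)

# split column present: one element per split
by_split <- actual_vs_predicted(toy, "sale_price", "fit")

# split column absent: a single element covering all rows
no_split <- actual_vs_predicted(toy[, c("sale_price", "fit")], "sale_price", "fit")
```

Each element of the returned list could then be passed to a plotting call, one panel per split.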
To plot only a subset of the data (e.g. to see how the model performs on
a specific part of the data), use the data argument to supply the data
subsetted in the desired manner. If the data argument is not used, the
function defaults to the data used to train the model.
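Supplying a subset might look like the following. This is a sketch under assumptions: it presumes that `obj` and `property_prices` from the Examples section already exist, and that a subset can be passed in the same form as the original dataset. The median cut-off on `sale_price` is only an illustrative choice.

```r
if (interactive()) {
  # assumes `obj` and `property_prices` from the Examples section exist
  # illustrative subset: keep only the higher-priced properties
  high_value <- property_prices[
    property_prices$sale_price > median(property_prices$sale_price), ]

  # the same subset applies to every model, so a single-element list is used;
  # save = FALSE returns the results as a list instead of updating `obj`
  res <- nano_residuals(nano = obj, data = list(high_value),
                        model_no = 1:2, save = FALSE)
}
```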
If save = TRUE
then returns the nano object with the actual vs predicted for the specified models.
If save = FALSE
then returns a list with the actual vs predicted for the specified models.
## Not run:
if (interactive()) {
  library(h2o)
  library(nano)
  h2o.init()

  # import dataset
  data(property_prices)
  train <- as.h2o(property_prices)

  # set the response and predictors
  response <- "sale_price"
  var <- setdiff(colnames(property_prices), response)

  # build grids
  grid_1 <- h2o.grid(x = var,
                     y = response,
                     training_frame = train,
                     algorithm = "randomForest",
                     hyper_params = list(ntrees = 1:2),
                     nfolds = 3,
                     seed = 628)
  grid_2 <- h2o.grid(x = var,
                     y = response,
                     training_frame = train,
                     algorithm = "randomForest",
                     hyper_params = list(ntrees = 3:4),
                     nfolds = 3,
                     seed = 628)

  # data is a single-element list since the underlying dataset is the same;
  # since model is not entered, will take best model from grids
  obj <- create_nano(grid = list(grid_1, grid_2),
                     data = list(property_prices))

  # calculate actual vs predicted for both models
  obj <- nano_residuals(nano = obj, model_no = 1:2, save = TRUE)
}
## End(Not run)