predict.model_fit | R Documentation |
Apply a model to create different types of predictions. predict() can be used for all types of models and uses the "type" argument for more specificity.
## S3 method for class 'model_fit'
predict(object, new_data, type = NULL, opts = list(), ...)
## S3 method for class 'model_fit'
predict_raw(object, new_data, opts = list(), ...)
predict_raw(object, ...)
object |
An object of class |
new_data |
A rectangular data object, such as a data frame. |
type |
A single character value or |
opts |
A list of optional arguments to the underlying
predict function that will be used when |
... |
Additional
|
For type = NULL, predict() uses type = "numeric" for regression models, type = "class" for classification, and type = "time" for censored regression.
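As a minimal sketch of the default for a regression model, the following compares a call without type to an explicit type = "numeric". The model formula and the choice of mtcars rows are illustrative assumptions, not part of this help page.

```r
library(parsnip)

# Illustrative regression fit; the formula is an arbitrary choice
lm_fit <-
  linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt + hp, data = mtcars)

new_rows <- mtcars[1:5, ]

# type is left as NULL here, so the regression default applies
default_pred <- predict(lm_fit, new_data = new_rows)

# The same request with the type spelled out
numeric_pred <- predict(lm_fit, new_data = new_rows, type = "numeric")

names(default_pred)
all.equal(default_pred, numeric_pred)
```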
When using type = "conf_int" and type = "pred_int", the options level and std_error can be used. The latter is a logical for an extra column of standard error values (if available).
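The two options can be passed directly in the predict() call. The sketch below assumes an "lm" engine fit on mtcars, chosen only for illustration; whether a standard error column appears depends on the engine.

```r
library(parsnip)

# Illustrative regression fit; the formula is an arbitrary choice
lm_fit <-
  linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt + hp, data = mtcars)

# 90% confidence intervals, requesting a standard error column if available
ci <- predict(
  lm_fit,
  new_data = mtcars[1:5, ],
  type = "conf_int",
  level = 0.90,
  std_error = TRUE
)

names(ci)
```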
For censored regression, a numeric vector for eval_time is required when survival or hazard probabilities are requested. The time values are required to be unique, finite, non-missing, and non-negative. The predict() functions will adjust the values to fit this specification by removing offending points (with a warning).
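A hedged sketch of supplying eval_time follows. It assumes the censored and survival packages are installed; the survival_reg() model, the lung data, and the evaluation times are illustrative choices rather than anything this page prescribes.

```r
library(parsnip)
library(censored)  # assumed installed; supplies engines for censored regression
library(survival)  # provides Surv() and the lung data

# Illustrative parametric survival model
sr_fit <-
  survival_reg() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex, data = lung)

# eval_time must be supplied when survival probabilities are requested
surv_pred <- predict(
  sr_fit,
  new_data = lung[1:3, ],
  type = "survival",
  eval_time = c(100, 500, 1000)
)

# One row per row of new_data; .pred is a list-column of tibbles
surv_pred$.pred[[1]]
```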
predict.model_fit() does not require the outcome to be present. For performance metrics on the predicted survival probability, inverse probability of censoring weights (IPCW) are required (see the tidymodels.org reference below). Those require the outcome and are thus not returned by predict(). They can be added via augment.model_fit() if new_data contains a column with the outcome as a Surv object.
Also, when type = "linear_pred", censored regression models will by default be formatted such that the linear predictor increases with time. This may have the opposite sign from what the underlying model's predict() method produces. Set increasing = FALSE to suppress this behavior.
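The following sketch requests the linear predictor both ways so the two results can be compared. It assumes the censored and survival packages are installed; the Cox model via proportional_hazards() and the lung data are illustrative choices.

```r
library(parsnip)
library(censored)  # assumed installed; supplies engines for censored regression
library(survival)  # provides Surv() and the lung data

# Illustrative Cox proportional hazards model
cox_fit <-
  proportional_hazards() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + sex, data = lung)

new_rows <- lung[1:3, ]

# Default: formatted so the linear predictor increases with time
lp_default <- predict(cox_fit, new_data = new_rows, type = "linear_pred")

# increasing = FALSE suppresses that formatting
lp_raw <- predict(
  cox_fit,
  new_data = new_rows,
  type = "linear_pred",
  increasing = FALSE
)

# Compare the two sets of values side by side
cbind(lp_default, lp_raw)
```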
With the exception of type = "raw", the result of predict.model_fit() is a tibble that has as many rows as there are rows in new_data and has standardized column names, as described below:
For type = "numeric", the tibble has a .pred column for a single outcome and .pred_Yname columns for a multivariate outcome.
For type = "class", the tibble has a .pred_class column.
For type = "prob", the tibble has .pred_classlevel columns.
For type = "conf_int" and type = "pred_int", the tibble has .pred_lower and .pred_upper columns with an attribute for the confidence level. In the case where intervals can be produced for class probabilities (or other non-scalar outputs), the columns are named .pred_lower_classlevel and so on.
For type = "quantile", the tibble has a .pred column, which is a list-column. Each list element contains a tibble with columns .pred and .quantile (and perhaps other columns).
For type = "time", the tibble has a .pred_time column.
For type = "survival", the tibble has a .pred column, which is a list-column. Each list element contains a tibble with columns .eval_time and .pred_survival (and perhaps other columns).
For type = "hazard", the tibble has a .pred column, which is a list-column. Each list element contains a tibble with columns .eval_time and .pred_hazard (and perhaps other columns).
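The naming convention for classification can be checked with a small sketch. The recoded copy of mtcars (cars2) and the class labels "automatic" and "manual" are hypothetical, introduced only to illustrate how class levels appear in the column names.

```r
library(parsnip)

# Hypothetical two-class outcome built from mtcars$am
cars2 <- mtcars
cars2$am <- factor(cars2$am, levels = c(0, 1), labels = c("automatic", "manual"))

glm_fit <-
  logistic_reg() %>%
  set_engine("glm") %>%
  fit(am ~ wt, data = cars2)

class_pred <- predict(glm_fit, new_data = cars2[1:5, ], type = "class")
prob_pred <- predict(glm_fit, new_data = cars2[1:5, ], type = "prob")

names(class_pred)  # a single .pred_class column
names(prob_pred)   # one .pred_<classlevel> column per class level
```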
Using type = "raw" with predict.model_fit() will return the unadulterated results of the prediction function.
In the case of Spark-based models, since table columns cannot contain dots, the same convention is used except 1) no dots appear in names and 2) vectors are never returned but type-specific prediction functions.
When the model fit failed and the error was captured, the predict() function will return the same structure as above but filled with missing values. This does not currently work for multivariate models.
https://www.tidymodels.org/learn/statistics/survival-metrics/
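library(parsnip)
library(dplyr)

# Fit a linear regression on rows 11-32 of mtcars
lm_model <-
  linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ ., data = mtcars %>% dplyr::slice(11:32))

# Use the first 10 rows, without the outcome, as new data
pred_cars <-
  mtcars %>%
  dplyr::slice(1:10) %>%
  dplyr::select(-mpg)

# Default type for a regression model: numeric predictions
predict(lm_model, pred_cars)

# 90% confidence intervals
predict(
  lm_model,
  pred_cars,
  type = "conf_int",
  level = 0.90
)

# Raw output of the underlying predict function, with type = "terms" passed via opts
predict(
  lm_model,
  pred_cars,
  type = "raw",
  opts = list(type = "terms")
)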