View source: R/get_predicted.R
get_predicted | R Documentation |
The get_predicted() function is a robust, flexible and user-friendly
alternative to the base R predict() function. Additional features and
advantages include availability of uncertainty intervals (CI), bootstrapping,
a more intuitive API and support for more models than base R's predict()
function. However, although the interface is simplified, it is still very
important to read the documentation of the arguments. This is because making
"predictions" (a loose term for a variety of things) is a non-trivial process,
with lots of caveats and complications. Read the 'Details' section for more
information.
get_predicted_ci() returns the confidence (or prediction) interval (CI)
associated with predictions made by a model. This function can be called
separately on a vector of predicted values. get_predicted() usually
returns confidence intervals by default (included as an attribute, and
accessible via the as.data.frame() method). Prefer get_predicted() for
standard errors and confidence intervals; use get_predicted_ci() only if
standard errors and confidence intervals are not available otherwise.
get_predicted(x, ...)
## Default S3 method:
get_predicted(
x,
data = NULL,
predict = "expectation",
ci = NULL,
ci_type = "confidence",
ci_method = NULL,
dispersion_method = "sd",
vcov = NULL,
vcov_args = NULL,
verbose = TRUE,
...
)
## S3 method for class 'lm'
get_predicted(
x,
data = NULL,
predict = "expectation",
ci = NULL,
iterations = NULL,
verbose = TRUE,
...
)
## S3 method for class 'stanreg'
get_predicted(
x,
data = NULL,
predict = "expectation",
iterations = NULL,
ci = NULL,
ci_method = NULL,
include_random = "default",
include_smooth = TRUE,
verbose = TRUE,
...
)
## S3 method for class 'gam'
get_predicted(
x,
data = NULL,
predict = "expectation",
ci = NULL,
include_random = TRUE,
include_smooth = TRUE,
iterations = NULL,
verbose = TRUE,
...
)
## S3 method for class 'lmerMod'
get_predicted(
x,
data = NULL,
predict = "expectation",
ci = NULL,
ci_method = NULL,
include_random = "default",
iterations = NULL,
verbose = TRUE,
...
)
## S3 method for class 'principal'
get_predicted(x, data = NULL, ...)
x |
A statistical model (can also be a data.frame, in which case the second argument has to be a model). |
... |
Other arguments to be passed, for instance to the model's |
data |
An optional data frame in which to look for variables with which
to predict. If omitted, the data used to fit the model is used. Visualization
matrices can be generated using |
predict |
string or
|
ci |
The interval level. Default is |
ci_type |
Can be |
ci_method |
The method for computing p values and confidence intervals. Possible values depend on model type.
See |
dispersion_method |
Bootstrap dispersion and Bayesian posterior summary:
|
vcov |
Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.
One exception are models of class |
vcov_args |
List of arguments to be passed to the function identified by
the |
verbose |
Toggle warnings. |
iterations |
For Bayesian models, this corresponds to the number of
posterior draws. If |
include_random |
If |
include_smooth |
For Generalized Additive Models (GAMs). If |
In insight::get_predicted(), the predict argument jointly modulates two
separate concepts, the scale and the uncertainty interval.
The fitted values (i.e. predictions for the response). For Bayesian
or bootstrapped models (when iterations != NULL), the individual iterations
(stored as columns, with observations as rows) can be accessed via
as.data.frame().
Linear models - lm(): For linear models, prediction intervals
(predict = "prediction") show the range that likely contains the value of a
new observation, whereas confidence intervals (predict = "expectation" or
predict = "link") reflect the uncertainty around the estimated parameters
(and give the range of uncertainty of the regression line). In general,
prediction intervals (PIs) account for both the uncertainty in the model's
parameters and the random variation of individual values. Thus, prediction
intervals are always wider than confidence intervals. Moreover, prediction
intervals will not necessarily become narrower as the sample size increases
(as they reflect not only the quality of the fit, but also the variability
within the data).
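The difference between the two interval types can be checked with a small sketch. It uses base R's predict() rather than get_predicted() itself, so it only illustrates the concept; the data frame new_obs and its values are made up for the example.

```r
# Base R sketch (not insight's implementation): confidence vs. prediction
# intervals for a linear model.
fit <- lm(mpg ~ cyl + hp, data = mtcars)

# Hypothetical new observation, chosen only for illustration
new_obs <- data.frame(cyl = 6, hp = 120)

conf_int <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)
pred_int <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)

# Both intervals share the same point estimate ...
conf_int[, "fit"]
pred_int[, "fit"]

# ... but the prediction interval also includes the residual variance,
# so it is wider than the confidence interval.
ci_width <- unname(conf_int[, "upr"] - conf_int[, "lwr"])
pi_width <- unname(pred_int[, "upr"] - pred_int[, "lwr"])
c(ci_width = ci_width, pi_width = pi_width)
```

The confidence interval shrinks towards zero width as more data are added, while the prediction interval stays bounded below by the spread of the residuals.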
Generalized Linear models - glm(): For binomial models, prediction
intervals are of little use. For instance, for a binomial (Bernoulli)
model for which the dependent variable is a vector of 1s and 0s, the
prediction interval is simply [0, 1].
When users set the predict argument to "expectation", the predictions are
returned on the response scale, which is arguably the most convenient way to
understand and visualize relationships of interest. When users set the
predict argument to "link", predictions are returned on the link scale,
and no transformation is applied. For instance, for a logistic regression
model, the response scale corresponds to the predicted probabilities, whereas
the link scale corresponds to the predicted log-odds (probabilities on the
logit scale). Note that when users select predict = "classification" in
binomial models, the get_predicted() function will first calculate
predictions as if the user had selected predict = "expectation". Then, it
will round the responses in order to return the most likely outcome. For
ordinal or mixture models, it returns the predicted class membership, based
on the highest probability of classification.
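The relationship between the link scale, the response scale and the rounded classification can be sketched with base R alone. This assumes a plain logistic regression and uses stats::predict(), not get_predicted(); it mirrors the description above rather than insight's internal code.

```r
# Base R sketch: link scale vs. response scale for a logistic regression.
dat <- droplevels(iris[1:100, ])
fit <- glm(Species ~ Sepal.Length, data = dat, family = "binomial")

link_scale <- predict(fit, type = "link")      # log-odds
resp_scale <- predict(fit, type = "response")  # probabilities

# Applying the inverse logit to the link-scale predictions recovers the
# response-scale predictions.
all.equal(plogis(link_scale), resp_scale)

# Rounding the probabilities gives the most likely outcome
# (0 = first factor level, 1 = second factor level).
predicted_class <- round(resp_scale)
table(predicted = predicted_class, observed = dat$Species)
```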
The arguments vcov and vcov_args can be used to calculate robust standard
errors for confidence intervals of predictions. These arguments, when
provided in get_predicted(), are passed down to get_predicted_ci(); see the
related documentation there for more details.
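To make the role of a covariance matrix concrete, here is a minimal base R sketch of how standard errors of predictions follow from a given matrix V, via sqrt(x' V x) for each row x of the model matrix. The HC0 "sandwich" estimator is built by hand purely for illustration, and the helper se_from_vcov() is hypothetical; this is not how get_predicted_ci() is necessarily implemented.

```r
# Base R sketch: prediction standard errors from a covariance matrix.
fit <- lm(mpg ~ cyl + hp, data = mtcars)
X <- model.matrix(fit)
e <- residuals(fit)

# Classical covariance matrix of the coefficients
V_classic <- vcov(fit)

# Hand-built HC0 heteroskedasticity-robust ("sandwich") covariance matrix:
# (X'X)^-1  X' diag(e^2) X  (X'X)^-1
bread <- solve(crossprod(X))
meat <- crossprod(X * e)  # each row of X scaled by its residual
V_robust <- bread %*% meat %*% bread

# Hypothetical helper: SE of each fitted value is sqrt(x_i' V x_i)
se_from_vcov <- function(X, V) sqrt(rowSums((X %*% V) * X))

se_classic <- se_from_vcov(X, V_classic)
se_robust <- se_from_vcov(X, V_robust)

# Wald-type 95% confidence intervals based on the robust standard errors
pred <- fitted(fit)
crit <- qt(0.975, df = df.residual(fit))
ci_robust <- data.frame(
  Predicted = pred,
  CI_low = pred - crit * se_robust,
  CI_high = pred + crit * se_robust
)
head(ci_robust)
```

With the classical matrix, the helper reproduces the standard errors from predict(fit, se.fit = TRUE); swapping in a different matrix changes only the interval widths, not the point predictions.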
For finite mixture models (currently, only the mixture() family from package
brms is supported), use predict = "classification" to predict the class
membership. To predict outcome values by class, use predict = "link". Other
predict options will return predicted values of the outcome for the full
data, not stratified by class membership.
For predictions based on multiple iterations, for instance in the case of
Bayesian models and bootstrapped predictions, the function used to compute
the centrality (point-estimate predictions) can be modified via the
centrality_function argument. For instance,
get_predicted(model, centrality_function = stats::median). The default is
mean. Individual draws can be accessed by running
iter <- as.data.frame(get_predicted(model)), and their iterations can be
reshaped into a long format by bayestestR::reshape_iterations(iter).
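What a centrality function does can be sketched in base R: given a matrix of iterations (one row per observation, one column per iteration), the point estimate is a row-wise summary. The bootstrap below is a plain nonparametric resampling written for illustration; it is an assumption, not a description of how insight generates its iterations.

```r
# Base R sketch: summarising iterations with different centrality functions.
set.seed(123)
n_iter <- 200

# One column per bootstrap iteration, one row per observation
draws <- replicate(n_iter, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  refit <- lm(mpg ~ cyl + hp, data = mtcars[idx, ])
  predict(refit, newdata = mtcars)
})
dim(draws)

# Row-wise point estimates under two centrality functions
pred_mean <- apply(draws, 1, mean)
pred_median <- apply(draws, 1, stats::median)
head(cbind(mean = pred_mean, median = pred_median))
```

The mean and the median usually agree closely when the distribution of iterations is roughly symmetric; the median is less sensitive to a few extreme iterations.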
There is limited support for hypothesis tests, i.e. objects of class htest:
chisq.test(): returns the expected values of the contingency table.
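For reference, the expected values of a contingency table are the counts implied by independence of rows and columns. The base R sketch below shows where they live in an htest object and how they are derived; it does not call get_predicted(), and the choice of the cyl and am columns is arbitrary.

```r
# Base R sketch: expected values of a contingency table under independence.
tab <- table(mtcars$cyl, mtcars$am)

# Small cell counts trigger an approximation warning, suppressed here
test <- suppressWarnings(chisq.test(tab))
test$expected

# Same values by hand: (row total * column total) / grand total
expected_by_hand <- outer(rowSums(tab), colSums(tab)) / sum(tab)
all.equal(unname(test$expected), unname(expected_by_hand))
```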
get_datagrid()
data(mtcars)
x <- lm(mpg ~ cyl + hp, data = mtcars)
predictions <- get_predicted(x, ci = 0.95)
predictions
# Options and methods ---------------------
get_predicted(x, predict = "prediction")
# Get CI
as.data.frame(predictions)
# Bootstrapped
as.data.frame(get_predicted(x, iterations = 4))
# Same as as.data.frame(..., keep_iterations = FALSE)
summary(get_predicted(x, iterations = 4))
# Different prediction types ------------------------
data(iris)
data <- droplevels(iris[1:100, ])
# Fit a logistic model
x <- glm(Species ~ Sepal.Length, data = data, family = "binomial")
# Expectation (default): response scale + CI
pred <- get_predicted(x, predict = "expectation", ci = 0.95)
head(as.data.frame(pred))
# Prediction: response scale + PI
pred <- get_predicted(x, predict = "prediction", ci = 0.95)
head(as.data.frame(pred))
# Link: link scale + CI
pred <- get_predicted(x, predict = "link", ci = 0.95)
head(as.data.frame(pred))
# Classification: classification "type" + PI
pred <- get_predicted(x, predict = "classification", ci = 0.95)
head(as.data.frame(pred))