datagrid | R Documentation |
Generate a data grid of user-specified values for use in the newdata argument of the predictions(), comparisons(), and slopes() functions. This is useful to define where in the predictor space we want to evaluate the quantities of interest. For example: the predicted outcome or slope for a 37-year-old college graduate.
datagrid(
...,
model = NULL,
newdata = NULL,
by = NULL,
grid_type = "mean_or_mode",
response = FALSE,
FUN_character = NULL,
FUN_factor = NULL,
FUN_logical = NULL,
FUN_numeric = NULL,
FUN_integer = NULL,
FUN_binary = NULL,
FUN_other = NULL
)
... |
named arguments with vectors of values or functions for user-specified variables.
|
model |
Model object |
newdata |
data.frame (one and only one of the |
by |
character vector with grouping variables within which |
grid_type |
character. Determines the functions to apply to each variable. The defaults can be overridden by defining individual variables explicitly in
|
response |
Logical. Should the response variable be included in the grid, even if it is not specified explicitly? |
FUN_character |
the function to be applied to character variables. |
FUN_factor |
the function to be applied to factor variables. This only applies if the variable in the original data is a factor. For variables converted to factor in a model-fitting formula, for example, |
FUN_logical |
the function to be applied to logical variables. |
FUN_numeric |
the function to be applied to numeric variables. |
FUN_integer |
the function to be applied to integer variables. |
FUN_binary |
the function to be applied to binary variables. |
FUN_other |
the function to be applied to other variable types. |
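As a minimal sketch of how the FUN_* arguments change the values at which unspecified variables are held, the example below compares the default grid with one built using FUN_numeric = median. It assumes the marginaleffects package is installed and uses the built-in mtcars data, in which mpg is a numeric (double) column; the object names dg_mean and dg_median are illustrative only.

```r
library(marginaleffects)

# Default grid: numeric variables not named in `...` are held at their mean.
dg_mean <- datagrid(newdata = mtcars, hp = c(100, 110))

# Same grid, but numeric variables not named in `...` are held at their median.
dg_median <- datagrid(newdata = mtcars, hp = c(100, 110), FUN_numeric = median)

# Compare the value at which `mpg` is held in each grid.
unique(dg_mean$mpg)
unique(dg_median$mpg)
```

Variables named explicitly in the dots (here hp) keep the user-supplied values regardless of the FUN_* arguments.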
If datagrid() is used in a predictions(), comparisons(), or slopes() call as the newdata argument, the model is automatically inserted in the model argument of the datagrid() call, and users do not need to specify either the model or newdata arguments. The same behavior will occur when the value supplied to newdata= is a function call which starts with "datagrid". This is intended to allow users to create convenience shortcuts like:
library(marginaleffects)
mod <- lm(mpg ~ am + vs + factor(cyl) + hp, mtcars)
# Wrapper whose name starts with "datagrid", so the model is inserted automatically
datagrid_bal <- function(...) datagrid(..., grid_type = "balanced")
predictions(mod, newdata = datagrid_bal(cyl = 4))
If users supply a model, the data used to fit that model is retrieved using the insight::get_data function.
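To see what data would be recovered from a fitted model, insight::get_data can be called directly. This is only a sketch of that retrieval step, assuming the insight package is installed; the exact set of columns returned may depend on the insight version and on how the model was fitted.

```r
# Fit a simple model on the built-in mtcars data.
mod <- lm(mpg ~ hp, data = mtcars)

# Retrieve the data associated with the fitted model.
dat <- insight::get_data(mod)

# Inspect which variables were recovered.
names(dat)
head(dat)
```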
A data.frame in which each row corresponds to one combination of the named predictors supplied by the user via the ... dots. Variables which are not explicitly defined are held at their mean or mode.
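The "one row per combination" behavior can be illustrated with a short sketch, assuming the marginaleffects package is installed. Supplying two values of hp and three values of cyl should yield a grid with 2 x 3 = 6 rows under the default grid_type; the object name dg is illustrative only.

```r
library(marginaleffects)

# Two values of `hp` crossed with three values of `cyl`.
dg <- datagrid(newdata = mtcars, hp = c(100, 110), cyl = c(4, 6, 8))

# One row per combination of the user-specified values.
nrow(dg)

# The distinct combinations that appear in the grid.
unique(dg[, c("hp", "cyl")])
```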
# The output only has 2 rows, and all the variables except `hp` are at their
# mean or mode.
datagrid(newdata = mtcars, hp = c(100, 110))
# We get the same result by feeding a model instead of a data.frame
mod <- lm(mpg ~ hp, mtcars)
datagrid(model = mod, hp = c(100, 110))
# Use in `marginaleffects` to compute "Typical Marginal Effects". When used
# in `slopes()` or `predictions()` we do not need to specify the
# `model` or `newdata` arguments.
slopes(mod, newdata = datagrid(hp = c(100, 110)))
# datagrid accepts functions
datagrid(hp = range, cyl = unique, newdata = mtcars)
comparisons(mod, newdata = datagrid(hp = fivenum))
# The full dataset is duplicated with each observation given counterfactual
# values of 100 and 110 for the `hp` variable. The original `mtcars` includes
# 32 rows, so the resulting dataset includes 64 rows.
dg <- datagrid(newdata = mtcars, hp = c(100, 110), grid_type = "counterfactual")
nrow(dg)
# We get the same result by feeding a model instead of a data.frame
mod <- lm(mpg ~ hp, mtcars)
dg <- datagrid(model = mod, hp = c(100, 110), grid_type = "counterfactual")
nrow(dg)