get_datagrid | R Documentation |
Create a reference matrix, useful for visualisation, with evenly spread and
combined values. Usually used to generate predictions using get_predicted().
See this vignette for a tutorial on how to create a visualisation matrix using this function.
Alternatively, this function can also be used to extract the "grid" columns from objects generated by emmeans and marginaleffects (see those methods for more info).
get_datagrid(x, ...)
## S3 method for class 'data.frame'
get_datagrid(
x,
by = "all",
factors = "reference",
numerics = "mean",
length = 10,
range = "range",
preserve_range = FALSE,
protect_integers = TRUE,
digits = 3,
reference = x,
...
)
## S3 method for class 'numeric'
get_datagrid(
x,
length = 10,
range = "range",
protect_integers = TRUE,
digits = 3,
...
)
## S3 method for class 'factor'
get_datagrid(x, ...)
## Default S3 method:
get_datagrid(
x,
by = "all",
factors = "reference",
numerics = "mean",
preserve_range = TRUE,
reference = x,
include_smooth = TRUE,
include_random = FALSE,
include_response = FALSE,
data = NULL,
digits = 3,
verbose = TRUE,
...
)
x |
An object from which to construct the reference grid. |
... |
Arguments passed to or from other methods (for instance, |
by |
Indicates the focal predictors (variables) for the reference grid
and at which values focal predictors should be represented. If not specified
otherwise, representative values for numeric variables or predictors are
evenly distributed from the minimum to the maximum, with a total number of
There is a special handling of assignments with brackets, i.e. values
defined inside
Note: the The remaining variables not specified in |
factors |
Type of summary for factors not specified in |
numerics |
Type of summary for numeric values not specified in |
length |
Length of numeric target variables selected in In case of multiple continuous target variables, When
|
range |
Option to control the representative values given in
|
preserve_range |
In the case of combinations between numeric variables
and factors, setting |
protect_integers |
Defaults to |
digits |
Number of digits used for rounding numeric values specified in
|
reference |
The reference vector from which to compute the mean and SD.
Used when standardizing or unstandardizing the grid using |
include_smooth |
If |
include_random |
If |
include_response |
If |
data |
Optional, the data frame that was used to fit the model. Usually,
the data is retrieved via |
verbose |
Toggle warnings. |
Data grids are an (artificial or theoretical) representation of the sample.
They consist of predictors of interest (so-called focal predictors) and
meaningful values at which the sample characteristics (focal predictors)
should be represented. The focal predictors are selected in by. To select
meaningful (or representative) values, either use by, or use a combination
of the arguments length and range.
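The two ways of choosing representative values described above can be sketched as follows. This is a minimal illustration, assuming the insight package is installed; the object names grid_by and grid_lr are only illustrative and are not part of the package.

```r
library(insight)

# Option 1: give the representative values directly inside `by`
grid_by <- get_datagrid(iris, by = "Sepal.Length = c(5, 6, 7)")

# Option 2: name the focal predictor in `by` and let `length` and `range`
# determine the values (here: 5 evenly spaced values from minimum to maximum)
grid_lr <- get_datagrid(iris, by = "Sepal.Length", length = 5, range = "range")

# In both grids, the non-focal variables are held constant
# (numerics at their mean, factors at their reference level by default)
grid_by
grid_lr
```

In both cases only Sepal.Length varies across rows; the remaining columns are fixed according to the numerics and factors arguments.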
Reference grid data frame.
get_predicted() to extract predictions, for which the data grid is useful,
and see the methods for objects generated by emmeans and marginaleffects
to extract the "grid" columns.
# Datagrids of variables and dataframes =====================================
data(iris)
data(mtcars)
# Single variable is of interest; all others are "fixed" ------------------
# Factors, returns all the levels
get_datagrid(iris, by = "Species")
# Specify an expression
get_datagrid(iris, by = "Species = c('setosa', 'versicolor')")
# Numeric variables, default spread length = 10
get_datagrid(iris, by = "Sepal.Length")
# change length
get_datagrid(iris, by = "Sepal.Length", length = 3)
# change non-targets fixing
get_datagrid(iris[2:150, ],
by = "Sepal.Length",
factors = "mode", numerics = "median"
)
# change min/max of target
get_datagrid(iris, by = "Sepal.Length", range = "ci", ci = 0.90)
# Manually change min/max
get_datagrid(iris, by = "Sepal.Length = c(0, 1)")
# -1 SD, mean and +1 SD
get_datagrid(iris, by = "Sepal.Length = [sd]")
# rounded to 1 digit
get_datagrid(iris, by = "Sepal.Length = [sd]", digits = 1)
# identical to previous line: -1 SD, mean and +1 SD
get_datagrid(iris, by = "Sepal.Length", range = "sd", length = 3)
# quartiles
get_datagrid(iris, by = "Sepal.Length = [quartiles]")
# Standardization and unstandardization
data <- get_datagrid(iris, by = "Sepal.Length", range = "sd", length = 3)
# It is a named vector (extract names with `names(data$Sepal.Length)`)
data$Sepal.Length
datawizard::standardize(data, select = "Sepal.Length")
# Manually specify values
data <- get_datagrid(iris, by = "Sepal.Length = c(-2, 0, 2)")
data
datawizard::unstandardize(data, select = "Sepal.Length")
# Multiple variables are of interest, creating a combination --------------
get_datagrid(iris, by = c("Sepal.Length", "Species"), length = 3)
get_datagrid(iris, by = c("Sepal.Length", "Petal.Length"), length = c(3, 2))
get_datagrid(iris, by = c(1, 3), length = 3)
get_datagrid(iris, by = c("Sepal.Length", "Species"), preserve_range = TRUE)
get_datagrid(iris, by = c("Sepal.Length", "Species"), numerics = 0)
get_datagrid(iris, by = c("Sepal.Length = 3", "Species"))
get_datagrid(iris, by = c("Sepal.Length = c(3, 1)", "Species = 'setosa'"))
# specify length individually for each focal predictor
# values are matched by names
get_datagrid(mtcars[1:4], by = c("mpg", "hp"), length = c(hp = 3, mpg = 2))
# Numeric and categorical variables, generating a grid for plots
# default spread when numerics are first: length = 10
get_datagrid(iris, by = c("Sepal.Length", "Species"), range = "grid")
# default spread when numerics are not first: length = 3 (-1 SD, mean and +1 SD)
get_datagrid(iris, by = c("Species", "Sepal.Length"), range = "grid")
# range of values
get_datagrid(iris, by = c("Sepal.Width = 1:5", "Petal.Width = 1:3"))
# With list-style by-argument
get_datagrid(
iris,
by = list(Sepal.Length = 1:3, Species = c("setosa", "versicolor"))
)
# With models ===============================================================
# Fit a linear regression
model <- lm(Sepal.Length ~ Sepal.Width * Petal.Length, data = iris)
# Get datagrid of predictors
data <- get_datagrid(model, length = c(20, 3), range = c("range", "sd"))
# same as: get_datagrid(model, range = "grid", length = 20)
# Add predictions
data$Sepal.Length <- get_predicted(model, data = data)
# Visualize relationships (each color is at -1 SD, Mean, and + 1 SD of Petal.Length)
plot(data$Sepal.Width, data$Sepal.Length,
col = data$Petal.Length,
main = "Relationship at -1 SD, Mean, and + 1 SD of Petal.Length"
)
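Because the reference grid is returned as a data frame, it can presumably also be passed as newdata to base R's predict(). The following is a minimal sketch under that assumption, with the insight package installed; the object names grid and pred are illustrative only.

```r
library(insight)

# Fit the same kind of model as in the example above
model <- lm(Sepal.Length ~ Sepal.Width * Petal.Length, data = iris)

# Build a small grid over both predictors (3 values each)
grid <- get_datagrid(model, by = c("Sepal.Width", "Petal.Length"), length = 3)

# Use the grid as `newdata` for base R predictions:
# one predicted value per grid row
pred <- predict(model, newdata = grid)

# Attach the predictions to the grid for inspection or plotting
grid$Predicted <- pred
grid
```

This mirrors the get_predicted() workflow shown above and can be useful when a downstream function expects the output of predict().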