| resolve_x_cols | R Documentation |
Resolve x_cols and exclude_cols to their standardized format of x_cols to specify which 1D and 2D ALE elements are required. This specification is used throughout the ALE package. x_cols specifies the desired columns or interactions whereas exclude_cols optionally specifies any columns or interactions to remove from x_cols. The result is x_cols – exclude_cols, giving considerable flexibility in specifying the precise columns desired.
resolve_x_cols(x_cols, col_names, y_col, exclude_cols = NULL, silent = FALSE)
x_cols |
character, list, or formula. Columns and interactions requested in one of the special |
col_names |
character. All the column names from a dataset. All values in |
y_col |
character(1). The y outcome column. If found in any |
exclude_cols |
Same possible formats as |
silent |
logical(1). If |
x_cols in canonical format, which is always a list with two elements, d1 and d2. Each element is a character vector with each requested column for 1D ALE (d1) or 2D ALE interaction pair (d2). If either dimension is empty, its value is an empty character, character().
See examples for details.
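The "x_cols – exclude_cols" idea can be pictured as a per-dimension set difference on two specifications that are already in canonical format. The following is a minimal base R sketch under that assumption; `subtract_cols` is a hypothetical helper written for illustration and is not a function of the ALE package.

```r
# Hypothetical helper (not part of the ALE package): subtract one canonical
# specification from another, one dimension at a time.
subtract_cols <- function(x_cols, exclude_cols) {
  list(
    d1 = setdiff(x_cols$d1, exclude_cols$d1),
    d2 = setdiff(x_cols$d2, exclude_cols$d2)
  )
}

# Both inputs are assumed to already be lists with d1 and d2 elements
x_cols <- list(d1 = c("a", "b", "c"), d2 = c("a:b", "a:c"))
exclude_cols <- list(d1 = "a", d2 = character())

result <- subtract_cols(x_cols, exclude_cols)
str(result)

# Excluding everything leaves an empty character vector in each dimension
nothing_left <- subtract_cols(x_cols, x_cols)
```

The real function also has to convert every input format to canonical form before any subtraction can happen; this sketch skips that step.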
x_cols format options
The x_cols argument determines which predictor variables and interactions are included in the analysis. It supports multiple input formats:
Character vector: Users can explicitly specify 1D terms and 2D ALE interactions, e.g., c("a", "b", "a:b", "a:c").
Formula (~): Allows specifying variables and interactions in formula notation (e.g., ~ a + b + a:b), which is automatically converted into a structured format. The outcome term is optional and is ignored if present. For example, ~ a + b + a:b produces the same result as whatever ~ a + b + a:b.
List format:
The basic list format is a list of character vectors named d1 for 1D ALE terms, d2 for 2D interactions, or both. For example, list(d1 = c("a", "b"), d2 = c("a:b", "a:c"))
Boolean selection for an entire dimension:
list(d1 = TRUE) selects all available variables for 1D ALE, excluding y_col.
list(d2 = TRUE) selects all possible 2D interactions among all columns in col_names, excluding y_col.
A character vector named d2_all, containing 1D terms only, may be used to include all 2D interactions that involve the specified terms. For example, specifying list(d2_all = "a") would select c("a:b", "a:c", "a:d"), etc. This is in addition to any terms requested in the d1 or d2 elements.
NULL (or unspecified): If x_cols = NULL, no variables are selected.
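To make the formats above more concrete, here is a rough base R sketch of how a formula, a whole-dimension d2 selection, and a d2_all request could each be expanded into explicit terms. It only illustrates the idea with `terms()` and `combn()`; it is not the package's internal code, and the column names are assumed for the example.

```r
# Illustrative only: base R, not the ALE package's internal code.
col_names <- c("y", "a", "b", "c")  # assumed example columns
y_col <- "y"

# Formula: term labels never include the left-hand side
labels <- attr(terms(whatever ~ a + b + a:b), "term.labels")
is_interaction <- grepl(":", labels, fixed = TRUE)
from_formula <- list(d1 = labels[!is_interaction], d2 = labels[is_interaction])

# d2 = TRUE: every pair of predictors, excluding y_col
predictors <- setdiff(col_names, y_col)
pairs <- combn(predictors, 2)  # 2-row matrix, one column per pair
all_d2 <- paste(pairs[1, ], pairs[2, ], sep = ":")

# d2_all = "a": keep only the pairs that involve "a"
involves_a <- vapply(
  strsplit(all_d2, ":", fixed = TRUE),
  function(pair) "a" %in% pair,
  logical(1)
)
d2_with_a <- all_d2[involves_a]
```

With three predictors, `all_d2` holds three pairs and `d2_with_a` keeps the two that contain "a".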
The function ensures all variables are valid and in col_names, providing informative messages unless silent = TRUE. Regardless of the specification format, the result is always standardized to the format described in the return value. Note that y_col is not removed if it is explicitly included in x_cols. However, a message alerts the user when it is included, in case it is a mistake.
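The kind of check described here can be sketched in base R as follows. `check_terms` is a hypothetical helper written for this illustration; how resolve_x_cols itself reports or handles invalid names is not shown in this page, so the message wording and the return value below are assumptions.

```r
# Hypothetical helper (not part of the ALE package): find requested terms
# that refer to columns missing from col_names.
check_terms <- function(terms, col_names, y_col, silent = FALSE) {
  # Split interaction terms such as "a:z" into their component columns
  vars <- unique(unlist(strsplit(terms, ":", fixed = TRUE)))
  if (!silent && y_col %in% vars) {
    message("Note: y_col '", y_col, "' is among the requested terms.")
  }
  # Return the column names that do not exist in the dataset
  setdiff(vars, col_names)
}

col_names <- c("y", "a", "b", "c")  # assumed example columns
unknown <- check_terms(c("a", "a:z", "y"), col_names, y_col = "y")
unknown_quiet <- check_terms(c("a", "a:z", "y"), col_names, y_col = "y", silent = TRUE)
```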
Run examples for details.
## Sample data
set.seed(0)
df <- data.frame(
  y = runif(10),
  a = sample(letters[1:3], 10, replace = TRUE),
  b = rnorm(10),
  c = sample(1:5, 10, replace = TRUE)
)
col_names <- names(df)
y_col <- "y" # Assume 'y' is the outcome variable
## Examples with just x_cols to show different formats for specifying x_cols
## (same format for exclude_cols)
# Character vector: Simple ALE with no interactions
resolve_x_cols(c("a", "b"), col_names, y_col)
# Character string: Select just one 1D element
resolve_x_cols("c", col_names, y_col)
# Character vector mixing 1D terms and 2D interactions: specify precise 1D and 2D elements desired
resolve_x_cols(c('a:b', "c", 'c:a', "b"), col_names, y_col)
# Formula: Converts to a list of individual elements
resolve_x_cols(~ a + b, col_names, y_col)
# Formula with interactions (1D and 2D).
# This format is probably more convenient if you know precisely which terms you want.
# Note that the outcome on the left-hand-side is always silently ignored.
resolve_x_cols(whatever ~ a + b + a:b + c:b, col_names, y_col)
# List specifying d1 (1D ALE)
resolve_x_cols(list(d1 = c("a", "b")), col_names, y_col)
# List specifying d2 (2D ALE)
resolve_x_cols(list(d2 = 'a:b'), col_names, y_col)
# List specifying both d1 and d2
resolve_x_cols(list(d1 = c("a", "b"), d2 = 'a:b'), col_names, y_col)
# d1 as TRUE (select all columns except y_col)
resolve_x_cols(list(d1 = TRUE), col_names, y_col)
# d2 as TRUE (select all possible 2D interactions)
resolve_x_cols(list(d2 = TRUE), col_names, y_col)
# d2_all: Request all 2D interactions involving a specific variable
resolve_x_cols(list(d2_all = "a"), col_names, y_col)
# NULL: No variables selected
resolve_x_cols(NULL, col_names, y_col)
## Examples of how exclude_cols are removed from x_cols to obtain various desired results
# Exclude one column from a simple character vector
resolve_x_cols(
  x_cols = c("a", "b", "c"),
  col_names = col_names,
  y_col = y_col,
  exclude_cols = "b"
)
# Exclude multiple columns
resolve_x_cols(
  x_cols = c("a", "b", "c"),
  col_names = col_names,
  y_col = y_col,
  exclude_cols = c("a", "c")
)
# Exclude an interaction term from a formula input
resolve_x_cols(
  x_cols = ~ a + b + a:b,
  col_names = col_names,
  y_col = y_col,
  exclude_cols = ~ a:b
)
# Exclude all columns from x_cols
resolve_x_cols(
  x_cols = c("a", "b", "c"),
  col_names = col_names,
  y_col = y_col,
  exclude_cols = c("a", "b", "c")
)
# Exclude non-existent columns (should be ignored)
resolve_x_cols(
  x_cols = c("a", "b"),
  col_names = col_names,
  y_col = y_col,
  exclude_cols = "z"
)
# Exclude one column from a list-based input
resolve_x_cols(
  x_cols = list(d1 = c("a", "b"), d2 = c("a:b", "a:c")),
  col_names = col_names,
  y_col = y_col,
  exclude_cols = list(d1 = "a")
)
# Exclude interactions only
resolve_x_cols(
  x_cols = list(d1 = c("a", "b", "c"), d2 = c("a:b", "a:c")),
  col_names = col_names,
  y_col = y_col,
  exclude_cols = list(d2 = 'a:b')
)
# Exclude everything, including interactions
resolve_x_cols(
  x_cols = list(d1 = c("a", "b", "c"), d2 = c("a:b", "a:c")),
  col_names = col_names,
  y_col = y_col,
  exclude_cols = list(d1 = c("a", "b", "c"), d2 = c("a:b", "a:c"))
)
# Exclude a column when x_cols also contains y_col ("y")
resolve_x_cols(
  x_cols = c("y", "a", "b"),
  col_names = col_names,
  y_col = "y",
  exclude_cols = "a"
)
# Exclude entire 2D dimension from x_cols with d2 = TRUE
resolve_x_cols(
  x_cols = list(d1 = TRUE, d2 = c("a:b", "a:c")),
  col_names = col_names,
  y_col = y_col,
  exclude_cols = list(d1 = c("a"), d2 = TRUE)
)