dig_grid | R Documentation |
This function creates a grid of column names specified by xvars and yvars
(see var_grid()). After that, it enumerates all conditions created from data
in x (by calling dig()), and for each such condition and for each row of the
grid of combinations, a user-defined function f is executed on the sub-data
created from x by selecting all rows of x that satisfy the generated condition
and the columns named in the grid's row.
This function is useful for searching for patterns that are based on the
relationships between pairs of columns, such as in dig_correlations().
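The enumeration described above can be sketched in base R. This is only a conceptual illustration under simplifying assumptions, not the package's implementation: conditions are restricted to single logical predicates, the grid is built with expand.grid() rather than var_grid(), and the names x, grid, conditions and sketch_result are made up for the illustration.

```r
# Conceptual sketch in base R only; this is NOT how dig_grid() is implemented.
# Assumption: every condition consists of a single logical predicate.
x <- data.frame(
  is_setosa    = iris$Species == "setosa",
  is_virginica = iris$Species == "virginica",
  iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]
)

# Callback: mean difference of the two selected columns and the sub-data size.
f <- function(pd) {
  list(m = mean(pd[[1]] - pd[[2]]), n = nrow(pd))
}

# Grid of (xvar, yvar) column-name combinations.
grid <- expand.grid(
  xvar = c("Sepal.Length", "Sepal.Width"),
  yvar = c("Petal.Length", "Petal.Width"),
  stringsAsFactors = FALSE
)

conditions <- c("is_setosa", "is_virginica")

rows <- list()
for (cond in conditions) {
  for (i in seq_len(nrow(grid))) {
    # Sub-data: rows satisfying the condition, columns named in the grid row.
    pd <- x[x[[cond]], c(grid$xvar[i], grid$yvar[i]), drop = FALSE]
    # The list returned by f() becomes the remaining columns of one result row.
    rows[[length(rows) + 1L]] <- data.frame(
      condition = cond, xvar = grid$xvar[i], yvar = grid$yvar[i], f(pd)
    )
  }
}
sketch_result <- do.call(rbind, rows)
print(sketch_result)
```

Each of the two conditions is combined with each of the four grid rows, so the sketch calls f() eight times and yields one row per call.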
dig_grid(
  x,
  f,
  condition = where(is.logical),
  xvars = where(is.numeric),
  yvars = where(is.numeric),
  disjoint = var_names(colnames(x)),
  allow = "all",
  na_rm = FALSE,
  type = "crisp",
  min_length = 0L,
  max_length = Inf,
  min_support = 0,
  max_support = 1,
  max_results = Inf,
  verbose = FALSE,
  threads = 1L,
  error_context = list(
    arg_x = "x", arg_f = "f", arg_condition = "condition",
    arg_xvars = "xvars", arg_yvars = "yvars", arg_disjoint = "disjoint",
    arg_allow = "allow", arg_na_rm = "na_rm", arg_type = "type",
    arg_min_length = "min_length", arg_max_length = "max_length",
    arg_min_support = "min_support", arg_max_support = "max_support",
    arg_max_results = "max_results", arg_verbose = "verbose",
    arg_threads = "threads", call = current_env()
  )
)
x |
a matrix or data frame with data to search in. |
f |
the callback function to be executed for each generated condition.
The arguments of the callback function differ based on the value of the
type argument.
In all cases, the function must return a list of scalar values, which will be converted into a single row of the resulting tibble. |
condition |
a tidyselect expression (see tidyselect syntax) specifying the columns to use as condition predicates. The selected columns must be logical or numeric. If numeric, fuzzy conditions are considered. |
xvars |
a tidyselect expression (see
tidyselect syntax)
specifying the columns of |
yvars |
|
disjoint |
an atomic vector of size equal to the number of columns of |
allow |
a character string specifying which columns are allowed to be
selected by
|
na_rm |
a logical value indicating whether to remove rows with missing
values from sub-data before the callback function |
type |
a character string specifying the type of conditions to be processed.
The |
min_length |
the minimum size (the minimum number of predicates) of the condition to be generated (must be greater than or equal to 0). If 0, the empty condition is generated first. |
max_length |
the maximum size (the maximum number of predicates) of the condition to be generated. If equal to Inf, the maximum length of conditions is limited only by the number of available predicates. |
min_support |
the minimum support of a condition to trigger the callback
function for it. The support of the condition is the relative frequency
of the condition in the dataset |
max_support |
the maximum support of a condition to trigger the callback
function for it. See argument min_support for the definition of support. |
max_results |
the maximum number of generated conditions to execute the
callback function on. If the number of found conditions exceeds
|
verbose |
a logical scalar indicating whether to print progress messages. |
threads |
the number of threads to use for parallel computation. |
error_context |
a list of details to be used in error messages.
This argument is useful when
|
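The support thresholds in the argument list (min_support, max_support) can be illustrated with a small base-R sketch. For crisp data, the relative frequency of a condition is the share of rows in which all of its predicates are TRUE. The fuzzy line below is an assumption for illustration only (average truth degree of a single predicate); how the package combines several fuzzy predicates, and whether the threshold bounds are inclusive, is not stated in this page. The helper passes() is hypothetical.

```r
# Crisp predicates: logical vectors with one value per row of the data.
is_setosa  <- iris$Species == "setosa"
long_sepal <- iris$Sepal.Length > 5

# Support of a single-predicate condition: relative frequency of TRUE.
crisp_support <- mean(is_setosa)

# Support of a two-predicate condition: both predicates must hold in a row.
support_both <- mean(is_setosa & long_sepal)

# ASSUMPTION: for a single fuzzy predicate, support is taken here as the
# average truth degree. The degrees are made-up illustration values.
degrees <- c(0, 0.25, 0.5, 1)
fuzzy_support <- mean(degrees)

# Hypothetical helper mimicking the min_support / max_support filter.
# ASSUMPTION: both bounds are treated as inclusive.
min_support <- 0.2
max_support <- 0.9
passes <- function(s) s >= min_support && s <= max_support

print(c(crisp = crisp_support, both = support_both, fuzzy = fuzzy_support))
print(passes(crisp_support))
```

Adding a predicate to a crisp condition can only keep or lower its support, which is why a minimum support threshold also limits how long the generated conditions can usefully become.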
A tibble with found patterns. Each row represents a single call of
the callback function f.
Michal Burda
dig(), var_grid(); see also dig_correlations() and
dig_paired_baseline_contrasts(), as they are using this function internally.
# *** Example of crisp (boolean) patterns:
# dichotomize iris$Species
crispIris <- partition(iris, Species)

# a simple callback function that computes mean difference of `xvar` and `yvar`
f <- function(pd) {
    list(m = mean(pd[[1]] - pd[[2]]),
         n = nrow(pd))
}

# call f() for each condition created from column `Species`
dig_grid(crispIris,
         f,
         condition = starts_with("Species"),
         xvars = starts_with("Sepal"),
         yvars = starts_with("Petal"),
         type = "crisp")
# *** Example of fuzzy patterns:
# create fuzzy sets from Sepal columns
fuzzyIris <- partition(iris,
                       starts_with("Sepal"),
                       .method = "triangle",
                       .breaks = 3)

# a simple callback function that computes a weighted mean of the difference
# of `xvar` and `yvar`
f <- function(d, weights) {
    list(m = weighted.mean(d[[1]] - d[[2]], w = weights),
         w = sum(weights))
}

# call f() for each fuzzy condition created from the fuzzy sets whose
# names start with "Sepal"
dig_grid(fuzzyIris,
         f,
         condition = starts_with("Sepal"),
         xvars = Petal.Length,
         yvars = Petal.Width,
         type = "fuzzy")
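Before passing a fuzzy callback to dig_grid(), it can help to call it directly on hand-made inputs. The following base-R sketch does that for the callback from the fuzzy example. The triangle() helper and its break points are hypothetical stand-ins written for this illustration; they are not the fuzzy sets that partition() produces, and the weights are assumed to be per-row values in [0, 1].

```r
# The fuzzy callback from the example above.
f <- function(d, weights) {
  list(m = weighted.mean(d[[1]] - d[[2]], w = weights),
       w = sum(weights))
}

# Hypothetical triangular membership function (NOT the one used by
# partition()): 0 at `lo` and `hi`, 1 at `peak`, linear in between.
triangle <- function(v, lo, peak, hi) {
  pmax(0, pmin((v - lo) / (peak - lo), (hi - v) / (hi - peak)))
}

# Two-column sub-data in the order (xvar, yvar).
d <- iris[, c("Petal.Length", "Petal.Width")]

# One hand-made weight per row, assumed to lie in [0, 1].
weights <- triangle(iris$Sepal.Length, lo = 4.3, peak = 6.1, hi = 7.9)

res <- f(d, weights)
str(res)

# Sanity check: with all weights equal to 1, the weighted mean reduces to
# the plain mean and the weight sum equals the number of rows.
all_ones <- f(d, rep(1, nrow(d)))
str(all_ones)
```

The callback returns a list of two scalars, which is the shape required for it to be turned into one row of the result.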