domin | R Documentation |
formula-based modeling functions
Computes dominance statistics for predictive modeling functions that accept a formula.
domin(
formula_overall,
reg,
fitstat,
sets = NULL,
all = NULL,
conditional = TRUE,
complete = TRUE,
consmodel = NULL,
reverse = FALSE,
...
)
formula_overall |
An object of class A valid |
reg |
A function implementing the predictive (or "reg"ression) model called. String function names (e.g., "lm"), function names (e.g., The predictive model in |
fitstat |
List providing arguments to call a fit statistic extracting function (see details). The The first element of The second element of All list elements beyond the second are submitted as additional arguments to the fit extractor function call. The fit statistic extractor function in the first list element of The fit statistic produced must be scalar valued (i.e., vector of length 1). |
sets |
A list with each element comprised of vectors containing variable/factor names or Each separate list element-vector in |
all |
A vector of variable/factor names or The entries in |
conditional |
Logical. If If conditional dominance is not desired as an importance criterion, avoiding computing the conditional dominance matrix can save computation time. |
complete |
Logical. If If complete dominance is not desired as an importance criterion, avoiding computing complete dominance designations can save computation time. |
consmodel |
A vector of variable/factor names, The use of Typical usage of As such, this vector is used to set a baseline for the fit statistic when it is non-0. |
reverse |
Logical. If This argument should be changed to |
... |
Additional arguments passed to the function call in the reg argument. |
domin automates the computation of all possible combinations of entries to the dominance analysis (DA), the creation of formula objects based on those entries, the modeling calls/fit statistic capture, and the computation of all the dominance statistics for the user.
domin accepts only a "deconstructed" set of inputs and "reconstructs" them prior to formulating a coherent predictive modeling call.
One specific instance of this deconstruction is in generating the number of entries to the DA. The number of entries is taken as all the terms from formula_overall and the separate list element vectors from sets. The entries themselves are concatenated into a single formula, combined with the entries in all, and submitted to the predictive modeling function in reg. Each different combination of entries to the DA forms a different formula and thus a different model to estimate.
For example, consider this domin call:
domin(y ~ x1 + x2, lm, list(summary, "r.squared"), sets = list(c("x3", "x4")), all = c("c1", "c2"), data = mydata)
This call records three entries and results in seven (i.e., 2^3 - 1) different combinations:
x1
x2
x3, x4
x1, x2
x1, x3, x4
x2, x3, x4
x1, x2, x3, x4
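The enumeration above can be reproduced with a few lines of base R. This is only an illustrative sketch, not domin's internal code; the object names entries, combos, and combo_labels are hypothetical.

```r
# Hypothetical entries mirroring the example call: two terms from the
# formula plus one set. Each list element is one entry to the DA.
entries <- list("x1", "x2", c("x3", "x4"))

# All non-empty subsets of the entry indices, ordered by subset size.
combos <- unlist(
  lapply(seq_along(entries),
         function(k) combn(seq_along(entries), k, simplify = FALSE)),
  recursive = FALSE
)

# Collapse each subset into the variable names it contributes.
combo_labels <- vapply(
  combos,
  function(idx) paste(unlist(entries[idx]), collapse = ", "),
  character(1)
)

length(combo_labels)  # 2^3 - 1 combinations
print(combo_labels)
```

Because a set counts as a single entry, x3 and x4 always enter or leave a combination together.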
domin parses formula_overall to obtain all the terms in it and combines them with sets. When parsing formula_overall, only the processing that is available in the stats package is applied. Note that domin is not programmed to process terms of order > 1 (i.e., interactions/products) appropriately; that is, it does not restrict such terms to models that also contain their lower-order component terms. domin also does not allow offset terms.
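As a rough illustration of the kind of formula processing the stats package offers, the sketch below uses terms() to pull term labels and term orders out of a formula. That domin relies on terms() specifically is an assumption here; the formulas are the hypothetical ones from the example.

```r
# Term labels of a main-effects formula: each label is one DA entry.
f <- y ~ x1 + x2
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# With an interaction, terms() reports the order of every term.
# A term of order 2 (x1:x2) is the kind domin does not handle.
f_int <- y ~ x1 * x2
int_labels <- attr(terms(f_int), "term.labels")
int_orders <- attr(terms(f_int), "order")
print(data.frame(term = int_labels, order = int_orders))
```

Checking the order attribute before calling domin is one way to catch interaction terms early.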
From these combinations, the predictive models are constructed and called. The predictive model call includes the entries in all, applies the appropriate formula, and reconstructs the function itself. The seven combinations above imply the following series of predictive model calls:
lm(y ~ x1 + c1 + c2, data = mydata)
lm(y ~ x2 + c1 + c2, data = mydata)
lm(y ~ x3 + x4 + c1 + c2, data = mydata)
lm(y ~ x1 + x2 + c1 + c2, data = mydata)
lm(y ~ x1 + x3 + x4 + c1 + c2, data = mydata)
lm(y ~ x2 + x3 + x4 + c1 + c2, data = mydata)
lm(y ~ x1 + x2 + x3 + x4 + c1 + c2, data = mydata)
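The reconstruction step can be imitated in base R with reformulate(). The sketch below substitutes mtcars for the hypothetical mydata and picks arbitrary variables (am, vs, a set of carb and gear, and cyl in the role of all); it illustrates the idea and is not domin's own implementation.

```r
# Stand-ins for the example: two formula terms, one set, one "all" term.
entries   <- list("am", "vs", c("carb", "gear"))
all_terms <- "cyl"

# Every non-empty combination of entries.
combos <- unlist(
  lapply(seq_along(entries),
         function(k) combn(seq_along(entries), k, simplify = FALSE)),
  recursive = FALSE
)

# One formula per combination; the "all" term is appended to each.
formulas <- lapply(
  combos,
  function(idx) reformulate(c(unlist(entries[idx]), all_terms),
                            response = "mpg")
)

# One fitted model per formula.
fits <- lapply(formulas, lm, data = mtcars)

for (f in formulas) print(f)
```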
It is possible to use domin with only sets (i.e., no IVs in formula_overall; see examples below). There must be at least two entries to the DA for domin to run.
All the called predictive models are submitted to the fit extractor function implied by the entries in fitstat. Again applying the example above, all seven predictive models' objects would be individually passed as follows:
summary(lm_obj)["r.squared"]
where lm_obj is the model object returned by lm.
The entries to fitstat must be supplied as a list and follow a specific structure:
list(fit_function, element_name, ...)
fit_function
First element; the function to be applied to the object produced by the reg function.
element_name
Second element; the name of the element from the object returned by fit_function to be used as the fit statistic. The fit statistic must be scalar-valued (length 1).
...
Subsequent elements; additional arguments passed to fit_function.
In the case that the model object returned by reg includes its own fit statistic without the need for an extractor function, the user can apply an anonymous function following the required format to extract it.
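To make the fitstat structure concrete, here is a small base R sketch of how such a list could be applied to a fitted model. The helper apply_fitstat is hypothetical and only mimics the protocol described above; it is not a function from the package.

```r
# Hypothetical helper: apply a fitstat-style list to a fitted model.
apply_fitstat <- function(model, fitstat) {
  extra_args <- fitstat[-(1:2)]                  # elements beyond the second
  result <- do.call(fitstat[[1]], c(list(model), extra_args))
  value  <- result[[fitstat[[2]]]]               # pick the named element
  stopifnot(is.numeric(value), length(value) == 1)  # must be scalar
  value
}

lm_fit  <- lm(mpg ~ am + vs, data = mtcars)
glm_fit <- glm(am ~ mpg, data = mtcars, family = binomial)

# Extractor given as a function name, as in the examples section.
r2_value <- apply_fitstat(lm_fit, list("summary", "r.squared"))

# The glm object already stores its deviance, so an anonymous function
# simply wraps it in a named list to match the required format.
dev_value <- apply_fitstat(glm_fit,
                           list(function(obj) list(dev = obj$deviance), "dev"))

# A third list element is passed on as an extra argument (here k).
aic_value <- apply_fitstat(
  lm_fit,
  list(function(obj, k) list(aic = extractAIC(obj, k = k)[[2]]), "aic", 2)
)
```

Wrapping the value in a named list is what lets the second element of fitstat select it by name.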
Returns an object of class "domin".
An object of class "domin" is a list composed of the following elements:
General_Dominance
Vector of general dominance statistics.
Standardized
Vector of general dominance statistics normalized to sum to 1.
Ranks
Vector of ranks applied to the general dominance statistics.
Conditional_Dominance
Matrix of conditional dominance statistics. Each row represents a term; each column represents an order of terms.
Complete_Dominance
Logical matrix of complete dominance designations. The term represented in each row indicates dominance status; the term represented in each column indicates dominated-by status.
Fit_Statistic_Overall
Value of fit statistic for the full model.
Fit_Statistic_All_Subsets
Value of fit statistic associated with terms in all.
Fit_Statistic_Constant_Model
Value of fit statistic associated with terms in consmodel.
Call
The matched call.
Subset_Details
List containing the full model and descriptions of terms in the full model by source.
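For intuition about the first three elements, the following base R sketch computes general dominance statistics by hand for a three-term model, using the standard dominance analysis definition (average the increments to the fit statistic within each subset size, then average across sizes). It is an illustration under that assumption rather than domin's own code; the helper r2 is hypothetical, and ranking the largest statistic first is assumed to match the default reverse = FALSE.

```r
ivs <- c("am", "vs", "cyl")

# Hypothetical helper: R-squared for a given subset of terms (0 if empty).
r2 <- function(vars) {
  if (length(vars) == 0) return(0)
  fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
  summary(fit)[["r.squared"]]
}

general_dominance <- vapply(ivs, function(iv) {
  others <- setdiff(ivs, iv)
  # Average increment from adding iv to every subset of each size.
  by_order <- vapply(0:length(others), function(k) {
    subsets <- if (k == 0) list(character(0))
               else combn(others, k, simplify = FALSE)
    mean(vapply(subsets, function(s) r2(c(s, iv)) - r2(s), numeric(1)))
  }, numeric(1))
  mean(by_order)  # average across subset sizes
}, numeric(1))

standardized <- general_dominance / sum(general_dominance)
ranks <- rank(-general_dominance)  # largest statistic ranked first

print(general_dominance)
print(standardized)
print(ranks)
```

The general dominance statistics sum to the fit statistic of the full model, which is why the standardized vector sums to 1.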
domin is an R port of the Stata command with the same name (see Luchman, 2021).
domin has been superseded by domir.
Luchman, J. N. (2021). Relative importance analysis in Stata using dominance analysis: domin and domme. The Stata Journal, 21(2). doi: 10.1177/1536867X211025837.
## Basic linear model with r-square
domin(mpg ~ am + vs + cyl,
      lm,
      list("summary", "r.squared"),
      data = mtcars)

## Linear model including sets
domin(mpg ~ am + vs + cyl,
      lm,
      list("summary", "r.squared"),
      data = mtcars,
      sets = list(c("carb", "gear"), c("disp", "wt")))

## Multivariate linear model with custom multivariate r-square function
## and all subsets variable
Rxy <- function(obj, names, data) {
  return(list("r2" = cancor(predict(obj),
    as.data.frame(mget(names, as.environment(data))))[["cor"]][1]^2))
}
domin(cbind(wt, mpg) ~ vs + cyl + am,
      lm,
      list(Rxy, "r2", c("mpg", "wt"), mtcars),
      data = mtcars,
      all = c("carb"))

## Sets only
domin(mpg ~ 1,
      lm,
      list("summary", "r.squared"),
      data = mtcars,
      sets = list(c("am", "vs"), c("cyl", "disp"), c("qsec", "carb")))

## Constant model using AIC
domin(mpg ~ am + carb + cyl,
      lm,
      list(function(x) list(aic = extractAIC(x)[[2]]), "aic"),
      data = mtcars,
      reverse = TRUE, consmodel = "1")