| div_gof | R Documentation |
Performs divergence-based goodness-of-fit tests for discrete data, including tests of uniformity, pairwise independence, conditional independence, and nested model comparisons.
div_gof(
dat,
var_uniform = NULL,
var1 = NULL,
var2 = NULL,
var_cond = NULL,
model_full = NULL,
model_reduced = NULL,
alpha = 0.05,
dec = 3,
use_approx_cv = TRUE
)
dat |
data frame with rows as observations and columns as variables. Variables must be categorical with finite range spaces. |
var_uniform |
character name of a variable in |
var1 |
character name of the first variable. |
var2 |
character name of the second variable. |
var_cond |
optional character vector of conditioning variables. |
model_full |
list containing |
model_reduced |
list containing |
alpha |
significance level. Default is 0.05. |
dec |
number of decimals for rounding. Default is 3. |
use_approx_cv |
logical; if |
The function implements four types of tests:
1. Uniformity
D = \log r_X - H(X)
2. Pairwise Independence
D = H(X) + H(Y) - H(X,Y)
3. Conditional Independence
D = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z)
where Z may also represent a vector of conditioning variables.
4. Nested Model Comparison
D = D_{reduced} - D_{full}
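As a rough illustration of the divergences listed above, the following base R sketch computes the uniformity and conditional independence divergences directly from empirical entropies. The helper `H()` and the data frame `toy` are hypothetical and not part of the package, and r_X is assumed here to be the number of observed categories of X; the package's own functions may handle these details differently.

```r
# Hypothetical helper (not part of the package): base-2 entropy of the
# joint empirical distribution of the columns named in `vars`.
H <- function(df, vars) {
  p <- as.vector(table(df[, vars, drop = FALSE])) / nrow(df)
  p <- p[p > 0]                 # empty cells contribute nothing
  -sum(p * log2(p))
}

# Toy data: x and y coincide when z = 0 and are unrelated when z = 1.
toy <- data.frame(
  x = c(0, 0, 1, 1, 0, 0, 1, 1),
  y = c(0, 0, 1, 1, 0, 1, 0, 1),
  z = c(0, 0, 0, 0, 1, 1, 1, 1)
)

# 1. Uniformity: D = log2(r_X) - H(X)
#    (assumption: r_X is taken as the number of observed categories)
D_unif <- log2(length(unique(toy$x))) - H(toy, "x")

# 3. Conditional independence: D = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z)
D_cond <- H(toy, c("x", "z")) + H(toy, c("y", "z")) -
  H(toy, "z") - H(toy, c("x", "y", "z"))
```

In this toy example x is exactly uniform, so `D_unif` is 0, while the dependence between x and y within the stratum z = 0 gives `D_cond` = 0.5 bits.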
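The test statistic is
2nD\log(2),
where n is the number of observations and \log(2) is the natural logarithm of 2. The factor \log(2) is needed because entropies are computed using base 2 logarithms.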
Smaller divergence values indicate better model fit.
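The relationship between the divergence and the test statistic can be sketched in base R for the pairwise independence case. This is a minimal illustration, not the package's implementation: the helper `entropy2()` and the toy vectors are hypothetical, the degrees of freedom are assumed to follow the usual (r - 1)(c - 1) rule for a two-way table, and the critical value is taken from `qchisq()` rather than from any approximation the function may apply via `use_approx_cv`.

```r
# Hypothetical helper (not part of the package): base-2 entropy of a
# table of counts, ignoring empty cells.
entropy2 <- function(counts) {
  p <- as.vector(counts) / sum(counts)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Toy data: two binary variables observed on n = 8 units.
x <- c(0, 0, 0, 0, 1, 1, 1, 1)
y <- c(0, 0, 0, 1, 0, 1, 1, 1)
n <- length(x)

# Pairwise independence divergence: D = H(X) + H(Y) - H(X,Y), in bits.
D <- entropy2(table(x)) + entropy2(table(y)) - entropy2(table(x, y))

# Test statistic 2 n D log(2); log(2) converts bits to natural-log units.
stat <- 2 * n * D * log(2)

# Cross-check against the likelihood-ratio (G) statistic computed from
# observed and expected counts under independence.
obs <- table(x, y)
expd <- outer(rowSums(obs), colSums(obs)) / n
G <- 2 * sum(obs[obs > 0] * log(obs[obs > 0] / expd[obs > 0]))

# Assumed degrees of freedom and exact chi-square critical value.
df <- (length(unique(x)) - 1) * (length(unique(y)) - 1)
crit <- qchisq(1 - 0.05, df)
reject <- stat > crit
```

Here `stat` and `G` agree up to floating-point error, which shows why the base 2 divergence has to be rescaled by \log(2) before it is compared with a chi-square critical value.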
Data frame with test type, divergence D, chi-square statistic, degrees of freedom, critical value, and decision.
Termeh Shafie
Frank, O., & Shafie, T. (2016). Multivariate entropy analysis of network data. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 129(1), 45-63.
joint_entropy, entropy_trivar
data(lawdata)
df_att <- lawdata[[4]]
att_var <- data.frame(
status = df_att$status - 1,
gender = df_att$gender,
office = df_att$office - 1,
years = ifelse(df_att$years <= 3, 0,
ifelse(df_att$years <= 13, 1, 2)),
age = ifelse(df_att$age <= 35, 0,
ifelse(df_att$age <= 45, 1, 2)),
practice = df_att$practice,
lawschool = df_att$lawschool - 1
)
## 1. Test uniformity
div_gof(att_var, var_uniform = "gender")
## 2. Test pairwise independence
div_gof(att_var, var1 = "status", var2 = "gender")
## 3. Test conditional independence
## (a) Conditional independence given a single variable
div_gof(att_var,
var1 = "status",
var2 = "gender",
var_cond = "years")
## (b) Conditional independence given multiple variables
div_gof(att_var,
var1 = "status",
var2 = "gender",
var_cond = c("years", "age"))
## 4. Nested model comparison
## Compare reduced models against the saturated empirical model.
## The saturated model has divergence D = 0 and df = 0.
m_full <- list(D = 0, df = 0)
## (a) Pairwise independence model
m_reduced <- div_gof(att_var,
var1 = "status",
var2 = "gender")
div_gof(att_var,
model_full = m_full,
model_reduced = list(D = m_reduced$D, df = m_reduced$df))
## (b) Conditional independence model
m_reduced <- div_gof(att_var,
var1 = "status",
var2 = "gender",
var_cond = "years")
div_gof(att_var,
model_full = m_full,
model_reduced = list(D = m_reduced$D, df = m_reduced$df))