View source: R/ggm_compare_ppc.default.R
ggm_compare_ppc | R Documentation |
Compare GGMs with a posterior predictive check (Gelman et al., 1996).
This method was introduced in Williams et al. (2020). Currently, there is a global (the entire GGM) and a nodewise test. The default
is to compare GGMs with respect to the posterior predictive distribution of Kullback-Leibler
divergence and the sum of squared errors. It is also possible to compare the
GGMs with a user-defined test statistic.
ggm_compare_ppc(
...,
test = "global",
iter = 5000,
FUN = NULL,
custom_obs = NULL,
loss = TRUE,
progress = TRUE
)
... |
At least two matrices (or data frames) of dimensions n (observations) by p (variables). |
test |
Which test should be performed (defaults to "global"). The other option is "nodewise". |
iter |
Number of replicated datasets used to construct the predictive distribution (defaults to 5000). |
FUN |
An optional function for comparing GGMs that returns a number. See Details. |
custom_obs |
Number corresponding to the observed score for comparing the GGMs. This is
required if a function is provided in FUN. |
loss |
Logical. If a function is provided, is the measure a "loss function" (i.e., a large score is a bad thing)? This determines how the p-value is computed. See Details. |
progress |
Logical. Should a progress bar be included (defaults to TRUE)? |
The FUN argument allows for a user-defined test statistic (the measure used to compare the GGMs).
The function must take exactly two arguments, each of which corresponds to a dataset. For example,
f <- function(Yg1, Yg2), where each Y is a dataset of dimensions n by p. The
groups are then compared within the function, which returns a single number. An example is provided below.
Further, when using a custom function, care must be taken when specifying the argument loss.
We recommend visualizing the results with plot to ensure the p-value was computed
in the right direction.
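To make the role of loss concrete, here is a minimal sketch of how the direction of a posterior predictive p-value could depend on it. The helper ppp_direction is hypothetical and not part of BGGM, and the tail rule it uses is an assumption about what "direction" means here, not a copy of the package internals.

```r
# Hypothetical helper (not part of BGGM): posterior predictive p-value for a
# custom statistic, given replicated scores and the observed score.
ppp_direction <- function(predictive, obs, loss = TRUE) {
  if (loss) {
    # loss function: large scores are bad, so count replicates above obs
    mean(predictive > obs)
  } else {
    # large scores are good, so count replicates below obs
    mean(predictive < obs)
  }
}

set.seed(1)
predictive <- rnorm(1000)  # stand-in for a predictive distribution
obs <- 2.5                 # stand-in for an observed score

ppp_direction(predictive, obs, loss = TRUE)   # small: obs sits in the upper tail
ppp_direction(predictive, obs, loss = FALSE)  # large: wrong tail for a loss
```

If the statistic is a loss but loss = FALSE is passed, the p-value comes from the opposite tail, which is why plotting the predictive distribution against the observed score is a useful sanity check.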
The returned object of class ggm_compare_ppc contains a lot of information that
is used for printing and plotting the results. For users of BGGM, the following
are the useful objects:
test = "global"
ppp_jsd
posterior predictive p-values (JSD).
ppp_sse
posterior predictive p-values (SSE).
predictive_jsd
list containing the posterior predictive distributions (JSD).
predictive_sse
list containing the posterior predictive distributions (SSE).
obs_jsd
list containing the observed error (JSD).
obs_sse
list containing the observed error (SSE).
test = "nodewise"
ppp_jsd
posterior predictive p-values (JSD).
predictive_jsd
list containing the posterior predictive distributions (JSD).
obs_jsd
list containing the observed error (JSD).
FUN = f()
ppp_custom
posterior predictive p-values (custom).
predictive_custom
posterior predictive distributions (custom).
obs_custom
observed error (custom).
Interpretation:
The primary test statistic is the symmetric KL divergence, which is termed Jensen-Shannon divergence (JSD). This is in essence a likelihood ratio that provides the "distance" between two multivariate normal distributions. The basic idea is to (1) compute the posterior predictive distribution, assuming group equality (the null model), which provides the error that we would expect to see under the null model; (2) compute JSD for the observed groups; and (3) compare the observed JSD to the posterior predictive distribution, from which a posterior predictive p-value is computed.
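The three steps above can be sketched in base R. This is an illustration under stated assumptions, not the BGGM implementation: the helper names kl_mvn, jsd_mvn, and rmvn0 are hypothetical, the groups are treated as zero-mean multivariate normal, and the replicated data are drawn from a single pooled sample covariance as a crude stand-in for draws from the posterior under group equality.

```r
# KL divergence between zero-mean multivariate normals: KL(N(0, S1) || N(0, S2))
kl_mvn <- function(S1, S2) {
  p <- ncol(S1)
  log_det_ratio <- as.numeric(determinant(S2)$modulus - determinant(S1)$modulus)
  0.5 * (sum(diag(solve(S2) %*% S1)) - p + log_det_ratio)
}

# Symmetric KL divergence (the "JSD" of this page): average of both directions
jsd_mvn <- function(S1, S2) 0.5 * kl_mvn(S1, S2) + 0.5 * kl_mvn(S2, S1)

# Draw n observations from N(0, S) using the Cholesky factor of S
rmvn0 <- function(n, S) matrix(rnorm(n * ncol(S)), n, ncol(S)) %*% chol(S)

set.seed(123)
p <- 4
S_true <- 0.3^abs(outer(1:p, 1:p, "-"))  # toy covariance shared by both groups
Y1 <- rmvn0(200, S_true)
Y2 <- rmvn0(150, S_true)

# (1) predictive distribution of JSD under group equality
#     (assumption: pooled sample covariance replaces posterior draws)
S_pool <- cov(rbind(scale(Y1, scale = FALSE), scale(Y2, scale = FALSE)))
predictive <- replicate(250, {
  jsd_mvn(cov(rmvn0(nrow(Y1), S_pool)), cov(rmvn0(nrow(Y2), S_pool)))
})

# (2) JSD for the observed groups
obs <- jsd_mvn(cov(Y1), cov(Y2))

# (3) predictive p-value: share of replicates exceeding the observed JSD
ppp <- mean(predictive > obs)
ppp
```

Because both toy groups share one covariance matrix, the observed JSD should usually fall well inside the predictive distribution.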
For the global check, the sum of squared error is also provided.
This is computed from the partial correlation matrices and it is analogous
to the strength test in van Borkulo et al. (2017). The nodewise
test compares the posterior predictive distribution for each node. This is based on the correspondence
between the inverse covariance matrix and multiple regression (Kwan, 2014; Stephens, 1998).
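The correspondence between the inverse covariance matrix and multiple regression can be checked numerically. The sketch below uses simulated data and base R only; the variable names are illustrative and nothing here is taken from the BGGM source. For node i, the coefficients from regressing it on all other nodes equal the off-diagonal elements of row i of the precision matrix, divided by the diagonal element and negated.

```r
set.seed(42)
n <- 500
p <- 4
X <- matrix(rnorm(n * p), n, p)
X[, 1] <- X[, 1] + 0.5 * X[, 2] - 0.3 * X[, 3]  # induce some dependence
colnames(X) <- paste0("V", 1:p)

# precision (inverse covariance) matrix
Theta <- solve(cov(X))

# regression coefficients for node 1, read off the precision matrix
beta_precision <- -Theta[1, -1] / Theta[1, 1]

# the same coefficients from ordinary least squares (intercept dropped)
beta_ols <- coef(lm(V1 ~ V2 + V3 + V4, data = as.data.frame(X)))[-1]

all.equal(unname(beta_precision), unname(beta_ols))
```

This is why a node-level comparison can be framed as comparing, for each node, how well the remaining nodes predict it in each group.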
If the null model is not rejected, note that this does not provide evidence for equality!
Further, if the null model is rejected, this means that the assumption of group equality is not tenable: the
groups are different.
Alternative Methods:
There are several methods in BGGM for comparing groups. See
ggm_compare_estimate (posterior differences for the partial correlations),
ggm_compare_explore (exploratory hypothesis testing),
and ggm_compare_confirm (confirmatory hypothesis testing).
# note: iter = 250 for demonstrative purposes
library(BGGM)
# data
Y <- bfi
#############################
######### global ############
#############################
# males
Ym <- subset(Y, gender == 1,
select = - c(gender, education))
# females
Yf <- subset(Y, gender == 2,
select = - c(gender, education))
global_test <- ggm_compare_ppc(Ym, Yf,
iter = 250)
global_test
#############################
###### custom function ######
#############################
# example 1
# maximum difference van Borkulo et al. (2017)
f <- function(Yg1, Yg2){
# remove NA
x <- na.omit(Yg1)
y <- na.omit(Yg2)
# nodes
p <- ncol(Yg1)
# identity matrix
I_p <- diag(p)
# partial correlations
pcor_1 <- -(cov2cor(solve(cor(x))) - I_p)
pcor_2 <- -(cov2cor(solve(cor(y))) - I_p)
# max difference
max(abs((pcor_1[upper.tri(I_p)] - pcor_2[upper.tri(I_p)])))
}
# observed difference
obs <- f(Ym, Yf)
global_max <- ggm_compare_ppc(Ym, Yf,
iter = 250,
FUN = f,
custom_obs = obs,
progress = FALSE)
global_max
# example 2
# Hamming distance (squared error for adjacency)
f <- function(Yg1, Yg2){
# remove NA
x <- na.omit(Yg1)
y <- na.omit(Yg2)
# nodes
p <- ncol(x)
# identity matrix
I_p <- diag(p)
fit1 <- estimate(x, analytic = TRUE)
fit2 <- estimate(y, analytic = TRUE)
sel1 <- select(fit1)
sel2 <- select(fit2)
sum((sel1$adj[upper.tri(I_p)] - sel2$adj[upper.tri(I_p)])^2)
}
# observed difference
obs <- f(Ym, Yf)
global_hd <- ggm_compare_ppc(Ym, Yf,
iter = 250,
FUN = f,
custom_obs = obs,
progress = FALSE)
global_hd
#############################
######## nodewise ##########
#############################
nodewise <- ggm_compare_ppc(Ym, Yf, iter = 250,
test = "nodewise")
nodewise