ggm_compare_ppc: GGM Compare: Posterior Predictive Check

View source: R/ggm_compare_ppc.default.R

ggm_compare_ppc    R Documentation

GGM Compare: Posterior Predictive Check

Description

Compare GGMs with a posterior predictive check (Gelman et al. 1996). This method was introduced in Williams et al. (2020). Currently, there is a global test (the entire GGM) and a nodewise test. The default is to compare GGMs with respect to the posterior predictive distribution of Kullback-Leibler divergence and the sum of squared errors. It is also possible to compare the GGMs with a user-defined test statistic.

Usage

ggm_compare_ppc(
  ...,
  test = "global",
  iter = 5000,
  FUN = NULL,
  custom_obs = NULL,
  loss = TRUE,
  progress = TRUE
)

Arguments

...

At least two matrices (or data frames) of dimensions n (observations) by p (variables).

test

Which test should be performed (defaults to "global")? The options are "global" and "nodewise".

iter

Number of replicated datasets used to construct the predictive distribution (defaults to 5000).

FUN

An optional function for comparing GGMs that returns a number. See Details.

custom_obs

Number corresponding to the observed score for comparing the GGMs. This is required if a function is provided in FUN. See Details.

loss

Logical. If a function is provided, is the measure a "loss function" (i.e., a large score is a bad thing)? This determines how the p-value is computed. See Details.

progress

Logical. Should a progress bar be included (defaults to TRUE)?

Details

The FUN argument allows for a user-defined test statistic (the measure used to compare the GGMs). The function must include only two arguments, each of which corresponds to a dataset. For example, f <- function(Yg1, Yg2), where each Y is a dataset of dimensions n by p. The groups are then compared within the function, returning a single number. An example is provided below.

Further, when using a custom function, care must be taken when specifying the argument loss. We recommend visualizing the results with plot to ensure the p-value was computed in the right direction.
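
To illustrate why the direction matters, here is a minimal sketch in base R. It assumes the posterior predictive p-value is the proportion of replicated scores that are more extreme than the observed score; the exact computation inside BGGM may differ. The helper ppp and the simulated predictive vector are hypothetical and are not part of BGGM.

```r
# Hypothetical helper (not part of BGGM): posterior predictive p-value
# from a vector of replicated scores and a single observed score.
ppp <- function(predictive, observed, loss = TRUE) {
  if (loss) {
    # loss function: large scores are bad, so count replicates above the observed score
    mean(predictive > observed)
  } else {
    # otherwise small scores are bad, so count replicates below the observed score
    mean(predictive < observed)
  }
}

set.seed(1)

# stand-in for the predictive distribution of a loss under the null model
predictive <- rchisq(1000, df = 3)

# an observed score far in the upper tail
ppp(predictive, observed = 12, loss = TRUE)   # small p-value
ppp(predictive, observed = 12, loss = FALSE)  # large p-value (wrong direction for a loss)
```

With the wrong setting for loss, an extreme observed score yields a p-value near one rather than near zero, which is the kind of mismatch a plot of the predictive distribution against the observed score makes visible.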

Value

The returned object of class ggm_compare_ppc contains a lot of information that is used for printing and plotting the results. For users of BGGM, the following are the useful objects:

test = "global"

  • ppp_jsd posterior predictive p-values (JSD).

  • ppp_sse posterior predictive p-values (SSE).

  • predictive_jsd list containing the posterior predictive distributions (JSD).

  • predictive_sse list containing the posterior predictive distributions (SSE).

  • obs_jsd list containing the observed error (JSD).

  • obs_sse list containing the observed error (SSE).

test = "nodewise"

  • ppp_jsd posterior predictive p-values (JSD).

  • predictive_jsd list containing the posterior predictive distributions (JSD).

  • obs_jsd list containing the observed error (JSD).

FUN = f()

  • ppp_custom posterior predictive p-values (custom).

  • predictive_custom posterior predictive distributions (custom).

  • obs_custom observed error (custom).

Note

Interpretation:

The primary test statistic is the symmetric KL divergence, termed Jensen-Shannon divergence (JSD). This is in essence a likelihood ratio that provides the "distance" between two multivariate normal distributions. The basic idea is to (1) compute the posterior predictive distribution of JSD, assuming group equality (the null model), which provides the error that we would expect to see under the null model; (2) compute JSD for the observed groups; and (3) compare the observed JSD to the posterior predictive distribution, from which a posterior predictive p-value is computed.
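
As a rough illustration of the test statistic, the sketch below computes a symmetric KL divergence between two zero-mean multivariate normal distributions in base R. It assumes the symmetric version is the average of the two directed KL divergences; the helper names kl_mvn and sym_kl are hypothetical, and the internal BGGM implementation may differ in detail.

```r
# KL divergence from N(0, S1) to N(0, S2) for p-dimensional normals:
# 0.5 * (tr(S2^-1 S1) - p + log det(S2) - log det(S1))
kl_mvn <- function(S1, S2) {
  p <- ncol(S1)
  S2_inv <- solve(S2)
  logdet1 <- as.numeric(determinant(S1, logarithm = TRUE)$modulus)
  logdet2 <- as.numeric(determinant(S2, logarithm = TRUE)$modulus)
  0.5 * (sum(diag(S2_inv %*% S1)) - p + logdet2 - logdet1)
}

# symmetric version: average of the two directions (assumed definition)
sym_kl <- function(S1, S2) {
  0.5 * kl_mvn(S1, S2) + 0.5 * kl_mvn(S2, S1)
}

set.seed(1)

# two simulated groups; the second has an added dependence between variables 1 and 2
Y1 <- matrix(rnorm(200 * 4), 200, 4)
Y2 <- matrix(rnorm(200 * 4), 200, 4)
Y2[, 2] <- Y2[, 2] + 0.8 * Y2[, 1]

sym_kl(cor(Y1), cor(Y1))  # identical groups: zero
sym_kl(cor(Y1), cor(Y2))  # different groups: positive
```

In the check itself, a score of this kind would be computed for each pair of replicated datasets to build the predictive distribution, and once for the observed groups.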

For the global check, the sum of squared error is also provided. This is computed from the partial correlation matrices and is analogous to the strength test in van Borkulo et al. (2017). The nodewise test compares the posterior predictive distribution for each node. This is based on the correspondence between the inverse covariance matrix and multiple regression (Kwan 2014; Stephens 1998).
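
The two ideas in this paragraph can be sketched in base R. The first part shows the correspondence between the inverse covariance matrix and multiple regression: the regression coefficients for node i equal -Theta[i, j] / Theta[i, i]. The second part computes a sum of squared differences between partial correlation matrices. The helpers pcor_mat and sse are hypothetical, and summing over the upper triangle is an assumption about how the error is aggregated.

```r
set.seed(1)

# simulated data with three nodes
n <- 500
Y <- matrix(rnorm(n * 3), n, 3)
Y[, 1] <- Y[, 1] + 0.5 * Y[, 2] - 0.3 * Y[, 3]
colnames(Y) <- c("y1", "y2", "y3")

# precision (inverse covariance) matrix
Theta <- solve(cov(Y))

# regression coefficients for node 1 implied by the precision matrix
beta_theta <- -Theta[1, -1] / Theta[1, 1]

# the same coefficients from least squares
beta_ols <- coef(lm(y1 ~ y2 + y3, data = as.data.frame(Y)))[-1]

all.equal(unname(beta_theta), unname(beta_ols))

# partial correlations from the precision matrix (zero diagonal)
pcor_mat <- function(Y) {
  P <- -cov2cor(solve(cov(Y)))
  diag(P) <- 0
  P
}

# sum of squared differences over the upper triangle (assumed aggregation)
sse <- function(Y1, Y2) {
  d <- pcor_mat(Y1) - pcor_mat(Y2)
  sum(d[upper.tri(d)]^2)
}

# compare the first and second halves of the sample
sse(Y[1:250, ], Y[251:500, ])
```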

If the null model is not rejected, note that this does not provide evidence for equality! Further, if the null model is rejected, this means that the assumption of group equality is not tenable: the groups are different.

Alternative Methods:

There are several methods in BGGM for comparing groups. See ggm_compare_estimate (posterior differences for the partial correlations), ggm_compare_explore (exploratory hypothesis testing), and ggm_compare_confirm (confirmatory hypothesis testing).

References

\insertAllCited

Examples



# note: iter = 250 for demonstrative purposes

# load the package (provides the bfi data)
library(BGGM)

# data
Y <- bfi

#############################
######### global ############
#############################


# males
Ym <- subset(Y, gender == 1,
             select = - c(gender, education))

# females

Yf <- subset(Y, gender == 2,
             select = - c(gender, education))


global_test <- ggm_compare_ppc(Ym, Yf,
                               iter = 250)

global_test


#############################
###### custom function ######
#############################
# example 1

# maximum difference van Borkulo et al. (2017)

f <- function(Yg1, Yg2){

  # remove NA
  x <- na.omit(Yg1)
  y <- na.omit(Yg2)

  # nodes
  p <- ncol(Yg1)

  # identity matrix
  I_p <- diag(p)

  # partial correlations
  pcor_1 <- -(cov2cor(solve(cor(x))) - I_p)
  pcor_2 <- -(cov2cor(solve(cor(y))) - I_p)

  # max difference
  max(abs((pcor_1[upper.tri(I_p)] - pcor_2[upper.tri(I_p)])))

}

# observed difference
obs <- f(Ym, Yf)

global_max <- ggm_compare_ppc(Ym, Yf,
                              iter = 250,
                              FUN = f,
                              custom_obs = obs,
                              progress = FALSE)

global_max


# example 2
# Hamming distance (squared error for adjacency)

f <- function(Yg1, Yg2){

  # remove NA
  x <- na.omit(Yg1)
  y <- na.omit(Yg2)

  # nodes
  p <- ncol(x)

  # identity matrix
  I_p <- diag(p)

  # estimate each GGM
  fit1 <- estimate(x, analytic = TRUE)
  fit2 <- estimate(y, analytic = TRUE)

  # select the graphs (adjacency matrices)
  sel1 <- select(fit1)
  sel2 <- select(fit2)

  # number of edges that differ between the two graphs
  sum((sel1$adj[upper.tri(I_p)] - sel2$adj[upper.tri(I_p)])^2)

}

# observed difference
obs <- f(Ym, Yf)

global_hd <- ggm_compare_ppc(Ym, Yf,
                            iter = 250,
                            FUN = f,
                            custom_obs  = obs,
                            progress = FALSE)

global_hd


#############################
########  nodewise ##########
#############################

nodewise <- ggm_compare_ppc(Ym, Yf, iter = 250,
                           test = "nodewise")

nodewise




BGGM documentation built on Sept. 11, 2024, 5:19 p.m.