```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
Note that `nct` is based on the `NCT` function in the R package
**NetworkComparisonTest**. Key extensions to the methodology include
nonconvex regularization, the de-sparsified glasso estimator,
additional default tests (e.g., KL-divergence), and
user-defined test statistics.
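For example, a comparison with a nonconvex penalty and the de-sparsified
estimator can be requested directly. The following is a minimal sketch,
built from the arguments used in the benchmark below; `penalty = "atan"`
is assumed to be one of the available nonconvex options.

```r
library(GGMncv)

# simulate two groups from the same network
# (same data-generating setup as the benchmark below)
main <- gen_net(p = 10, edge_prob = 0.1)
y1 <- MASS::mvrnorm(n = 500, mu = rep(0, 10), Sigma = main$cors)
y2 <- MASS::mvrnorm(n = 500, mu = rep(0, 10), Sigma = main$cors)

# assumption: "atan" is one of the nonconvex penalty options
fit <- nct(Y_g1 = y1, Y_g2 = y2,
           iter = 1000,
           desparsify = TRUE,
           penalty = "atan",
           cores = 2,
           progress = FALSE)
```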
Furthermore, `nct` should be much faster for two primary reasons:

1. **glassoFast**, introduced in @sustik2012glassofast, is used instead of **glasso**.
2. Parallel computing via **parallel**'s `parLapply()`, whereas
   **NetworkComparisonTest** uses a serially executed `for` loop
   (with `cores = 1`, **GGMncv** uses `lapply()`; see the sketch below).
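To illustrate the difference, here is a schematic of the serial versus
parallel pattern. This is a minimal sketch of the general approach, not
**GGMncv**'s internal code; `boot_stat()` is a hypothetical placeholder
for one permutation iteration.

```r
library(parallel)

# hypothetical placeholder for a single permutation iteration
boot_stat <- function(i) {
  mean(rnorm(100))
}

# serial execution (NCT; also GGMncv with cores = 1)
serial_res <- lapply(1:1000, boot_stat)

# parallel execution across 4 cores (GGMncv with cores = 4)
cl <- makeCluster(4)
parallel_res <- parLapply(cl, 1:1000, boot_stat)
stopCluster(cl)
```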
The gains in speed will be most notable in larger networks. In the
following comparison, computation time is investigated as a function of
network size for several numbers of cores, relative to `NCT`. Note that
all settings in **GGMncv** are set to be comparable to `NCT`, for
example, using the lasso penalty and the same number of tuning
parameters (i.e., 50). One important distinction is that **GGMncv**
computes four default test statistics instead of two (as in `NCT`).
```r
library(GGMncv)
library(dplyr)
library(ggplot2)
library(NetworkComparisonTest)

# network sizes to test
p <- seq(10, 25, 5)

sim_fun <- function(x){
  # generate a network and sample two groups from the same model
  main <- gen_net(p = x, edge_prob = 0.1)
  y1 <- MASS::mvrnorm(n = 500, mu = rep(0, x), Sigma = main$cors)
  y2 <- MASS::mvrnorm(n = 500, mu = rep(0, x), Sigma = main$cors)

  # GGMncv with 1, 2, 4, and 8 cores
  st_1 <- system.time({
    fit_1 <- nct(Y_g1 = y1, Y_g2 = y2,
                 iter = 1000,
                 desparsify = FALSE,
                 penalty = "lasso",
                 cores = 1,
                 progress = FALSE)
  })

  st_2 <- system.time({
    fit_2 <- nct(Y_g1 = y1, Y_g2 = y2,
                 iter = 1000,
                 desparsify = FALSE,
                 penalty = "lasso",
                 cores = 2,
                 progress = FALSE)
  })

  st_4 <- system.time({
    fit_4 <- nct(Y_g1 = y1, Y_g2 = y2,
                 iter = 1000,
                 desparsify = FALSE,
                 penalty = "lasso",
                 cores = 4,
                 progress = FALSE)
  })

  st_8 <- system.time({
    fit_8 <- nct(Y_g1 = y1, Y_g2 = y2,
                 iter = 1000,
                 desparsify = FALSE,
                 penalty = "lasso",
                 cores = 8,
                 progress = FALSE)
  })

  # NetworkComparisonTest
  st_NCT <- system.time({
    fit_NCT <- NCT(data1 = y1, data2 = y2,
                   it = 1000,
                   progressbar = FALSE)
  })

  # collect elapsed times
  ret <- data.frame(
    times = c(st_1[3], st_2[3], st_4[3], st_8[3], st_NCT[3]),
    models = c("one", "two", "four", "eight", "NCT"),
    nodes = x
  )
  return(ret)
}

# repeat the timing for each network size
sim_res <- list()
reps <- 5

for(i in seq_along(p)){
  print(paste("nodes", p[i]))
  sim_res[[i]] <- do.call(rbind.data.frame,
                          replicate(reps,
                                    sim_fun(p[i]),
                                    simplify = FALSE))
}

# average time (+/- SD) by model and network size
do.call(rbind.data.frame, sim_res) %>%
  group_by(models, nodes) %>%
  summarise(mu = mean(times), std = sd(times)) %>%
  ggplot(aes(x = nodes, y = mu, group = models)) +
  theme_bw() +
  geom_ribbon(aes(ymax = mu + std, ymin = mu - std),
              alpha = 0.1) +
  geom_line(aes(color = models), size = 2) +
  ylab("Seconds") +
  xlab("Nodes") +
  scale_color_discrete(name = "Model",
                       labels = c("GGMncv(8)", "GGMncv(4)",
                                  "GGMncv(2)", "GGMncv(1)", "NCT"),
                       breaks = c("eight", "four", "two", "one", "NCT"))
```

The performance gains are clear, especially when using 8 cores with GGMncv.