Power simulations using multiple approaches for internal validation"
In graphicalMCP: Graphical Multiple Comparison Procedures

knitr::opts_chunk$set(
  fig.align = "center",
  collapse = TRUE,
  comment = "#>",
  cache.lazy = FALSE
)

library(graphicalMCP)

Introduction

Multiple approaches are implemented in graphicalMCP to reject a hypothesis for different purposes and/or considerations. One approach is to calculate the adjusted p-value of this hypothesis and compare it with alpha. This approach is implemented in adjusted_p functions and graph_test_closure() (when test_values = FALSE). Another approach is to calculate the adjusted significance level of this hypothesis and compare it with its p-value. This approach is implemented in adjusted_weights functions, graph_test_closure() (when test_values = TRUE), and graph_calculate_power(). To further tailor this approach for different outputs, a different way of coding are used for graph_test_closure() (when test_values = TRUE). When implementing these approaches in graph_calculate_power(), variations are added to optimize computing speed. Thus, these approaches could be compared with each other for internal validation.

Power simulations

A random graph will be generated and used for the comparison. A set of marginal power (without multiplicity adjustment) is randomly generated. Local power (with multiplicity adjustment) is calculated using graph_calculate_power(). In addition, p-values simulated from graph_calculate_power() are saved. These p-values are used to generate local power via graph_test_shortcut() and graph_test_closure() as the proportion of times every hypothesis can be rejected. We expect to observe matching results for 1000 random graphs.

Bonferroni tests

We compare power simulations from graph_calculate_power() and those using graph_test_shortcut() via respectively the adjusted p-value approach and the adjusted significance level approach.

out <- read.csv(here::here("vignettes/internal-validation_bonferroni.csv"))
# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))
# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))

Hochberg tests

out <- read.csv(here::here("vignettes/internal-validation_hochberg.csv"))
# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))
# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))

Simes tests

out <- read.csv(here::here("vignettes/internal-validation_simes.csv"))
# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))
# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))

Parametric tests

out <- read.csv(here::here("vignettes/internal-validation_parametric.csv"))
# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))
# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))

Mixed tests of Bonferroni, Hochberg and Simes

We compare power simulations from graph_calculate_power() and those using graph_test_closure() via respectively the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses. Two test types are randomly picked among Bonferroni, Hochberg and Simes tests.

out <- read.csv(here::here("vignettes/internal-validation_mixed.csv"))
# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))
# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))

Mixed tests of parametric and one of Bonferroni, Hochberg and Simes

We compare power simulations from graph_calculate_power() and those using graph_test_closure() via respectively the adjusted p-value approach and the adjusted significance level approach. Two test groups are used with randomly assigned hypotheses. Parametric test type is assigned to the first test group and the test type for the second test group is randomly picked among Bonferroni, Hochberg and Simes tests.

out <- read.csv(here::here("vignettes/internal-validation_parametric-mixed.csv"))
# Matching power using the adjusted p-value approach
all.equal(out$adjusted_p, rep(TRUE, nrow(out)))
# Matching power using the adjusted significance level approach
all.equal(out$adjusted_significance_level, rep(TRUE, nrow(out)))

Conclusions

Multiple approaches are implemented in graphicalMCP to reject a hypothesis for different purposes and/or considerations. One approach is to calculate the adjusted p-value of this hypothesis and compare it with alpha. Another approach is to calculate the adjusted significance level of this hypothesis and compare it with its p-value. Based on 1000 random graphs, these two approaches produce matching power for all types of tests. Therefore, the internal validation is considered to be complete.

n_graph <- 1000
m <- 6
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  graph <- random_graph(m)
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_shortcut(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    temp_adjusted_significance_level <- temp$test_values$results$Inequality_holds
    names(temp_adjusted_significance_level) <- temp$test_values$results$Hypothesis
    adjusted_significance_level_results[j, ] <-
      temp_adjusted_significance_level[names(temp$outputs$rejected)]
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )

  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_bonferroni.csv"),
  row.names = FALSE
)

n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  graph <- random_graph(m)
  graph$hypotheses[groups[1:(m / 2)]] <- sum(graph$hypotheses[groups[1:(m / 2)]]) / 3
  graph$hypotheses[groups[(m / 2 + 1):m]] <- sum(graph$hypotheses[groups[(m / 2 + 1):m]]) / 3
  graph$transitions <- matrix(1 / (m - 1), nrow = m, ncol = m)
  diag(graph$transitions) <- 0
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = c("hochberg", "hochberg"),
    test_groups = test_groups,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = c("hochberg", "hochberg"),
      test_groups = test_groups,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <- suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )

  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_hochberg.csv"),
  row.names = FALSE
)

n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  graph <- random_graph(m)
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = c("simes", "simes"),
    test_groups = test_groups,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = c("simes", "simes"),
      test_groups = test_groups,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <- suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )

  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_simes.csv"),
  row.names = FALSE
)

n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  test_corr <- list(
    sim_corr[groups[1:(m / 2)], groups[1:(m / 2)]],
    sim_corr[groups[(m / 2 + 1):m], groups[(m / 2 + 1):m]]
  )
  graph <- random_graph(m)
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = c("parametric", "parametric"),
    test_groups = test_groups,
    test_corr = test_corr,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = c("parametric", "parametric"),
      test_groups = test_groups,
      test_corr = test_corr,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <- suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )

  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_parametric.csv"),
  row.names = FALSE
)

n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  test_corr <- list(
    sim_corr[groups[1:(m / 2)], groups[1:(m / 2)]],
    sim_corr[groups[(m / 2 + 1):m], groups[(m / 2 + 1):m]]
  )
  test_types <- sample(c("bonferroni", "hochberg", "simes"), 2)
  graph <- random_graph(m)
  graph$hypotheses[groups[1:(m / 2)]] <- sum(graph$hypotheses[groups[1:(m / 2)]]) / 3
  graph$hypotheses[groups[(m / 2 + 1):m]] <- sum(graph$hypotheses[groups[(m / 2 + 1):m]]) / 3
  graph$transitions <- matrix(1 / (m - 1), nrow = m, ncol = m)
  diag(graph$transitions) <- 0
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = test_types,
    test_groups = test_groups,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = test_types,
      test_groups = test_groups,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <- suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )

  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_mixed.csv"),
  row.names = FALSE
)

n_graph <- 1000
m <- 4
alpha <- 0.025
n_sim <- 1e2
sim_corr <- matrix(0.5, m, m)
diag(sim_corr) <- 1
out_adjusted_p <- rep(NA, n_graph)
out_adjusted_significance_level <- rep(NA, n_graph)
for (i in 1:n_graph) {
  set.seed(1234 + i - 1)
  groups <- sample(1:m)
  test_groups <- list(groups[1:(m / 2)], groups[(m / 2 + 1):m])
  test_corr <- list(sim_corr[groups[1:(m / 2)], groups[1:(m / 2)]], NA)
  test_types <- c(
    "parametric",
    sample(c("bonferroni", "hochberg", "simes"), 1)
  )
  graph <- random_graph(m)
  graph$hypotheses[groups[1:(m / 2)]] <- sum(graph$hypotheses[groups[1:(m / 2)]]) / 3
  graph$hypotheses[groups[(m / 2 + 1):m]] <- sum(graph$hypotheses[groups[(m / 2 + 1):m]]) / 3
  graph$transitions <- matrix(1 / (m - 1), nrow = m, ncol = m)
  diag(graph$transitions) <- 0
  marginal_power <- runif(m, 0.5, 0.9)
  results_calculate_power <- graph_calculate_power(
    graph,
    alpha = alpha,
    power_marginal = marginal_power,
    sim_corr = sim_corr,
    sim_n = n_sim,
    test_types = test_types,
    test_groups = test_groups,
    test_corr = test_corr,
    verbose = TRUE
  )
  p_sim <- results_calculate_power$details$p_sim
  adjusted_p_results <- p_sim
  adjusted_significance_level_results <- p_sim
  for (j in 1:n_sim) {
    temp <- graph_test_closure(
      graph,
      p_sim[j, ],
      alpha = alpha,
      test_types = test_types,
      test_groups = test_groups,
      test_corr = test_corr,
      test_values = TRUE
    )
    adjusted_p_results[j, ] <- temp$outputs$rejected
    intersection <- unique(temp$test_values$results$Intersection)
    intersection_rej <- intersection
    for (k in 1:length(intersection)) {
      intersection_rej[k] <- any(subset(
        temp$test_values$results,
        Intersection == intersection[k]
      )$Inequality_holds)
    }
    for (k in 1:m) {
      id <- substr(intersection, start = k, stop = k) == "1"
      adjusted_significance_level_results[j, k] <- suppressWarnings(all(intersection_rej[id]))
    }
  }
  results_adjusted_p <- colMeans(adjusted_p_results)
  names(results_adjusted_p) <- names(temp$outputs$rejected)
  # Matching power using the adjusted p-value approach
  out_adjusted_p[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_adjusted_p,
    tolerance = 1 / n_sim
  )

  results_significance_level <- colMeans(adjusted_significance_level_results)
  names(results_significance_level) <- names(temp$outputs$rejected)
  # Matching power using the adjusted significance level approach
  out_adjusted_significance_level[i] <- all.equal(
    results_calculate_power$power$power_local,
    results_significance_level,
    tolerance = 1 / n_sim
  )
}
out <- cbind(out_adjusted_p, out_adjusted_significance_level)
colnames(out) <- c("adjusted_p", "adjusted_significance_level")
write.csv(
  out,
  here::here("vignettes/internal-validation_parametric-mixed.csv"),
  row.names = FALSE
)