rgcca_stability: Identify the most stable variables with SGCCA

View source: R/rgcca_stability.R

rgcca_stability    R Documentation

Description

This function identifies the most stable variables among those selected as relevant by SGCCA. Variables are ranked by a criterion based on Variable Importance in the Projection (VIP), computed over bootstrap samples.

Usage

rgcca_stability(
  rgcca_res,
  keep = vapply(rgcca_res$a, function(x) mean(x != 0), FUN.VALUE = 1),
  n_boot = 100,
  n_cores = 1,
  verbose = TRUE,
  balanced = TRUE,
  keep_all_variables = FALSE
)
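
The default value of keep can be read directly from the usage above: it is the proportion of nonzero weights in each element of rgcca_res$a. A minimal sketch of that expression on made-up data follows. The list a below is a hypothetical stand-in for rgcca_res$a, assuming each element holds the weights of one block; it is not output from a real fit.

```r
# Toy weights standing in for rgcca_res$a (one matrix per block,
# rows = variables, columns = components). All values are made up.
a <- list(
  block1 = matrix(c(0.8, 0, 0, -0.6, 0, 0, 0, 0, 0, 0), ncol = 1),
  block2 = matrix(c(0.5, 0.5, 0, 0.7, 0), ncol = 1)
)

# Same expression as the default of `keep`:
# the share of nonzero weights in each block.
keep <- vapply(a, function(x) mean(x != 0), FUN.VALUE = 1)
print(keep)
```

With this default, the stability analysis retains in each block the same proportion of variables that the sparse fit already selected.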

Arguments

rgcca_res

A fitted RGCCA object (see rgcca).

keep

A numeric vector giving, for each block, the proportion of variables to select. By default, the proportion of nonzero weights per block in rgcca_res.

n_boot

The number of bootstrap samples (default: 100).

n_cores

The number of cores used for parallelization (default: 1).

verbose

A logical value indicating whether the progress of the procedure is reported (default: TRUE).

balanced

A logical value indicating whether a balanced bootstrap procedure is performed (default: TRUE).

keep_all_variables

A logical value indicating whether all variables are kept, even when some of them have zero variance in at least one bootstrap sample (default: FALSE).
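
The balanced and keep_all_variables arguments can be illustrated with a small base R sketch. It assumes the usual definition of a balanced bootstrap, in which every observation appears exactly n_boot times across all bootstrap samples. The package's internal implementation may differ, and the names pool, boot_idx, and null_var are hypothetical.

```r
set.seed(1)
n <- 6       # number of observations (toy value)
n_boot <- 4  # number of bootstrap samples (toy value)

# Balanced bootstrap: replicate every index n_boot times, shuffle the pool,
# then cut it into n_boot samples of size n.
pool <- sample(rep(seq_len(n), times = n_boot))
boot_idx <- split(pool, rep(seq_len(n_boot), each = n))

# Each observation is used exactly n_boot times over all samples.
counts <- table(factor(unlist(boot_idx), levels = seq_len(n)))

# A variable may become constant within a resample. Flag such columns:
# rows = variables, columns = bootstrap samples.
x <- cbind(v1 = c(1, 2, 3, 4, 5, 6), v2 = c(0, 0, 0, 0, 0, 1))
null_var <- vapply(boot_idx, function(i) {
  apply(x[i, , drop = FALSE], 2, function(col) length(unique(col)) == 1)
}, FUN.VALUE = logical(ncol(x)))
print(null_var)
```

A nearly constant variable such as v2 can have zero variance in some resamples, which is the situation keep_all_variables addresses.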

Value

An rgcca_stability object that can be printed and plotted. It contains the following elements:

top

A data.frame giving the indicator (VIP) on which the variables are ranked.

n_boot

The number of bootstrap samples, returned for further use.

keepVar

The indices of the most stable variables.

bootstrap

A data.frame with the block weight vectors computed on each bootstrap sample.

rgcca_res

An RGCCA object fitted on the most stable variables.
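
The relationship between top, keepVar, and keep can be sketched as follows for a single block. This is only an illustration under stated assumptions: the score used here (the mean squared bootstrap weight) is a simplified stand-in and not the package's VIP criterion, and w_boot, score, and keep_var are hypothetical names with simulated values.

```r
set.seed(42)
n_boot <- 20  # number of bootstrap samples (toy value)
p <- 8        # number of variables in the block (toy value)

# Hypothetical bootstrap weights: rows = variables, columns = bootstrap samples.
w_boot <- matrix(rnorm(p * n_boot, sd = 0.05), nrow = p,
                 dimnames = list(paste0("var", seq_len(p)), NULL))
w_boot[c(2, 5), ] <- w_boot[c(2, 5), ] + 0.7  # two consistently strong variables

# Stand-in stability score (NOT the package's VIP): mean squared weight.
score <- rowMeans(w_boot^2)
top <- data.frame(variable = names(score), score = unname(score))
top <- top[order(top$score, decreasing = TRUE), ]

# Keep the requested proportion of top-ranked variables.
keep <- 0.25
n_keep <- ceiling(keep * p)
keep_var <- which(rownames(w_boot) %in% top$variable[seq_len(n_keep)])
print(top)
print(keep_var)
```

In this toy setting, the two variables with consistently large weights rank first and are the ones retained.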

Examples

## Not run: 
 ###########################
 # stability and bootstrap #
 ###########################

 data("ge_cgh_locIGR", package = "gliomaData")
 blocks <- ge_cgh_locIGR$multiblocks
 Loc <- factor(ge_cgh_locIGR$y)
 levels(Loc) <- colnames(ge_cgh_locIGR$multiblocks$y)
 blocks[[3]] <- Loc

 fit_sgcca <- rgcca(blocks,
   sparsity = c(.071, .2, 1),
   ncomp = c(1, 1, 1),
   scheme = "centroid",
   verbose = TRUE, response = 3
 )

 boot_out <- rgcca_bootstrap(fit_sgcca, n_boot = 100, n_cores = 1)

 fit_stab <- rgcca_stability(fit_sgcca,
   keep = sapply(fit_sgcca$a, function(x) mean(x != 0)),
   n_cores = 1, n_boot = 10,
   verbose = TRUE
 )

 boot_out <- rgcca_bootstrap(
   fit_stab, n_boot = 500, n_cores = 1, verbose = TRUE
 )

 plot(boot_out, block = 1:2, n_mark = 2000, display_order = FALSE)

## End(Not run)

Tenenhaus/RGCCA documentation built on Feb. 12, 2024, 8:34 a.m.