decouple: Evaluate multiple statistics with same input data

View source: R/decoupleR-decouple.R

decoupleR Documentation

Description

Calculate the source activity per sample out of a gene expression matrix by coupling a regulatory network with a variety of statistics.

Usage

decouple(
  mat,
  network,
  .source = source,
  .target = target,
  statistics = NULL,
  args = list(NULL),
  consensus_score = TRUE,
  consensus_stats = NULL,
  include_time = FALSE,
  show_toy_call = FALSE,
  minsize = 5
)

Arguments

mat

Matrix to evaluate (e.g. expression matrix), with target nodes in rows and conditions in columns. At least one element of rownames(mat) must also appear in the .target column of network.

network

Tibble or data frame with edges and their associated metadata.

.source

Column with source nodes.

.target

Column with target nodes.
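As a minimal sketch of the input shapes described above, the toy objects below build a matrix with targets in rows and a network with source and target columns, then check that the two overlap. All gene, source and sample names and all values are made up for illustration; the mor column name is borrowed from the Examples section.

```r
# Toy expression matrix: target nodes (genes) in rows, conditions in columns.
# All names and values here are invented for illustration.
mat <- matrix(
    c(1.2, 0.4, -0.7, 2.1, 0.0, -1.3),
    nrow = 3,
    dimnames = list(c("G1", "G2", "G3"), c("S1", "S2"))
)

# Toy network: one row per edge, with source and target columns
# plus edge metadata (here a hypothetical mode-of-regulation column).
network <- data.frame(
    source = c("TF1", "TF1", "TF2"),
    target = c("G1", "G2", "G4"),
    mor = c(1, -1, 1)
)

# Targets shared between the matrix rows and the network.
shared <- intersect(rownames(mat), network$target)

# The requirement stated for mat: at least one shared target.
stopifnot(length(shared) >= 1)
```

Here G4 appears only in the network and G3 only in the matrix, so neither is part of the overlap.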

statistics

Statistical methods to be run sequentially. If none are provided, only the top-performing methods are run (mlm, ulm and wsum).

args

A list of argument lists, either the same length as statistics or of length 1. The default, list(NULL), is recycled to the length of statistics and calls each function with no arguments other than mat, network, .source and .target.
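The recycling rule can be pictured with plain base R. This is only a sketch of the behaviour described above, not decoupleR's internal code; the per-method arguments are copied from the Examples section.

```r
statistics <- c("wmean", "wsum", "ulm")

# Default case: a single NULL entry is recycled to one entry per statistic,
# so every method is called without extra arguments.
args_default <- rep_len(list(NULL), length(statistics))

# Explicit case: one argument list per statistic, named after the method.
args_explicit <- list(
    wmean = list(.mor = "mor", .likelihood = "likelihood"),
    wsum = list(.mor = "mor"),
    ulm = list(.mor = "mor")
)

# Both forms end up with one entry per requested statistic.
stopifnot(length(args_default) == length(statistics))
stopifnot(length(args_explicit) == length(statistics))
```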

consensus_score

Boolean indicating whether to compute a consensus score across methods.

consensus_stats

List of estimate names to use when calculating the consensus score. This filters out the extra estimates that some methods return; for example, wsum returns wsum, corr_wsum and norm_wsum. If none are provided and no statistics were provided either, only the top-performing estimates are used (mlm, ulm and norm_wsum). Otherwise, all estimates available after running the methods in the statistics argument are used.

include_time

Should the execution time of each statistic be reported?

show_toy_call

Should the call of each statistic be reported?

minsize

Integer indicating the minimum number of targets per source.
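To illustrate what a minimum number of targets per source means, the base R sketch below drops sources with too few targets from a made-up network. It assumes the threshold is inclusive and that targets are counted on the network alone; how decoupleR applies the filter internally (for example, in relation to the rows of mat) is not stated here.

```r
# Made-up network: TF1 has 3 targets, TF2 has 2, TF3 has 1.
network <- data.frame(
    source = c("TF1", "TF1", "TF1", "TF2", "TF2", "TF3"),
    target = c("G1", "G2", "G3", "G1", "G4", "G2")
)
minsize <- 2

# Count targets per source.
n_targets <- table(network$source)

# Keep sources with at least minsize targets (inclusive: an assumption).
kept_sources <- names(n_targets)[n_targets >= minsize]
filtered <- network[network$source %in% kept_sources, ]
```

With minsize set to 2, TF3 is dropped and the five edges of TF1 and TF2 remain.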

Value

A long-format tibble of the enrichment scores for each source across the samples. The resulting tibble contains the following columns:

  1. run_id: Indicates the order in which the methods have been executed.

  2. statistic: Indicates which method is associated with which score.

  3. source: Source nodes of network.

  4. condition: Condition representing each column of mat.

  5. score: Regulatory activity (enrichment score).

  6. statistic_time: If requested, internal execution time indicator.

  7. p_value: p-value (if available) of the obtained score.
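Because the result is in long format, a common next step is to keep one statistic and reshape it to a source-by-condition matrix. The sketch below does this in base R on a mock data frame that only mimics the documented columns; the scores, sources and conditions are invented, and a real result would come from decouple().

```r
# Mock result mimicking the documented columns; all values are invented.
res <- data.frame(
    run_id = rep(c(1, 2), each = 4),
    statistic = rep(c("ulm", "wsum"), each = 4),
    source = rep(c("TF1", "TF2"), times = 4),
    condition = rep(rep(c("S1", "S2"), each = 2), times = 2),
    score = c(0.5, -1.2, 0.8, 0.3, 1.1, -0.4, 0.9, 0.2)
)

# Keep the rows of a single statistic.
ulm <- res[res$statistic == "ulm", ]

# Reshape to wide format: sources in rows, conditions in columns.
# Each cell holds exactly one score, so mean() simply returns it.
wide <- tapply(ulm$score, list(ulm$source, ulm$condition), mean)
```

The wide matrix can then be passed to plotting or clustering functions that expect one row per source.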

See Also

Other decoupleR statistics: run_aucell(), run_fgsea(), run_gsva(), run_mdt(), run_mlm(), run_ora(), run_udt(), run_ulm(), run_viper(), run_wmean(), run_wsum()

Examples

if (FALSE) {
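    # Locate the test inputs bundled with decoupleR.
    inputs_dir <- system.file("testdata", "inputs", package = "decoupleR")

    # Load the example matrix and network.
    mat <- readRDS(file.path(inputs_dir, "mat.rds"))
    net <- readRDS(file.path(inputs_dir, "net.rds"))

    # Run several statistics on the same inputs, with per-method arguments.
    decouple(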
        mat = mat,
        network = net,
        .source = "source",
        .target = "target",
        statistics = c("gsva", "wmean", "wsum", "ulm", "aucell"),
        args = list(
            gsva = list(verbose = FALSE),
            wmean = list(.mor = "mor", .likelihood = "likelihood"),
            wsum = list(.mor = "mor"),
            ulm = list(.mor = "mor")
        ),
        minsize = 0
    )
}

saezlab/decoupleR documentation built on Oct. 21, 2024, 8:47 a.m.