run_genie3: Compute weighted adjacency matrix of inferred network


View source: R/run_genie3.R

Description

Compute the weighted adjacency matrix of the inferred network.

Usage

run_genie3(data, regulators, targets, num_candidate_genes = "sqrt",
  num_trees = 1000, max_interactions = 1e+05,
  importance_measure = "impurity", seed = NULL, verbose = T,
  scale_genes = T, parallel_type = 1, rf_package = "ranger", ...)

Arguments

data

A data frame of observations of the different genes, with one row per observation and one column per gene.

regulators

A set of indices or column names of entities whose observed values regulate the observed values of the targets.

targets

A set of indices or column names of entities whose observed values are regulated by the regulators.

num_candidate_genes

The choice of number of input genes randomly selected as candidates at each node. Must be "all" for all input features, "sqrt" for the square root of all input features (default), or an integer.

num_trees

The number of trees in the ensemble for each target gene (default 1000).

max_interactions

The maximum number of interactions to be returned by GENIE3.

importance_measure

Variable importance mode, one of 'impurity' or 'permutation'. The 'impurity' measure calculates the variance of the responses in each tree node, and the 'permutation' measure calculates the increase in mean squared error (MSE) after permuting the values of the regulators.

seed

A random number generator seed for replication of analyses. NULL means the seed is not set.

verbose

Whether to output additional information while running.

scale_genes

Whether the genes should be scaled. This is recommended in order to make importance values comparable to one another.

parallel_type

Either the number of threads to use for parallel execution, or a qsub_configuration object.

rf_package

Which Random Forests implementation to use. Currently 'ranger' and 'randomForest' are supported.

...

Extra parameters to be passed to the random forest. Which parameters are accepted depends on the package selected with rf_package.
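The 'permutation' importance described under importance_measure can be illustrated with a small base R sketch. This is not the code used by run_genie3: a linear model stands in for the random forest so the example runs without extra packages, and the toy variables (x1, x2, y) and the helper permutation_importance are hypothetical. Random forest implementations typically compute this quantity on out-of-bag samples rather than on the training data.

```r
# Toy data: the target y depends on regulator x1 only; x2 is pure noise.
set.seed(1)
n <- 200
x1 <- runif(n)
x2 <- runif(n)
y <- 2 * x1 + rnorm(n, sd = 0.1)
toy <- data.frame(y = y, x1 = x1, x2 = x2)

# Stand-in model: linear regression, NOT a random forest (assumption for brevity).
fit <- lm(y ~ x1 + x2, data = toy)

# Mean squared error of the fitted model on a data frame.
mse <- function(d) mean((d$y - predict(fit, newdata = d))^2)
baseline <- mse(toy)

# Hypothetical helper: increase in MSE after shuffling one regulator column.
permutation_importance <- function(regulator) {
  shuffled <- toy
  shuffled[[regulator]] <- sample(shuffled[[regulator]])
  mse(shuffled) - baseline
}

# A regulator that truly drives the target should get the larger value.
importance <- sapply(c("x1", "x2"), permutation_importance)
importance
```

Shuffling x1 breaks its link with y and the error rises sharply, while shuffling x2 changes the error very little. This gap is what the importance value is meant to capture.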

Value

The weighted adjacency matrix of the inferred network.
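The Examples section builds a comparison ranking as a long table with regulator, target and importance columns. Assuming a result in that long form, the following sketch shows one way to view it as a regulator-by-target weight matrix. The toy ranking and gene names are made up for illustration and are not output of run_genie3.

```r
# Hypothetical long-format result: one row per regulator-target pair.
ranking <- data.frame(
  regulator = c("g1", "g1", "g2"),
  target = c("g2", "g3", "g3"),
  importance = c(0.9, 0.4, 0.7),
  stringsAsFactors = FALSE
)
genes <- c("g1", "g2", "g3")

# Rows are regulators, columns are targets; pairs not listed keep weight 0.
adjacency <- matrix(
  0,
  nrow = length(genes), ncol = length(genes),
  dimnames = list(regulator = genes, target = genes)
)

# A two-column character matrix indexes cells by row and column name.
adjacency[cbind(ranking$regulator, ranking$target)] <- ranking$importance
adjacency
```

The matrix is generally not symmetric, since the weight for g1 regulating g2 is independent of the weight for g2 regulating g1.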

References

Huynh-Thu, V. A. et al. (2010) Inferring Regulatory Networks from Expression Data Using Tree-Based Methods. PLoS ONE.

Examples

library(GENIE3)
library(dplyr)

# generate random data
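# as.data.frame() names the columns V1..V100; as_tibble() replaces the deprecated as_data_frame()
data <- dplyr::as_tibble(as.data.frame(matrix(runif(100 * 100), ncol = 100)))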
genes <- colnames(data)
regulators <- genes[1:20]
targets <- genes

true_interactions <-
  expand.grid(
    regulator = factor(regulators, levels = genes),
    target = factor(targets, levels = genes)) %>%
  sample_n(100)

# run GENIE3
ranking <- run_genie3(data, regulators, targets)

# evaluate performance
eval <- evaluate_ranking(ranking, true_interactions, regulators, targets)
eval$area_under
plot_curves(eval)

# evaluate multiple rankings at the same time
ranking_cor <- cor(data[,regulators], data[,targets]) %>%
  reshape2::melt(varnames = c("regulator", "target"), value.name = "importance") %>%
  arrange(desc(importance)) %>%
  mutate(regulator = factor(as.character(regulator), levels = genes),
         target = factor(as.character(target), levels = genes)) %>%
  filter(regulator != target)
rankings <- list(GENIE3=ranking, Correlation=ranking_cor)
evals <- evaluate_multiple_rankings(rankings, true_interactions, regulators, targets)
evals$area_under
plot_curves(evals)

# run GENIE3 in parallel
ranking <- run_genie3(data, regulators, targets, parallel_type = 8)

# run GENIE3 with PRISM
ranking <- run_genie3(data, regulators, targets, parallel_type = PRISM::override_qsub_config())

# run GENIE3 with PRISM without waiting
handle <- run_genie3(data, regulators, targets, parallel_type = PRISM::override_qsub_config(wait = FALSE))
ranking <- retrieve_genie3_output(handle)

rcannood/GENIE3 documentation built on Jan. 28, 2021, 4:28 a.m.