run_genie3: Compute weighted adjacency matrix of inferred network
In rcannood/GENIE3: GENIE3: GEne Network Inference with Ensemble of Trees

Computes the weighted adjacency matrix of the inferred network.

run_genie3(data, regulators, targets, num_candidate_genes = "sqrt",
  num_trees = 1000, max_interactions = 1e+05,
  importance_measure = "impurity", seed = NULL, verbose = T,
  scale_genes = T, parallel_type = 1, rf_package = "ranger", ...)

`data`	A data frame of observations of the different genes, with one row per observation and one column per gene.
`regulators`	A set of indices or column names of entities whose observed values regulate the observed values of the targets.
`targets`	A set of indices or column names of entities whose observed values are regulated by the regulators.
`num_candidate_genes`	The number of input genes randomly selected as candidates at each tree node. Must be `"all"` for all input features, `"sqrt"` for the square root of the number of input features (default), or an integer.
`num_trees`	The number of trees in the ensemble for each target gene (default 1000).
`max_interactions`	The maximum number of interactions to be returned by GENIE3.
`importance_measure`	Variable importance mode, one of 'impurity' or 'permutation'. The 'impurity' measure calculates the variance of the responses in each tree node; the 'permutation' measure calculates the increase in MSE after permuting the regulators.
`seed`	A random number generator seed for replication of analyses. NULL means the seed is not set.
`verbose`	Whether to output additional information while running.
`scale_genes`	Whether the genes should be scaled. This is recommended in order to make importance values comparable to one another.
`parallel_type`	Either the number of threads to use for parallel execution, or a `qsub_configuration` object.
`rf_package`	Which Random Forests implementation to use. Currently 'ranger' and 'randomForest' are supported.
`...`	Extra parameters to be passed to the random forest. Which parameters are accepted depends on the package selected with `rf_package`.
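
To make the three accepted forms of `num_candidate_genes` concrete, the sketch below shows one way they could be resolved to a number of candidate regulators per tree node. `resolve_num_candidates` is a hypothetical helper written for illustration and is not part of the package. Rounding the square root and capping an explicit count at the number of regulators are both assumptions; the package and the underlying random forest implementation may behave differently.

```r
# Hypothetical helper (not part of GENIE3): map a num_candidate_genes value
# to the number of candidate regulators considered at each tree node.
resolve_num_candidates <- function(num_candidate_genes, num_regulators) {
  if (identical(num_candidate_genes, "all")) {
    # Every regulator is a candidate at every node.
    as.integer(num_regulators)
  } else if (identical(num_candidate_genes, "sqrt")) {
    # Assumption: round to the nearest integer, with a minimum of 1.
    max(1L, as.integer(round(sqrt(num_regulators))))
  } else if (is.numeric(num_candidate_genes) && length(num_candidate_genes) == 1) {
    # Assumption: an explicit count is capped at the number of regulators.
    as.integer(min(num_candidate_genes, num_regulators))
  } else {
    stop("num_candidate_genes must be \"all\", \"sqrt\", or an integer")
  }
}

# With 20 regulators, as in the examples on this page:
n_all  <- resolve_num_candidates("all", 20)
n_sqrt <- resolve_num_candidates("sqrt", 20)
n_five <- resolve_num_candidates(5, 20)
```

Smaller values make individual trees more random and faster to grow, while `"all"` lets every split consider every regulator.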

The weighted adjacency matrix of the inferred network.
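
The examples on this page handle rankings as long tables with `regulator`, `target`, and `importance` columns. To illustrate how such a table relates to a weighted adjacency matrix, the base R sketch below reshapes a small table into a regulators-by-targets matrix. The gene names and importance values are invented for the illustration, and the column layout is an assumption based on the `ranking_cor` object in the examples, not a statement about the exact object that `run_genie3` returns.

```r
# Toy interaction table; the values are invented for illustration.
interactions <- data.frame(
  regulator  = c("G1", "G1", "G2"),
  target     = c("G2", "G3", "G3"),
  importance = c(0.8, 0.1, 0.5),
  stringsAsFactors = FALSE
)

regulators <- c("G1", "G2")
targets <- c("G1", "G2", "G3")

# Start from an all-zero matrix: rows are regulators, columns are targets.
adjacency <- matrix(
  0,
  nrow = length(regulators), ncol = length(targets),
  dimnames = list(regulators, targets)
)

# A two-column character matrix indexes cells by (row name, column name).
adjacency[cbind(interactions$regulator, interactions$target)] <- interactions$importance

adjacency
```

Pairs that are absent from the table keep a weight of zero, which is one reasonable reading of an interaction list truncated by `max_interactions`.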

Huynh-Thu, V. A. et al. (2010) Inferring Regulatory Networks from Expression Data Using Tree-Based Methods. PLoS ONE.

library(GENIE3)
library(dplyr)

# generate random data
data <- dplyr::as_data_frame(matrix(runif(100 * 100), ncol = 100))
genes <- colnames(data)
regulators <- genes[1:20]
targets <- genes

true_interactions <-
  expand.grid(
    regulator = factor(regulators, levels = genes),
    target = factor(targets, levels = genes)) %>%
  sample_n(100)

# run GENIE3
ranking <- run_genie3(data, regulators, targets)

# evaluate performance
eval <- evaluate_ranking(ranking, true_interactions, regulators, targets)
eval$area_under
plot_curves(eval)

# evaluate multiple rankings at the same time
ranking_cor <- cor(data[,regulators], data[,targets]) %>%
  reshape2::melt(varnames = c("regulator", "target"), value.name = "importance") %>%
  arrange(desc(importance)) %>%
  mutate(regulator = factor(as.character(regulator), levels = genes),
         target = factor(as.character(target), levels = genes)) %>%
  filter(regulator != target)
rankings <- list(GENIE3=ranking, Correlation=ranking_cor)
evals <- evaluate_multiple_rankings(rankings, true_interactions, regulators, targets)
evals$area_under
plot_curves(evals)

# run GENIE3 in parallel
ranking <- run_genie3(data, regulators, targets, parallel_type = 8)

# run GENIE3 with PRISM
ranking <- run_genie3(data, regulators, targets, parallel_type = PRISM::override_qsub_config())

# run GENIE3 with PRISM without waiting
handle <- run_genie3(data, regulators, targets, parallel_type = PRISM::override_qsub_config(wait = FALSE))
ranking <- retrieve_genie3_output(handle)