learn_network: Learn network model

learn_network  R Documentation

Learn network model

Description

Learn a network model for a collection of variables.

Usage

learn_network(
  X,
  A = NA,
  method = "correlation",
  resampling_method = "stability_selection",
  numB = 100,
  cutoff = 0,
  pars = list(
    m = ncol(X),
    B = NA,
    alpha_stab = 0.05,
    alpha_pred = 0.05,
    size_weight = "linear",
    use_resampling = FALSE,
    prescreen_size = nrow(X) - 1,
    prescreen_type = "correlation",
    stab_test = "exact",
    pred_score = "mse",
    variable_importance = "scaled_coefficient"
  ),
  verbose = 0,
  cores = 1
)

Arguments

X

data matrix. Numeric matrix of size n times d, where columns correspond to individual variables.

A

stabilizing variable. Numeric vector of length n which can be interpreted as a factor.

method

specifies which method to use. "SR" for Stabilized Regression (both standard and predictive version), "SRstab" for only the standard version of SR, "SRpred" for only the predictive version of SR, "OLS" for linear OLS regression, "lasso" for Lasso and "correlation" for correlation test.

resampling_method

specifies which resampling method to use. Should be one of "none", "stability_selection" or "permutation".

numB

number of resamples to use.

cutoff

tuning parameter used in stability selection to determine which sets count as selected.

pars

list of additional parameters passed to Stabilized Regression. See StabilizedRegression for more details.

verbose

0 for no output, 1 for text output and 2 for text and diagnostic plots.

cores

number of cores to use in resampling step.

Details

Uses StabilizedRegression, OLS, Lasso or correlation tests (depending on method) to construct a node-wise network between all variables in X.
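
To illustrate what "node-wise" means here, the following is a minimal sketch in base R, not the package's implementation. It regresses each column of X on all remaining columns by ordinary least squares and stores the absolute coefficients as edge scores. The function name nodewise_ols is hypothetical, and the choice of absolute OLS coefficients as scores is an assumption made for illustration only.

```r
# Hypothetical helper (not part of StabilizedRegression): node-wise OLS scores.
nodewise_ols <- function(X) {
  d <- ncol(X)
  Amat <- matrix(0, d, d, dimnames = list(colnames(X), colnames(X)))
  for (j in seq_len(d)) {
    # Regress variable j on an intercept and all other variables.
    fit <- lm.fit(cbind(1, X[, -j, drop = FALSE]), X[, j])
    # Drop the intercept; Amat[i, j] scores the edge from i to j.
    Amat[-j, j] <- abs(fit$coefficients[-1])
  }
  Amat
}

set.seed(1)
X1 <- rnorm(200)
X2 <- X1 + rnorm(200)   # X2 depends on X1
X3 <- rnorm(200)        # X3 is independent of both
X_demo <- cbind(X1, X2, X3)
Amat_demo <- nodewise_ols(X_demo)
print(round(Amat_demo, 2))
```

In this toy data the score for the edge from X1 to X2 should be clearly larger than the score for the edge from X3 to X2, while the diagonal stays at zero because a variable is never regressed on itself.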

Value

A list consisting of the following elements:

Amat

adjacency matrix, where Amat[i,j] is a score (depending on the resampling_method) for the edge from i to j. For "stability_selection", scores are selection probabilities; for "permutation", scores are permutation p-values; and for "none", scores are the variable importance values of the chosen method.

p

Total number of potential edges which can be used to compute upper bound on false discovery rate (only computed if resampling_method == "stability_selection").

qest

Average number of selected edges in stability selection, which can be used to compute upper bound on false discovery rate (only computed if resampling_method == "stability_selection").

If method == "SR", the result is a list with two entries, SRstab and SRpred, each of which is a list of the form described above.
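
As a sketch of how p and qest might be combined, the snippet below evaluates the stability selection bound of Meinshausen and Bühlmann, E[V] <= qest^2 / ((2 * pi_thr - 1) * p), which bounds the expected number of falsely selected edges. Whether this is exactly the bound the package authors have in mind is an assumption. The helper name stability_bound, the threshold pi_thr on selection probabilities (which is not the cutoff argument), and the numeric inputs are all hypothetical.

```r
# Hypothetical helper (not part of StabilizedRegression).
# p:      total number of potential edges
# qest:   average number of selected edges per resample
# pi_thr: threshold on selection probabilities, must lie in (0.5, 1]
stability_bound <- function(p, qest, pi_thr) {
  stopifnot(pi_thr > 0.5, pi_thr <= 1)
  qest^2 / ((2 * pi_thr - 1) * p)
}

# Illustrative values only, not output of learn_network.
bound <- stability_bound(p = 6, qest = 2, pi_thr = 0.75)
print(bound)
```

With real output, p and qest would be taken from the returned list and the edges kept would be those with Amat at or above pi_thr.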

Author(s)

Niklas Pfister

Examples

## Example: three variables, where the noise in X3 shifts between the two levels of A
set.seed(1)
X1 <- rnorm(200)
X2 <- X1 + rnorm(200)
X3 <- 0.5 * X1 + X2 + 0.2 * c(rnorm(100), rnorm(100)+20)

X <- cbind(X1, X2, X3)
A <- as.factor(rep(c(0, 1), each=100))

network <- learn_network(X, A, method="SR", resampling_method="none")

## With method = "SR" the result has two entries (SRstab and SRpred)
print(network[[1]]$Amat)
print(network[[2]]$Amat)

StabilizedRegression documentation built on June 30, 2022, 9:06 a.m.