learn_network: Learn network model
In StabilizedRegression: Stabilizing Regression and Variable Selection

learn_network

R Documentation

Description

Learn a network model for a collection of variables.

Usage

learn_network(
  X,
  A = NA,
  method = "correlation",
  resampling_method = "stability_selection",
  numB = 100,
  cutoff = 0,
  pars = list(
    m = ncol(X),
    B = NA,
    alpha_stab = 0.05,
    alpha_pred = 0.05,
    size_weight = "linear",
    use_resampling = FALSE,
    prescreen_size = nrow(X) - 1,
    prescreen_type = "correlation",
    stab_test = "exact",
    pred_score = "mse",
    variable_importance = "scaled_coefficient"
  ),
  verbose = 0,
  cores = 1
)

Arguments

`X`	data matrix. Numeric matrix of size n times d, where columns correspond to individual variables.
`A`	stabilizing variable. Numeric vector of length n which can be interpreted as a factor.
`method`	specifies which method to use: "SR" for Stabilized Regression (both the standard and the predictive version), "SRstab" for only the standard version of SR, "SRpred" for only the predictive version of SR, "OLS" for linear OLS regression, "lasso" for Lasso, and "correlation" for a correlation test.
`resampling_method`	specifies which resampling method to use. Should be one of "none", "stability_selection" or "permutation".
`numB`	number of resamples to use.
`cutoff`	tuning parameter used in stability selection to determine which sets count as selected.
`pars`	list of additional parameters passed to StabilizedRegression (only relevant for the SR methods). See StabilizedRegression for more details.
`verbose`	0 for no output, 1 for text output and 2 for text and diagnostic plots.
`cores`	number of cores to use in resampling step.

Details

Uses StabilizedRegression, OLS, Lasso or correlation tests (as selected by method) to construct a node-wise network between all variables in X.
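
The node-wise idea can be illustrated with a small base-R sketch. It is not the package's implementation: the helper name nodewise_cor_network and the score 1 - p-value are assumptions chosen for illustration only. Each variable is treated in turn as the response, and every other variable is scored as a candidate neighbour using a correlation test.

```r
# Illustrative node-wise correlation network in base R.
# NOTE: hypothetical helper, not the code used by learn_network().
nodewise_cor_network <- function(X) {
  d <- ncol(X)
  Amat <- matrix(0, d, d, dimnames = list(colnames(X), colnames(X)))
  for (j in seq_len(d)) {      # variable j plays the role of the response
    for (i in seq_len(d)) {    # every other variable i is a candidate neighbour
      if (i != j) {
        pval <- cor.test(X[, i], X[, j])$p.value
        # Assumed score: 1 - p-value, so larger means stronger evidence
        Amat[i, j] <- 1 - pval
      }
    }
  }
  Amat
}

set.seed(1)
X1 <- rnorm(200)
X2 <- X1 + rnorm(200)          # depends on X1
X3 <- rnorm(200)               # generated independently of X1 and X2
X <- cbind(X1, X2, X3)

Amat <- nodewise_cor_network(X)
print(round(Amat, 3))
```

Because a plain correlation test is symmetric in its two arguments, this sketch yields a symmetric matrix; regression-based methods such as Lasso or Stabilized Regression generally do not.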

Value

A list consisting of the following elements:

`Amat`	adjacency matrix, where Amat[i,j] is a score (depending on resampling_method) for the edge from i to j. For "stability_selection" the scores are selection probabilities, for "permutation" they are permutation p-values, and for "none" they are the variable importances of the chosen method.
`p`	total number of potential edges, which can be used to compute an upper bound on the false discovery rate (only computed if resampling_method == "stability_selection").
`qest`	average number of selected edges in stability selection, which can be used to compute an upper bound on the false discovery rate (only computed if resampling_method == "stability_selection").

If method == "SR", the result is a list with two entries, SRstab and SRpred, each of which is a list of the form described above.
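
As a rough sketch of how p and qest might be combined, the classical stability selection bound of Meinshausen and Bühlmann limits the expected number of falsely selected edges by qest^2 / ((2 * pi_thr - 1) * p). Whether this is exactly the bound the package authors intend is an assumption; the helper stability_bound and the threshold pi_thr (a cutoff on the selection probabilities in Amat, between 0.5 and 1) are hypothetical names that are not part of the package, and pi_thr is not the cutoff argument above.

```r
# Classical stability selection bound on the expected number of false selections.
# NOTE: hypothetical helper; `pi_thr` is an assumed threshold on selection
# probabilities and is not an argument of learn_network().
stability_bound <- function(p, qest, pi_thr) {
  stopifnot(pi_thr > 0.5, pi_thr <= 1)
  qest^2 / ((2 * pi_thr - 1) * p)
}

# Toy numbers for illustration (not output of learn_network):
# 90 potential edges, 6 edges selected on average, threshold 0.9.
bound <- stability_bound(p = 90, qest = 6, pi_thr = 0.9)
print(bound)
```

A smaller qest or a larger pi_thr tightens the bound, at the price of selecting fewer edges.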

Author(s)

Niklas Pfister

Examples

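## Example
library(StabilizedRegression)
set.seed(1)

## Three variables; the noise in X3 is shifted by 20 in the last 100 samples
X1 <- rnorm(200)
X2 <- X1 + rnorm(200)
X3 <- 0.5 * X1 + X2 + 0.2 * c(rnorm(100), rnorm(100)+20)

X <- cbind(X1, X2, X3)
## Stabilizing variable: 0 for the first 100 samples, 1 for the last 100
A <- as.factor(rep(c(0, 1), each=100))

## method="SR" returns a list with two entries (SRstab and SRpred)
network <- learn_network(X, A, method="SR", resampling_method="none")

## With resampling_method="none" the scores are variable importances
print(network[[1]]$Amat)
print(network[[2]]$Amat)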
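
A score matrix such as Amat can be turned into an edge list by thresholding. The sketch below uses a hand-written toy matrix rather than real learn_network output, and the threshold of 0.8 is an arbitrary assumption. It treats larger scores as stronger evidence, which fits selection probabilities; for permutation p-values the comparison would have to be reversed.

```r
# Toy score matrix (hypothetical values, not produced by learn_network()).
# Entry [i, j] is the score for the edge from variable i to variable j.
vars <- c("X1", "X2", "X3")
Amat <- matrix(c(0.00, 0.95, 0.10,
                 0.90, 0.00, 0.85,
                 0.05, 0.20, 0.00),
               nrow = 3, byrow = TRUE,
               dimnames = list(vars, vars))

threshold <- 0.8  # assumed cutoff on the scores

# Row/column positions of all entries at or above the threshold
idx <- which(Amat >= threshold, arr.ind = TRUE)

# One row per retained edge
edges <- data.frame(from  = rownames(Amat)[idx[, "row"]],
                    to    = colnames(Amat)[idx[, "col"]],
                    score = Amat[idx])
print(edges)
```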

StabilizedRegression documentation built on June 30, 2022, 9:06 a.m.