View source: R/cvElasticNetSEM.R

elasticNetSEMcv    R Documentation

The Elastic Net penalty for SEM with user supplied (alphas, lambdas) for grid search

Description

The function elasticNetSEMcv lets users define their own grid search by combining a set of user-provided alphas and lambdas.

Usage

elasticNetSEMcv(Y, X, Missing, B, alpha_factors, lambda_factors, kFold, verbose)

Arguments

Y

The observed node response data with dimension of M (nodes) by N (samples). Y is normalized inside the function.

X

The network node attribute matrix with dimension of M by N. In theory, X can be an L by N matrix, with L being the total number of node attributes. In the current implementation, each node has exactly one attribute.
If some nodes have more than one attribute, consider selecting the top one by either correlation or principal component methods.
If no attribute is available for some nodes, fill the corresponding rows with zeros. See the yeast data 'yeast.rda' for an example.
X is normalized inside the function.
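
The correlation-based selection mentioned above could look like the following minimal sketch. The helper pick_top_attribute is hypothetical and not part of sparseSEM; it assumes that "top" means the candidate with the largest absolute Pearson correlation with the node's response.

```r
# Hypothetical helper (not part of sparseSEM): reduce several candidate
# attributes of one node to a single row of X.
# y:     numeric vector of length N, the node's response across samples
# cands: K by N matrix holding the K candidate attributes of that node
pick_top_attribute <- function(y, cands) {
  if (nrow(cands) == 0) return(rep(0, length(y)))  # no attribute: all-zero row
  r <- abs(as.vector(cor(t(cands), y)))            # |correlation| of each candidate with y
  r[is.na(r)] <- 0                                 # constant candidates give NA
  cands[which.max(r), ]
}

set.seed(1)
N <- 20
y <- rnorm(N)
# Three candidates; the second one is a noisy copy of y.
cands <- rbind(rnorm(N), y + rnorm(N, sd = 0.1), rnorm(N))
x_row <- pick_top_attribute(y, cands)
```

Stacking one such row per node gives an M by N matrix in the shape described for X.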

Missing

Optional M by N matrix corresponding to elements of Y. 0 denotes not missing, and 1 denotes missing. If a node i in sample j has the label missing (Missing[i,j] = 1), then Y[i,j] is set to 0.
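
As an illustration, a Missing matrix can be derived from NA entries in the raw data. This is a sketch on toy values; whether the function accepts NA in Y is not stated here, so the example replaces the NA entries with 0 up front as an assumption.

```r
# Toy 3-node by 4-sample response matrix with two unobserved entries (NA).
Y_raw <- matrix(c(1.2,  NA, 0.7, 2.1,
                  0.3, 0.9,  NA, 1.5,
                  1.1, 0.4, 0.8, 0.6),
                nrow = 3, byrow = TRUE)

# Missing has the same shape as Y: 1 where the value is missing, 0 otherwise.
Missing <- matrix(as.integer(is.na(Y_raw)), nrow = nrow(Y_raw))

# Set the missing responses to 0, following the convention described above.
Y <- Y_raw
Y[Missing == 1] <- 0
```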

B

Optional input. For a network with M nodes, B is the M by M adjacency matrix. If the data are simulated or the true network topology is otherwise known (i.e., the adjacency matrix is known), the Power of Detection (PD) and False Discovery Rate (FDR) are computed in the output parameter 'statistics'.

If the true network topology is unknown, B is optional, and the PD/FDR in output parameter 'statistics' should be ignored.

alpha_factors

The set of candidate alpha values. Default is seq(from = 0.95, to = 0.05, by = -0.05).

lambda_factors

The set of candidate lambda values. Default is 10^seq(from = 1, to = 0.001, by = -0.2).
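
Both candidate sets are plain numeric vectors, so they can be built with base R. The sketch below uses seq(), whose arguments are named from, to and by; the lambda exponent range is an illustrative choice, not a package default.

```r
# Candidate alphas: 0.95, 0.90, ..., 0.05.
alpha_factors <- seq(from = 0.95, to = 0.05, by = -0.05)

# Candidate lambdas on a log10 scale, largest first (illustrative range).
lambda_factors <- 10^seq(from = -1, to = -3, by = -0.5)

# Every (alpha, lambda) pair that a full grid search would visit.
grid <- expand.grid(alpha = alpha_factors, lambda = lambda_factors)
nrow(grid)
```

The size of the grid is length(alpha_factors) * length(lambda_factors), which gives a rough idea of the computational cost before running the cross validation.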

kFold

Number of folds for k-fold cross validation. Default is 5.
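
For readers unfamiliar with k-fold cross validation, the split can be sketched as follows. This is a generic illustration, not the package's internal code, and it assumes that the folds partition the N samples (the columns of Y and X).

```r
N     <- 23   # number of samples (columns of Y and X)
kFold <- 5

# Assign each sample to one of kFold folds of near-equal size, in random order.
fold_id <- sample(rep(seq_len(kFold), length.out = N))

for (k in seq_len(kFold)) {
  test_idx  <- which(fold_id == k)   # held-out samples for fold k
  train_idx <- which(fold_id != k)   # samples used to fit the model
  # A fit would use the columns in train_idx and be scored on test_idx.
  cat("fold", k, ":", length(train_idx), "train /", length(test_idx), "test\n")
}
```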

verbose

Verbosity level, from -1 to 10. Larger values print more information.

Details

The function performs cross validation (CV) and parameter inference, and calculates the power of detection and the FDR.

Value

cv

A data frame that stores the minimum Mean Square Error (MSE) for each alpha, together with the corresponding lambda from the selection path [lambda_max, ..., lambda_min].
col1: alpha
col2: lambda (for the given alpha, the lambda with minimum MSE)
col3: MSE
col4: STE (standard error)

The final (alpha, lambda) is the pair within 1 STE of min(MSE) that places the higher level of penalty on the likelihood function.
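
The one-STE selection can be sketched on a made-up cv table as below. The values are invented for illustration, and the ranking of "higher penalty" (larger lambda first, then larger alpha) is an assumption; the package's exact tie-breaking is not documented here.

```r
# Hypothetical cv table in the four-column layout described above.
cv <- data.frame(
  alpha  = c(0.75, 0.50, 0.25),
  lambda = c(0.010, 0.100, 0.010),
  MSE    = c(1.00, 1.04, 1.20),
  STE    = c(0.05, 0.06, 0.07)
)

best      <- which.min(cv$MSE)              # row with the smallest MSE
threshold <- cv$MSE[best] + cv$STE[best]    # min(MSE) + 1 STE
eligible  <- cv[cv$MSE <= threshold, ]      # rows within one STE of the minimum

# Assumption: rank the penalty level by lambda first, then by alpha.
chosen <- eligible[order(-eligible$lambda, -eligible$alpha), ][1, ]
chosen
```

In this toy table the row with the minimum MSE is not the one selected: the second row has a slightly larger MSE but a stronger penalty, so it is preferred.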

  • fit the model fit with the optimal (alpha, lambda) from cv

    • Bout the computed weights for the network topology. B[i,j] = 0 means there is no edge between node i and j; B[i,j] != 0 denotes an (undirected) edge between node i and j, with B[i,j] being the weight of the edge.

    • fout f is a 1 by M array keeping the weight for X (in SEM: Y = BY + FX + e). In theory, F can be an M by L matrix, with M being the number of nodes and L being the total number of node attributes. However, in the current implementation, each node has exactly one attribute. If some nodes have more than one attribute, consider selecting the top one by either correlation or principal component methods.

    • stat statistics is 1x6 array keeping record of:
      1. correct positive
      2. total positive
      3. false positive
      4. positive detected
      5. Power of detection (PD) = correct positive/total positive
      6. False Discovery Rate (FDR) = false positive/positive detected

    • simTime computational time

    • call the call that produced this object
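
The six quantities listed under stat can be reproduced from a true and an estimated adjacency matrix with the sketch below. The helper edge_stats is hypothetical and not part of sparseSEM; it assumes that any non-zero entry counts as an edge, which may differ from how the package counts edges internally.

```r
# Hypothetical helper (not part of sparseSEM): edge-detection summary.
edge_stats <- function(B_true, B_est) {
  true_edge <- B_true != 0   # edges in the known topology
  est_edge  <- B_est  != 0   # edges in the estimated topology
  correct_positive  <- sum(true_edge & est_edge)
  total_positive    <- sum(true_edge)
  false_positive    <- sum(!true_edge & est_edge)
  positive_detected <- sum(est_edge)
  c(correct_positive  = correct_positive,
    total_positive    = total_positive,
    false_positive    = false_positive,
    positive_detected = positive_detected,
    PD  = correct_positive / total_positive,
    FDR = false_positive / positive_detected)
}

# Toy 3-node example: 3 true edges, 3 detected edges, 2 of them correct.
B_true <- matrix(c(0, 1, 0,
                   0, 0, 2,
                   3, 0, 0), nrow = 3, byrow = TRUE)
B_est  <- matrix(c(0, 0.8, 0.1,
                   0, 0,   1.7,
                   0, 0,   0), nrow = 3, byrow = TRUE)
edge_stats(B_true, B_est)
```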

Note

Differences among the three functions:
1) elasticNetSML: default alphas from 0.95 to 0.05 in steps of -0.05; default 20 lambdas
2) elasticNetSEMcv: user-supplied alphas (one or more) and lambdas; computes the optimal parameters and the network parameters
3) elasticNetSMLpoint: user-supplied single alpha and single lambda; computes the network parameters

The user is responsible for setting the random seed to ensure reproducible results.
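
A minimal illustration of seeding in base R follows. It assumes that the random steps of the cross validation draw from R's random number generator, which the note implies but does not spell out; the seed value is arbitrary.

```r
# Same seed, same random draw: the two shuffles below are identical.
set.seed(2023)
first <- sample(10)

set.seed(2023)
second <- sample(10)

identical(first, second)
```

In practice, set.seed() would be called once, immediately before the call to elasticNetSEMcv.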

Author(s)

Anhui Huang; Dept of Electrical and Computer Engineering, Univ of Miami, Coral Gables, FL

References

1. Cai, X., Bazerque, J.A., and Giannakis, G.B. (2013). Inference of Gene Regulatory Networks with Sparse Structural Equation Models Exploiting Genetic Perturbations. PLoS Comput Biol 9, e1003068.
2. Huang, A. (2014). "Sparse model learning for inferring genotype and phenotype associations." Ph.D Dissertation. University of Miami(1186).


Examples

	library(sparseSEM)
	data(B);
	data(Y);
	data(X);
	data(Missing);
	## Not run: 
	OUT <- elasticNetSEMcv(Y, X, Missing, B, alpha_factors = c(0.75, 0.5, 0.25),
	                       lambda_factors = c(0.1, 0.01, 0.001), kFold = 5, verbose = 1);
	## End(Not run)

sparseSEM documentation built on Aug. 9, 2023, 5:07 p.m.