pcorSimulator: Partial Correlation Matrix simulator
In ldstatsHD: Linear Dependence Statistics for High-Dimensional Data

pcorSimulator creates a block-diagonal, positive definite precision matrix with one of three possible graph structures: hubs-based, power-law or random. It then generates samples from a multivariate normal distribution whose covariance matrix is the inverse of this precision matrix.

pcorSimulator(nobs, nclusters, nnodesxcluster, pattern = "powerLaw", 
              low.strength = 0.5, sup.strength = 0.9, nhubs = 5, 
              degree.hubs = 20, nOtherEdges = 30, alpha = 2.3, plus = 0, 
              prob = 0.05, perturb.clust = 0, mu = 0,
              probSign = 0.5, seed = sample(10000, nclusters))

`nobs`	number of observations.
`nclusters`	number of clusters or blocks of variables.
`nnodesxcluster`	number of nodes/variables per cluster.
`pattern`	graph structure pattern: a name that uniquely identifies one of `"hubs"`, `"powerLaw"` or `"random"`.
`low.strength`	minimum magnitude for nonzero partial correlation elements before regularization.
`sup.strength`	maximum magnitude for nonzero partial correlation elements before regularization.
`nhubs`	number of hubs per cluster (if `pattern = "hubs"`).
`degree.hubs`	degree of hubs (if `pattern = "hubs"`).
`nOtherEdges`	number of edges for non-hub nodes (if `pattern = "hubs"`).
`alpha`	positive coefficient for the Riemann zeta function in power-law distributions.
`plus`	power-law distribution added complexity (zero by default).
`prob`	probability of edge presence for random networks (if `pattern = "random"`).
`perturb.clust`	proportion of the total number of edges that connect two different clusters.
`mu`	vector of expected values used to generate the data (zero by default).
`probSign`	probability of positive sign for non-zero partial correlation coefficients. Thus, negative signs are obtained with probability `1-probSign`.
`seed`	vector with seeds for each cluster.

Hubs-based networks are graphs in which only a few nodes have a much higher degree (or connectivity) than the rest. Power-law networks assume that the variable p_k, which denotes the fraction of nodes in the network that have degree k, is given by a power-law distribution

p_k = \frac{k^{-α}}{\zeta(α)},

for k ≥ 1, a constant α > 0 and the normalizing function \zeta(α), which is the Riemann zeta function. Finally, random networks are also defined through the distribution of the proportion p_k. In this case, p_k follows a binomial distribution

p_k = {p\choose k} θ^k (1-θ)^{p-k},

where p is the number of nodes and the parameter θ determines the proportion of edges (or sparsity) in the graph.
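The two degree distributions above can be compared numerically. The sketch below is illustrative only and is not taken from the package source: the helper names `powerlaw_pk` and `binomial_pk` are hypothetical, the number of nodes is an assumed value, and the Riemann zeta function (which base R does not provide) is approximated by a finite sum over k = 1, ..., kmax, so the power-law probabilities are normalized over that range rather than by the exact zeta value.

```r
# Hypothetical helpers for illustration; not part of ldstatsHD.

# Power-law degree probabilities, p_k proportional to k^(-alpha), k = 1..kmax.
# Assumption: the zeta normalizer is replaced by the finite sum over 1..kmax.
powerlaw_pk <- function(alpha, kmax) {
  k <- seq_len(kmax)
  w <- k^(-alpha)
  w / sum(w)
}

# Binomial degree probabilities for k = 0..p, following the formula above.
binomial_pk <- function(p, theta) {
  dbinom(0:p, size = p, prob = theta)
}

alpha <- 2.3   # default value of `alpha` in the usage
theta <- 0.05  # default value of `prob` in the usage
p     <- 100   # assumed number of nodes, chosen for illustration

pl <- powerlaw_pk(alpha, kmax = p)
bi <- binomial_pk(p, theta)

# Probability that a node has degree 20 or more under each model
tail_pl <- sum(pl[20:p])         # power-law: entries for k = 20..p
tail_bi <- sum(bi[(20:p) + 1])   # binomial: index k + 1 because k starts at 0

# Draw illustrative degree sequences from each distribution
set.seed(1)
deg_pl <- sample(seq_len(p), size = p, replace = TRUE, prob = pl)
deg_bi <- rbinom(p, size = p, prob = theta)
```

With these settings the power-law tail probability `tail_pl` is much larger than the binomial tail probability `tail_bi`, which reflects the heavier tail of the power-law degree distribution compared with the binomial one.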

The regularization is given by Ω^{(1)} = Ω^{(0)} + δ I, where Ω^{(0)} is the precision matrix before regularization and δ is chosen such that the condition number of Ω^{(1)} is less than the number of nodes.
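As a rough illustration of the regularization step and the subsequent sampling, the sketch below shifts the diagonal of a toy symmetric matrix until its condition number falls below the number of nodes, then draws multivariate normal samples from the inverse. It is a minimal sketch under stated assumptions, not the package's internal code: `regularize_precision` and its `eps` argument are hypothetical, the closed-form choice of δ from the extreme eigenvalues is one possible way to meet the stated condition, and the Cholesky-based sampler is a standard technique that the package may or may not use.

```r
# Hypothetical helper for illustration; not the package's internal code.
regularize_precision <- function(omega0, eps = 1e-6) {
  p    <- nrow(omega0)
  ev   <- eigen(omega0, symmetric = TRUE, only.values = TRUE)$values
  lmax <- max(ev)
  lmin <- min(ev)
  # (lmax + delta) / (lmin + delta) equals p at delta = (lmax - p * lmin) / (p - 1);
  # any larger delta gives a condition number strictly below p.
  delta <- max(0, (lmax - p * lmin) / (p - 1)) + eps
  list(omega = omega0 + delta * diag(p), delta = delta)
}

# Toy symmetric matrix with a chain structure; it is not positive definite.
p <- 5
omega0 <- diag(p)
for (i in seq_len(p - 1)) {
  omega0[i, i + 1] <- omega0[i + 1, i] <- 0.6
}

reg   <- regularize_precision(omega0)
ev1   <- eigen(reg$omega, symmetric = TRUE, only.values = TRUE)$values
cond1 <- max(ev1) / min(ev1)   # condition number of the regularized matrix

# Sample from N(mu, Sigma) with Sigma = solve(Omega^(1)), using a Cholesky factor.
set.seed(1)
nobs  <- 50
mu    <- rep(0, p)
sigma <- solve(reg$omega)
z     <- matrix(rnorm(nobs * p), nrow = nobs, ncol = p)
y     <- sweep(z %*% chol(sigma), 2, mu, "+")   # rows are observations
```

Because the shift δ is added to every eigenvalue, the regularized matrix is positive definite whenever δ exceeds the negative of the smallest eigenvalue, and the sparsity pattern of the off-diagonal elements is unchanged.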

An object of class pcorSim containing the following components:

`y`	generated data set.
`hubs`	hub nodes position.
`edgesInGraph`	edges given by the non-zero elements in the precision matrix.
`omega`	precision matrix used to generate the data.
`covMat`	covariance matrix used to generate the data.
`path`	adjacency matrix corresponding to the non-zero structure of `omega`.
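The components listed above can be inspected directly from the returned object. The following sketch assumes that the ldstatsHD package is installed and that `omega` and `covMat` can be coerced to ordinary numeric matrices; the call itself uses only arguments documented in the usage. The conversion from a precision matrix to partial correlations uses the standard identity ρ_ij = -ω_ij / sqrt(ω_ii ω_jj) and is not claimed to be how the package computes them.

```r
# Requires the ldstatsHD package.
library(ldstatsHD)

sim <- pcorSimulator(nobs = 50, nclusters = 3, nnodesxcluster = c(100, 30, 50),
                     pattern = "powerLaw", plus = 0)

dim(sim$y)       # dimensions of the generated data set
dim(sim$omega)   # dimensions of the precision matrix

# Assumption: omega and covMat can be coerced to base numeric matrices.
omega  <- as.matrix(sim$omega)
covMat <- as.matrix(sim$covMat)

# Partial correlations implied by the precision matrix:
# rho_ij = -omega_ij / sqrt(omega_ii * omega_jj) for i != j.
pcor <- -cov2cor(omega)
diag(pcor) <- 1

# If covMat is the exact inverse of omega, this value is close to zero.
max(abs(covMat %*% omega - diag(nrow(omega))))
```

The last line is a quick consistency check between `omega` and `covMat`; a value far from zero would suggest that the two matrices are related by more than a plain inversion, for example by a rescaling.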

Caballe, Adria <a.caballe@sms.ed.ac.uk>, Natalia Bochkina and Claus Mayer.

Cai, T., W. Liu, and X. Luo (2011). A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association 106(494), 594-607.

Newman, M. (2003). The structure and function of complex networks. SIAM Review 45, 167-256.

Caballe, A., N. Bochkina, and C. Mayer (2016). Selection of the Regularization Parameter in Graphical Models using network characteristics. eprint arXiv:1509.05326, 1-25.

plot.pcorSim for graphical representation of the generated partial correlation matrix.
pcorSimulatorJoint for joint partial correlation matrix generation.

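# Examples of using the pcorSimulator function
library(ldstatsHD)

# Three power-law clusters of 100, 30 and 50 nodes, 50 observations
EX1 <- pcorSimulator(nobs = 50, nclusters = 3, nnodesxcluster = c(100, 30, 50),
                     pattern = "powerLaw", plus = 0)
print(EX1)

# Two power-law clusters of 60 and 40 nodes, 25 observations, with plus = 1
EX2 <- pcorSimulator(nobs = 25, nclusters = 2, nnodesxcluster = c(60, 40),
                     pattern = "powerLaw", plus = 1)
print(EX2)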