Partial Correlation Matrix simulator

Description

pcorSimulator creates a block diagonal, positive definite precision matrix with one of three possible graph structures: hubs-based, power-law or random. It then generates samples from a multivariate normal distribution whose covariance matrix is the inverse of that precision matrix.

Usage

pcorSimulator(nobs, nclusters, nnodesxcluster, pattern = "powerLaw", 
              low.strength = 0.5, sup.strength = 0.9, nhubs = 5, 
              degree.hubs = 20, nOtherEdges = 30, alpha = 2.3, plus = 0, 
              prob = 0.05, perturb.clust = 0, mu = 0,
              probSign = 0.5, seed = sample(10000, nclusters))

Arguments

nobs

number of observations.

nclusters

number of clusters or blocks of variables.

nnodesxcluster

number of nodes/variables per cluster.

pattern

graph structure pattern: a name that uniquely identifies one of "hubs", "powerLaw" or "random".

low.strength

minimum magnitude for nonzero partial correlation elements before regularization.

sup.strength

maximum magnitude for nonzero partial correlation elements before regularization.

nhubs

number of hubs per cluster (if pattern = "hubs").

degree.hubs

degree of hubs (if pattern = "hubs").

nOtherEdges

number of edges for non-hub nodes (if pattern = "hubs").

alpha

positive coefficient for the Riemann zeta function in power-law distributions.

plus

power-law distribution added complexity (zero by default).

prob

probability of edge presence for random networks (if pattern = "random").

perturb.clust

proportion of the total number of edges that connect two different clusters.

mu

vector of expected values used to generate the data (zero by default).

probSign

probability of positive sign for non-zero partial correlation coefficients. Thus, negative signs are obtained with probability 1-probSign.

seed

vector with seeds for each cluster.

Details

Hubs-based networks are graphs in which only a few nodes have a much higher degree (or connectivity) than the rest. Power-law networks assume that the variable p_k, which denotes the fraction of nodes in the network that have degree k, is given by a power-law distribution

p_k = \frac{k^{-α}}{\varsigma(α)},

for k ≥ 1, a constant α > 0 and the normalizing function \varsigma(α), which is the Riemann zeta function. Finally, random networks are also defined by the distribution of the proportion p_k. In this case, p_k follows a binomial distribution

p_k = {p\choose k} θ^k (1-θ)^{p-k},

where the parameter θ determines the proportion of edges (or sparsity) in the graph.
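The two degree distributions above can be sketched in base R. This is only an illustration, not the package's internal code: the helper names powerLawDegreeProb and binomialDegreeProb are hypothetical, and the Riemann zeta normalizer is replaced by a finite sum over k = 1, ..., kmax (an assumption that is natural for a finite network, where no node can have more than p - 1 neighbours).

```r
# Power-law degree probabilities: p_k proportional to k^(-alpha), k = 1, ..., kmax.
# Assumption: the zeta normalizer is approximated by a finite sum up to kmax.
powerLawDegreeProb <- function(alpha, kmax) {
  k <- seq_len(kmax)
  w <- k^(-alpha)
  w / sum(w)
}

# Binomial degree probabilities for a random graph: k = 0, ..., p,
# with theta controlling the proportion of edges (sparsity).
binomialDegreeProb <- function(p, theta) {
  dbinom(0:p, size = p, prob = theta)
}

pk_power  <- powerLawDegreeProb(alpha = 2.3, kmax = 99)
pk_random <- binomialDegreeProb(p = 99, theta = 0.05)

# Most of the power-law mass sits on low degrees, with a slowly decaying tail.
round(head(pk_power, 5), 3)
```

With these values the power-law probabilities decrease monotonically in k, whereas the binomial probabilities concentrate around p * θ.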

The regularization is given by Ω^{(1)} = Ω^{(0)} + δ I, with δ such that the condition number of Ω^{(1)} is less than the number of nodes.
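One way to pick δ so that the condition number falls below the number of nodes p can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily how pcorSimulator chooses δ: the matrix Omega0 is a made-up example, the helper regularizeOmega is hypothetical, and δ is taken just above the bound (λ_max - p λ_min) / (p - 1), which follows from requiring (λ_max + δ) / (λ_min + δ) < p. The sampling step then uses the Cholesky factor of the implied covariance matrix.

```r
set.seed(1)
p <- 6

# Hypothetical unregularized matrix Omega0: unit diagonal, 0.8 on the first
# off-diagonals. It is symmetric but not positive definite.
Omega0 <- diag(p)
Omega0[cbind(1:(p - 1), 2:p)] <- 0.8
Omega0[cbind(2:p, 1:(p - 1))] <- 0.8

# Hypothetical helper: add delta * I so that the condition number is below p.
regularizeOmega <- function(Omega, eps = 1e-6) {
  p <- nrow(Omega)
  ev <- eigen(Omega, symmetric = TRUE, only.values = TRUE)$values
  bound <- (max(ev) - p * min(ev)) / (p - 1)
  delta <- if (bound < 0) 0 else bound + eps
  list(Omega = Omega + delta * diag(p), delta = delta)
}

reg <- regularizeOmega(Omega0)
Omega1 <- reg$Omega
ev1 <- eigen(Omega1, symmetric = TRUE, only.values = TRUE)$values
cond1 <- max(ev1) / min(ev1)

# Sample from N(0, Sigma) with Sigma = solve(Omega1).
# chol() returns upper-triangular R with t(R) %*% R = Sigma, so the rows of
# Z %*% R have covariance Sigma.
Sigma <- solve(Omega1)
nobs <- 50
Z <- matrix(rnorm(nobs * p), nobs, p)
y <- Z %*% chol(Sigma)
```

Because δ is added to every eigenvalue, the regularized matrix is positive definite whenever δ exceeds the bound, and its off-diagonal pattern (the graph) is unchanged.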

Value

An object of class pcorSim containing the following components:

y

generated data set.

hubs

positions of the hub nodes.

edgesInGraph

edges given by the non-zero elements in the precision matrix.

omega

precision matrix used to generate the data.

covMat

covariance matrix used to generate the data.

path

adjacency matrix corresponding to the non-zero structure of omega.
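The relation between a precision matrix and the partial correlations it encodes, ρ_ij = -ω_ij / sqrt(ω_ii ω_jj), can be illustrated with base R. The matrix below is a made-up 3 x 3 example rather than output of pcorSimulator, and setting the diagonal of the adjacency matrix to zero is an assumption made for this sketch.

```r
# Hypothetical 3 x 3 precision matrix (symmetric, positive definite).
omega <- matrix(c( 2.0, -0.6, 0.0,
                  -0.6,  1.5, 0.5,
                   0.0,  0.5, 1.0), nrow = 3, byrow = TRUE)

# Partial correlations: rescale omega to unit diagonal and flip the sign of
# the off-diagonal entries.
pcor <- -cov2cor(omega)
diag(pcor) <- 1

# Adjacency matrix of the graph: 1 where omega has a non-zero off-diagonal entry.
adj <- (omega != 0) * 1
diag(adj) <- 0

pcor
adj
```

A zero entry in omega gives a zero partial correlation and no edge between the corresponding pair of variables.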

Author(s)

Caballe, Adria <a.caballe@sms.ed.ac.uk>, Natalia Bochkina and Claus Mayer.

References

Cai, T., W. Liu, and X. Luo (2011). A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association 106(494), 594-607.

Newman, M. (2003). The structure and function of complex networks. SIAM Review 45, 167-256.

Caballe, A., N. Bochkina, and C. Mayer (2016). Selection of the Regularization Parameter in Graphical Models using Network Characteristics. eprint arXiv:1509.05326, 1-25.

See Also

plot.pcorSim for graphical representation of the generated partial correlation matrix.
pcorSimulatorJoint for joint partial correlation matrix generation.

Examples

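# Example 1: three clusters of 100, 30 and 50 nodes with power-law structure
EX1 <- pcorSimulator(nobs = 50, nclusters = 3, nnodesxcluster = c(100, 30, 50),
                     pattern = "powerLaw", plus = 0)
print(EX1)

# Example 2: two clusters of 60 and 40 nodes, power-law structure with plus = 1
EX2 <- pcorSimulator(nobs = 25, nclusters = 2, nnodesxcluster = c(60, 40),
                     pattern = "powerLaw", plus = 1)
print(EX2)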
 
