joint partial correlation matrices simulator

Description

pcorSimulatorJoint creates two similar positive definite precision matrices with one of three possible graph structures: hubs-based, power-law or random. It also allows for three types of differential graph structure: random differences, clustered differences or a mixture of the two. It then generates (possibly dependent) datasets from a multivariate normal distribution whose covariance is defined by the inverse of these precision matrices.
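As a rough illustration of the data-generation step described above, the following base-R sketch converts a precision matrix to partial correlations and draws observations from the multivariate normal distribution defined by its inverse. The toy matrix `omega` and all variable names are hypothetical; this is not the package's internal code.

```r
# Toy 3 x 3 positive definite precision matrix (hypothetical values)
omega <- matrix(c( 1.0, -0.4,  0.0,
                  -0.4,  1.0, -0.3,
                   0.0, -0.3,  1.0), nrow = 3, byrow = TRUE)

# Partial correlations: rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)
pcor <- -cov2cor(omega)
diag(pcor) <- 1

# The covariance matrix is the inverse of the precision matrix
sigma <- solve(omega)

# Draw nobs observations from N(0, sigma) using the Cholesky factor;
# chol() returns an upper-triangular R with t(R) %*% R equal to sigma
set.seed(1)
nobs <- 200
z <- matrix(rnorm(nobs * ncol(sigma)), nrow = nobs)
D <- z %*% chol(sigma)
```

A zero entry in `omega` corresponds to a zero partial correlation, that is, a missing edge in the graph.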

Usage

pcorSimulatorJoint(nobs, nclusters, nnodesxcluster, pattern = "hubs", 
                   diffType = "cluster", dataDepend = "ind", low.strength = 0.5, 
                   sup.strength = 0.9, pdiff = 0, nhubs = 5, degree.hubs = 20,  
                   nOtherEdges = 30, alpha = 2.3, plus = 0, prob = 0.05, 
                   perturb.clust = 0, mu = 0, diagCCtype = "dicot", 
                   diagNZ.strength = .5, mixProb = 0.5, probSign = 0.5,  
                   exactZeroTh = 0.05, seed = sample(10000,nclusters+2))
                  

Arguments

nobs

number of observations.

nclusters

number of clusters or blocks of variables.

nnodesxcluster

number of nodes/variables per cluster.

pattern

graph structure pattern: name that uniquely identifies "hubs", "power" or "random".

diffType

pattern in differential edges: name that uniquely identifies "random", "cluster" or "mixed".

dataDepend

model used to describe the dependent structure for the data: name that uniquely identifies "ind" (no dependence), "diagOmega", "mult" or "add".

low.strength

minimum magnitude for nonzero partial correlation elements before regularization.

sup.strength

maximum magnitude for nonzero partial correlation elements before regularization.

pdiff

proportion of differential edges out of the total number of edges in each graph.

nhubs

number of hubs per cluster (if pattern = "hubs").

degree.hubs

degree of hubs (if pattern = "hubs").

nOtherEdges

number of edges for non-hub nodes (if pattern = "hubs").

alpha

positive coefficient for the Riemann function in power-law distributions.

plus

power-law distribution added complexity (zero by default).

prob

probability of edge existence for random networks (if pattern="random").

perturb.clust

proportion of the total number of edges that are connecting two different clusters.

mu

expected values vector to generate data (zero by default).

diagCCtype

way to generate diagonal values of either cross partial correlation matrix (if dataDepend = "diagOmega") or cross correlation matrix (if dataDepend = "mult" or dataDepend = "add"): name that uniquely identifies "dicot" or "beta13" (see details).

diagNZ.strength

magnitude for the non-zero elements in the diagonal of the cross (partial) correlation when diagCCtype = "dicot".

mixProb

proportion of random differential connections if diffType = "mixed". The remaining connections are given by a cluster type.

probSign

probability of positive sign for non-zero partial correlation coefficients. Thus, negative signs are obtained with probability 1-probSign.

exactZeroTh

partial correlation coefficients smaller than exactZeroTh are considered exact zeros.

seed

vector with seeds for each cluster.

Details

First, pcorSimulator is used to create a precision matrix common to the two populations. Then, differential edges are added according to one of the following two patterns (or a mixture of both): Cluster - a graph cluster is zero in one condition and non-zero in the other condition; Random - differential connections are placed randomly in the graph.

The paired structure is defined by the arguments dataDepend and diagCCtype. The additive (dataDepend = "add") and multiplicative (dataDepend = "mult") models act on the cross-covariance matrix such that Σ_{XY} = Δ Σ_X Δ^t and Σ_{XY} = Δ Σ_X^{1/2} Σ_Y^{1/2} Δ^t, respectively, where Δ is a diagonal matrix with 0 ≤ Δ_{ii} < 1 whose diagonal coefficients are defined by diagCCtype. A simplification (dataDepend = "diagOmega") is also considered, which assumes that variables in one data set are conditionally dependent only on the same variables in the other data set, hence a diagonal structure in the cross joint partial correlation matrix, which can also be defined by Δ. For all three models, if diagCCtype = "dicot" the diagonal elements of Δ have a zero/non-zero structure (with the non-zero coefficients given by diagNZ.strength). If diagCCtype = "beta13" the diagonal elements are generated from a beta distribution with shape parameters 1 and 3.
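The two cross-covariance formulas can be sketched in base R as follows. This is only an illustration under stated assumptions: the covariance matrices `sigma_x` and `sigma_y`, the helper `mat_sqrt`, and the 0.5 probability of a non-zero diagonal entry in the "dicot" case are all hypothetical choices, and "beta13" is read here as a Beta(1, 3) distribution.

```r
set.seed(42)
p <- 4

# Toy covariance matrices for the two populations (hypothetical values)
sigma_x <- 0.5^abs(outer(1:p, 1:p, "-"))
sigma_y <- 0.3^abs(outer(1:p, 1:p, "-"))

# "dicot": each diagonal entry of Delta is either 0 or diagNZ.strength
# (the 0.5 probability of a non-zero entry is an assumption)
diag_nz_strength <- 0.5
delta_dicot <- diag(diag_nz_strength * rbinom(p, size = 1, prob = 0.5))

# "beta13": diagonal entries of Delta drawn from Beta(1, 3)
delta_beta <- diag(rbeta(p, shape1 = 1, shape2 = 3))

# Symmetric matrix square root via the eigendecomposition (hypothetical helper)
mat_sqrt <- function(s) {
  e <- eigen(s, symmetric = TRUE)
  e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)
}

# First formula in the text: Delta Sigma_X Delta^t
sigma_xy_1 <- delta_dicot %*% sigma_x %*% t(delta_dicot)

# Second formula in the text: Delta Sigma_X^{1/2} Sigma_Y^{1/2} Delta^t
sigma_xy_2 <- delta_beta %*% mat_sqrt(sigma_x) %*% mat_sqrt(sigma_y) %*% t(delta_beta)
```

Either choice of Δ can be combined with either formula; the pairing above is arbitrary.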

Value

An object of class pcorSimJoint containing the following components:

D1

dataset for first population.

D2

dataset for second population.

omega1

precision matrix for first population.

omega2

precision matrix for second population.

P

total number of variables.

diffs

differential edges.

delta

generated values for the dependent structure.

covJ

joint covariance matrix used to generate the data.

path1

adjacency matrix corresponding to the non-zero structure of omega1.

path2

adjacency matrix corresponding to the non-zero structure of omega2.
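The relationship between two adjacency matrices and their differential edges can be illustrated with a small base-R sketch. The toy matrices below merely stand in for path1 and path2; the exact format of the diffs component is not specified here, so `diff_edges` is a hypothetical representation.

```r
# Toy symmetric 0/1 adjacency matrices standing in for path1 and path2
path1 <- matrix(0, 4, 4)
path2 <- matrix(0, 4, 4)
path1[1, 2] <- path1[2, 1] <- 1   # edge 1-2 is present in both graphs
path2[1, 2] <- path2[2, 1] <- 1
path1[3, 4] <- path1[4, 3] <- 1   # edge 3-4 only in the first graph
path2[2, 3] <- path2[3, 2] <- 1   # edge 2-3 only in the second graph

# An edge is differential when it is present in exactly one of the graphs
diff_adj <- (path1 != 0) != (path2 != 0)

# List each differential edge once, using the upper triangle
diff_edges <- which(diff_adj & upper.tri(diff_adj), arr.ind = TRUE)
```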

Author(s)

Caballe, Adria <a.caballe@sms.ed.ac.uk>, Natalia Bochkina and Claus Mayer.

References

Cai, T., W. Liu, and X. Luo (2011). A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association 106(494), 594-607.

Newman, M. (2003). The structure and function of complex networks. SIAM REVIEW 45, 167-256.

Wit, E. and A. Abbruzzo (2015, feb). Factorial graphical models for dynamic networks. Network Science 3(01), 37-57.

Caballe, A., N. Bochkina, and C. Mayer (2016). Selection of the Regularization Parameter in Graphical Models using network characteristics. eprint arXiv:1509.05326, 1-25.

See Also

pcorSimulator for precision matrix generator.
plot.pcorSimJoint for plotting joint partial correlation matrices.

Examples

# example of using the pcorSimulatorJoint function
# (assumes the ldstatsHD package, which provides it, is installed)
library(ldstatsHD)
EX1 <- pcorSimulatorJoint(nobs = 50, nclusters = 2, nnodesxcluster = c(30, 40), 
                          pattern = "pow", diffType = "cluster", dataDepend = "ind", 
                          pdiff = 0.2, diagCCtype = "dicot", diagNZ.strength = .5)
print(EX1)

EX2 <- pcorSimulatorJoint(nobs = 50, nclusters = 2, nnodesxcluster = c(30, 40), 
                          pattern = "pow", diffType = "rand", dataDepend = "diag", 
                          pdiff = 0.05, diagCCtype = "beta")
print(EX2)
