createS: Simulate sample covariances or datasets


View source: R/rags2ridgesFused.R

Description

Simulate data from a p-dimensional (zero-mean) Gaussian graphical model (GGM) with a specified (or random) topology and return the sample covariance matrix or matrices. The simulated data itself can be returned instead.

Usage

createS(n, p,
        topology = "identity",  # See details for other choices
        dataset = FALSE, precision = FALSE,
        nonzero = 0.25, m = 1L, banded.n = 2L,
        invwishart = FALSE, nu = p + 1, Plist)

Arguments

n

A numeric vector giving the number of samples. If the length is larger than 1, the covariance matrices are returned as a list.

p

A numeric of length 1 giving the dimension of the samples/covariance.

topology

character. The topology to use for the simulations. See the details.

dataset

A logical value specifying whether the sample covariance or the simulated data itself should be returned.

precision

A logical value. If TRUE the constructed precision matrix is returned.

nonzero

A numeric of length 1 giving the value of the nonzero entries used in some topologies.

m

An integer giving the number of blocks (i.e. conditionally independent components) to create. If m is greater than 1, then the given topology is used on m blocks of approximately equal size.

banded.n

An integer of length 1 giving the number of bands. Only used if topology is one of "banded", "small-world", or "Watts-Strogatz".

invwishart

logical. If TRUE the constructed precision matrix is used as the scale matrix of an inverse Wishart distribution and class covariance matrices are drawn from this distribution.

nu

numeric greater than p + 1 giving the degrees of freedom in the inverse Wishart distribution. A large nu implies high class homogeneity. A small nu near p + 1 implies high class heterogeneity.

Plist

An optional list of numeric matrices giving the precision matrices to simulate from. Useful when random matrices have already been generated by setting precision = TRUE.

Details

The data is simulated from a zero-mean p-dimensional multivariate Gaussian distribution whose precision matrix is determined by the argument topology, which defines the GGM. If precision is TRUE, the population precision matrix is returned instead. This is useful for inspecting the precision matrices that would actually be used in the simulation. The available values of topology are described below. Unless otherwise stated, the diagonal entries are always one. If m is 2 or greater, block diagonal precision matrices are constructed and used.
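The block diagonal construction for m >= 2 can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helpers block_diag_precision and clique_block are hypothetical names, the way p is split into blocks is one plausible reading of "approximately equal size", and the clique block (unit diagonal, all off-diagonal entries equal to nonzero) is an assumed form of that topology.

```r
# Hypothetical helper (not part of rags2ridges): assemble a p x p block
# diagonal matrix from m blocks of approximately equal size.
block_diag_precision <- function(p, m, make_block) {
  # Spread the remainder over the first blocks, e.g. p = 10, m = 3 -> 4, 3, 3
  sizes <- rep(p %/% m, m) + (seq_len(m) <= p %% m)
  end   <- cumsum(sizes)
  start <- end - sizes + 1
  P <- matrix(0, p, p)
  for (k in seq_len(m)) {
    idx <- start[k]:end[k]
    P[idx, idx] <- make_block(sizes[k])  # variables in different blocks stay at 0
  }
  P
}

# Assumed form of a "clique" block: unit diagonal, constant off-diagonal value.
clique_block <- function(q, nonzero = 0.25) {
  B <- matrix(nonzero, q, q)
  diag(B) <- 1
  B
}

P_blocks <- block_diag_precision(p = 10, m = 3, make_block = clique_block)
P_blocks
```

Zero entries between blocks correspond to conditional independence between the components, which is why each block can be treated as a separate sub-graph.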

When n has length greater than 1, the datasets are generated i.i.d. given the topology and number of blocks.
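The simulation step described above can be sketched in base R as follows. The helper rmvn_precision is a hypothetical name, the precision matrix values are illustrative only, and the use of cov() for the sample covariance is an assumption; the package may use a different estimator or sampler internally.

```r
set.seed(1)

# Hypothetical helper (not part of rags2ridges): draw n samples from a
# zero-mean Gaussian with precision matrix P, i.e. covariance solve(P).
rmvn_precision <- function(n, P) {
  p <- nrow(P)
  U <- chol(P)                          # P = t(U) %*% U, U upper triangular
  Z <- matrix(rnorm(n * p), nrow = p)   # p x n standard normal draws
  t(backsolve(U, Z))                    # solves U x = z, so Cov(x) = solve(P)
}

# Illustrative chain-like precision matrix with unit diagonal.
P <- diag(4)
P[cbind(1:3, 2:4)] <- 0.25
P[cbind(2:4, 1:3)] <- 0.25

# One independent dataset per entry of n, all from the same precision matrix.
n <- c(2000, 3000)
datasets <- lapply(n, rmvn_precision, P = P)
S_list   <- lapply(datasets, cov)       # list of p x p sample covariances

# Inverting a sample covariance should roughly recover P for large n.
round(solve(S_list[[1]]), 2)
```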

The arguments invwishart and nu allow for introducing class homogeneity. Large values of nu imply high class homogeneity. nu must be greater than p + 1. More precisely, if invwishart == TRUE then the constructed precision matrix is used as the scale parameter in an inverse Wishart distribution with nu degrees of freedom. The class covariance matrices are drawn independently from this inverse Wishart distribution.
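The effect of nu on class homogeneity can be illustrated with a base R sketch. The helper names are hypothetical, and the scaling of the scale matrix by (nu - p - 1), chosen here so that the inverse Wishart mean equals the target covariance, is an assumption of this sketch rather than a confirmed detail of createS. The draw relies on the standard fact that if W is Wishart with nu degrees of freedom and scale solve(Psi), then solve(W) is inverse Wishart with nu degrees of freedom and scale Psi.

```r
set.seed(2)

p <- 4
P <- diag(p)                      # illustrative precision matrix, unit diagonal
P[cbind(1:3, 2:4)] <- 0.25
P[cbind(2:4, 1:3)] <- 0.25
Sigma <- solve(P)                 # target (population) covariance

# Hypothetical helper: one inverse Wishart draw with df nu and scale Psi.
rinvwishart_one <- function(nu, Psi) {
  solve(stats::rWishart(1, df = nu, Sigma = solve(Psi))[, , 1])
}

# Assumption: scale by (nu - p - 1) so the mean of the draw is Sigma
# (the inverse Wishart mean is Psi / (nu - p - 1) for nu > p + 1).
draw_class_cov <- function(nu) rinvwishart_one(nu, (nu - p - 1) * Sigma)

# Largest entrywise deviation of one class covariance from Sigma.
max_dev <- function(nu) max(abs(draw_class_cov(nu) - Sigma))

dev_small_nu <- max_dev(p + 2)    # nu near p + 1: heterogeneous classes
dev_large_nu <- max_dev(1e6)      # large nu: draws concentrate around Sigma
c(small_nu = dev_small_nu, large_nu = dev_large_nu)
```

Under this scaling, draws with large nu are nearly identical to Sigma, while draws with nu close to p + 1 can differ substantially from it and from each other.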

Value

The returned type depends on n and dataset. The function generally returns a list of numeric matrices with the same length as n. If dataset is TRUE, the simulated dataset of size n[i] by p is given in the i'th entry of the output. If dataset is FALSE, the p by p sample covariances of the datasets are given. When n has length 1, the list structure is dropped and the matrix itself is returned.

Author(s)

Anders E. Bilgrau, Carel F.W. Peeters, Wessel N. van Wieringen

Examples

## Generate some simple sample covariance matrices
createS(n = 10, p = 3)
createS(n = c(3, 4, 5), p = 3)
createS(n = c(32, 55), p = 7)

## Generate some datasets and not sample covariance matrices
createS(c(3, 4), p = 6, dataset = TRUE)

## Generate sample covariance matrices from other topologies:
A <- createS(2000, p = 4, topology = "star")
round(solve(A), 3)
B <- createS(2000, p = 4, topology = "banded", banded.n = 2)
round(solve(B), 3)
C <- createS(2000, p = 4, topology = "clique")  # The complete graph (as m = 1)
round(solve(C), 3)
D <- createS(2000, p = 4, topology = "chain")
round(solve(D), 3)

## Generate sample covariance matrices from block topologies:
C3 <- createS(2000, p = 10, topology = "clique", m = 3)
round(solve(C3), 1)
C5 <- createS(2000, p = 10, topology = "clique", m = 5)
round(solve(C5), 1)

## Can also return the precision matrix to see what happens
## m = 2 blocks, each "banded" with 4 off-diagonal bands
round(createS(1, 12, "banded", m = 2, banded.n = 4, precision = TRUE), 2)

## Simulation using graph-games
round(createS(1, 10, "small-world", precision = TRUE), 2)
round(createS(1, 5, "scale-free", precision = TRUE), 2)
round(createS(1, 5, "random-graph", precision = TRUE), 2)

## Simulation using inverse Wishart distributed class covariance
## Low class homogeneity
createS(n = c(10,10), p = 5, "banded", invwishart = TRUE, nu = 10)
## Extremely high class homogeneity
createS(n = c(10,10), p = 5, "banded", invwishart = TRUE, nu = 1e10)

# The precision argument can again be used to see the actual realised class
# precision matrices used when invwishart = TRUE.

# The Plist argument is used to reuse old precision matrices or
# user-generated ones
P <- createS(n = 1, p = 5, "banded", precision = TRUE)
lapply(createS(n = c(1e5, 1e5), p = 5, Plist = list(P, P+1)), solve)

CFWP/rags2ridges documentation built on Sept. 23, 2017, 6:38 a.m.