ppca2Net: Network reconstruction from PPCA


View source: R/ppca2Net.R

Description

Constructs a conditional independence network among the observed variables, using the covariance matrix that PPCA estimates implicitly from the data.

Usage

ppca2Net(
  ppcaOutput,
  plot = TRUE,
  verbose = TRUE,
  vertex.size = 10,
  edge.width = 2,
  vertex.label.cex = 0.4,
  vertex.color = "cyan",
  vertex.label.color = "black",
  edge.color = "pink",
  vertex.label.family = "Helvetica",
  vertex.label = NULL
)

Arguments

ppcaOutput

list – the output object from running any of the PPCA functions in this package.

plot

logical – visualise the resulting network.

verbose

logical – print intermediate output while running.

vertex.size

see igraph.plotting

edge.width

see igraph.plotting

vertex.label.cex

see igraph.plotting

vertex.color

see igraph.plotting

vertex.label.color

see igraph.plotting

edge.color

see igraph.plotting

vertex.label.family

see igraph.plotting

vertex.label

see igraph.plotting

Details

The covariance matrix is estimated as a preliminary step, as part of the PPCA fit. The function inverts this matrix, which can be done very efficiently, to obtain the precision matrix. The precision matrix is then scaled to have a unit diagonal, so that its off-diagonal entries are partial correlation estimates; a partial correlation measures the dependence between two variables conditional on all the others. A two-component mixture model is fit to the distribution of partial correlations using fdrtool. Partial correlations that are not assigned to the 'null' component are selected as edges of the network, and the null values are effectively set to 0. If plot = TRUE, the function then visualises the resulting network using plot.igraph.

The fdr.stats element of the output holds the full output of fdrtool, from which the magnitude and significance of each partial correlation can be seen and custom thresholding can be performed. The graph element of the output is an ‘igraph’ object, so it can be used to make alternative visualisations or to compute graph statistics.
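The inversion and scaling steps can be illustrated with a minimal base R sketch. It assumes a PPCA-type covariance of the form WW^T + sigma^2 I and uses the Woodbury identity as one plausible reason the inversion is cheap; the source does not state how the package performs it. The names W, sigma.sq, prec.direct, prec.woodbury and pcor are illustrative, and the sign flip follows the standard definition of partial correlation rather than anything confirmed about the package internals.

```r
# Illustrative inputs (assumptions, not package output)
set.seed(1)
p <- 6   # number of observed variables
k <- 2   # number of latent components
W <- matrix(rnorm(p * k), p, k)
sigma.sq <- 0.5

# Model covariance: C = W W^T + sigma^2 I
C <- tcrossprod(W) + diag(sigma.sq, p)

# Direct inversion of the p x p matrix
prec.direct <- solve(C)

# Woodbury identity: only a k x k system has to be solved
M <- crossprod(W) + diag(sigma.sq, k)
prec.woodbury <- (diag(p) - W %*% solve(M, t(W))) / sigma.sq

# Scale the precision matrix to unit diagonal; the negated
# off-diagonal entries are the partial correlations
pcor <- -cov2cor(prec.direct)
diag(pcor) <- 1
```

Both routes give the same precision matrix up to numerical error; the second avoids inverting a p x p matrix, which matters when p is much larger than k.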

Value

A list of 2 elements:

graph

‘igraph’ – contains the network information.

fdr.stats

list – the full output of an internal call to fdrtool. Can be useful to inspect the statistics upon which the network was reconstructed.
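As a rough sketch of working with a network of this kind, the example below builds a toy thresholded partial-correlation matrix and computes node degrees from it. The matrix pcor.toy is a hypothetical stand-in for the returned graph, not output of ppca2Net; the igraph calls are guarded so that the base R part runs even when igraph is not installed.

```r
# Toy stand-in (assumption): partial correlations with null values set to 0
genes <- paste0("g", 1:4)
pcor.toy <- matrix(0, 4, 4, dimnames = list(genes, genes))
pcor.toy["g1", "g2"] <- pcor.toy["g2", "g1"] <- 0.6
pcor.toy["g2", "g3"] <- pcor.toy["g3", "g2"] <- -0.4

# Node degree straight from the non-zero pattern (base R)
adj <- pcor.toy != 0
deg.base <- rowSums(adj)

# The same statistics via igraph, if it is available
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_adjacency_matrix(pcor.toy, mode = "undirected",
                                           weighted = TRUE, diag = FALSE)
  deg.igraph <- igraph::degree(g)
  n.edges <- igraph::ecount(g)
}
```

With a real result, the same igraph functions could presumably be applied to the graph element directly, for example igraph::degree(pcanet$graph).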

References

Strimmer, K., 2008. link.

Strimmer, K., 2008. doi.

Csardi, G. and Nepusz, T., 2006. link.

See Also

igraph, fdrtool

Examples

# simulate a dataset from a zero mean factor model X = Wz + epsilon
# start off by generating a random binary connectivity matrix
n.factors <- 5
n.genes <- 200
# with dense connectivity
# set.seed(20)
conn.mat <- matrix(rbinom(n = n.genes*n.factors,
                          size = 1, prob = 0.7), c(n.genes, n.factors))

# now generate a loadings matrix from this connectivity
loading.gen <- function(x){
  ifelse(x==0, 0, rnorm(1, 0, 1))
}

W <- apply(conn.mat, c(1, 2), loading.gen)

# generate factor matrix
n.samples <- 100
z <- replicate(n.samples, rnorm(n.factors, 0, 1))

# generate a noise matrix
sigma.sq <- 0.1
epsilon <- replicate(n.samples, rnorm(n.genes, 0, sqrt(sigma.sq)))

# by the ppca equations this gives us the data matrix
X <- W%*%z + epsilon
# true covariance implied by the model, Sigma = WW^T + sigma^2 I (kept for reference)
WWt <- tcrossprod(W)
Sigma <- WWt + diag(sigma.sq, n.genes)

# select 10% of entries to make missing values
missFrac <- 0.1
inds <- sample(x = seq_along(X),
               size = ceiling(length(X)*missFrac),
               replace = FALSE)

# replace them with NAs in the dataset
missing.dataset <- X
missing.dataset[inds] <- NA

# run ppca
ppf <- pca_full(missing.dataset, ncomp=5, algorithm="vb", maxiters=5,
                bias=TRUE, rotate2pca=FALSE, loglike=TRUE, verbose=TRUE)

# compute the network
pcanet <- ppca2Net(ppf, plot=TRUE)

HGray384/pcaNet documentation built on Nov. 14, 2020, 11:11 a.m.