qpPAC: Estimation of partial correlation coefficients
In rcastelo/qpgraph: Estimation of Genetic and Molecular Regulatory Networks from High-Throughput Genomics Data

qpPAC

R Documentation

Estimation of partial correlation coefficients

Description

Estimates partial correlation coefficients (PACs) for a Gaussian graphical model with undirected graph G, together with their corresponding p-values for the null hypothesis of zero partial correlation.

Usage

## S4 method for signature 'ExpressionSet'
qpPAC(X, g, return.K=FALSE, tol=0.001,
                                matrix.completion=c("HTF", "IPF"), verbose=TRUE,
                                R.code.only=FALSE)
## S4 method for signature 'data.frame'
qpPAC(X, g, return.K=FALSE, long.dim.are.variables=TRUE,
                             tol=0.001, matrix.completion=c("HTF", "IPF"),
                             verbose=TRUE, R.code.only=FALSE)
## S4 method for signature 'matrix'
qpPAC(X, g, return.K=FALSE, long.dim.are.variables=TRUE,
                         tol=0.001, matrix.completion=c("HTF", "IPF"),
                         verbose=TRUE, R.code.only=FALSE)

Arguments

`X`	data set from which to estimate the partial correlation coefficients. It can be an ExpressionSet object, a data frame or a matrix.
`g`	either a `qpGraph` object, or a `graphNEL`, `graphAM` or `graphBAM` object, or an adjacency matrix of an undirected graph.
`return.K`	logical; if TRUE this function also returns the concentration matrix `K`; if FALSE it does not return it (default).
`long.dim.are.variables`	logical; if TRUE it is assumed that when `X` is a data frame or a matrix, the longer dimension is the one defining the random variables (default); if FALSE, then random variables are assumed to be at the columns of the data frame or matrix.
`tol`	maximum tolerance in the application of the IPF algorithm.
`matrix.completion`	algorithm used in the matrix completion operations that construct a positive definite matrix with the zero pattern specified in `g`.
`verbose`	show progress on the calculations.
`R.code.only`	logical; if FALSE then the faster C implementation is used (default); if TRUE then only R code is executed.
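
The orientation rule described for `long.dim.are.variables` can be illustrated with a minimal base R sketch. The helper `orient()` below is hypothetical and is not part of qpgraph; it only mimics the documented behaviour under the assumption that the result should have random variables in the columns.

```r
## Hypothetical helper (not a qpgraph function): returns X with the
## random variables in the columns, following the rule documented for
## the argument long.dim.are.variables.
orient <- function(X, long.dim.are.variables=TRUE) {
  ## when the longer dimension defines the variables and that dimension
  ## is the rows, transpose so that variables end up in the columns
  if (long.dim.are.variables && nrow(X) > ncol(X)) t(X) else X
}

X1 <- matrix(0, nrow=30, ncol=50)  ## 30 observations, 50 variables
X2 <- matrix(0, nrow=50, ncol=30)  ## same data stored transposed

dim(orient(X1))                                ## left as is
dim(orient(X2))                                ## transposed
dim(orient(X2, long.dim.are.variables=FALSE))  ## columns taken as variables
```

In this sketch a square input is left untouched, which is an arbitrary choice made only for the illustration.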

Details

In the context of maximum likelihood estimation (MLE) of PACs, a necessary condition for the existence of MLEs is that the sample size n is larger than the clique number w(G) of the graph G. If the sample size n is larger than the maximum boundary bd(G) of the input graph, then the default matrix completion algorithm HTF by Hastie, Tibshirani and Friedman (2009) can be used (see the function qpHTF() for details), which has the advantage of being faster than IPF (see the function qpIPF() for details).
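
As a rough illustration of the sample size condition for HTF, the following base R sketch compares a sample size with the size of the largest vertex boundary of a small graph. It assumes that bd(G) denotes the largest number of vertices adjacent to any single vertex; the toy graph and the object names are made up for the example.

```r
## Toy undirected graph on 5 vertices given by its adjacency matrix
A.toy <- matrix(FALSE, nrow=5, ncol=5)
edges <- rbind(c(1, 2), c(1, 3), c(1, 4), c(2, 3), c(4, 5))
A.toy[edges] <- TRUE
A.toy <- A.toy | t(A.toy)   ## make the adjacency matrix symmetric

## size of the largest vertex boundary (assumed meaning of bd(G))
max.bd <- max(rowSums(A.toy))

n.toy <- 4                  ## hypothetical sample size
n.toy > max.bd              ## TRUE suggests HTF could be applied
```

This check says nothing about the clique number w(G), which requires a separate computation.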

The PAC estimation is done by first obtaining an MLE of the covariance matrix using the qpIPF function. The p-values are then calculated from estimates of the standard errors (see Roverato and Whittaker, 1996) by performing Wald tests based on the asymptotic chi-squared distribution.
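
The general shape of a Wald test of this kind can be sketched in base R as follows. This is not the code used inside qpPAC; the helper `wald.pvalue()` is hypothetical, and the use of one degree of freedom per coefficient is an assumption made for the illustration.

```r
## Hypothetical helper (not a qpgraph function): p-value of a Wald test
## for the null hypothesis that a single parameter equals zero, given
## its estimate and standard error.
wald.pvalue <- function(estimate, se) {
  w <- (estimate / se)^2                 ## Wald statistic
  pchisq(w, df=1, lower.tail=FALSE)      ## asymptotic chi-squared, 1 df (assumed)
}

## made-up estimate and standard error
wald.pvalue(estimate=0.3, se=0.1)
```

With one degree of freedom this is equivalent to a two-sided test based on the standard normal distribution.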

Value

A list with two matrices, one with the estimates of the PACs and the other with their p-values. If return.K=TRUE, the MLE of the inverse covariance (concentration) matrix K is also returned as part of the list.
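
The relationship between a concentration matrix K and the partial correlation coefficients can be sketched in base R using the standard identity rho_ij = -k_ij / sqrt(k_ii k_jj). The helper `k2pac()` and the toy covariance matrix below are hypothetical and only illustrate that identity; they are not taken from qpgraph.

```r
## Hypothetical helper (not a qpgraph function): converts a concentration
## (inverse covariance) matrix K into a matrix of partial correlations.
k2pac <- function(K) {
  R <- -cov2cor(K)   ## -k_ij / sqrt(k_ii * k_jj) off the diagonal
  diag(R) <- 1       ## conventional unit diagonal
  R
}

## Toy covariance matrix in which variables 1 and 3 are conditionally
## independent given variable 2
S.toy <- matrix(c(1.00, 0.50, 0.25,
                  0.50, 1.00, 0.50,
                  0.25, 0.50, 1.00), nrow=3, byrow=TRUE)
K.toy <- solve(S.toy)
R.toy <- k2pac(K.toy)
round(R.toy, digits=3)
```

The entry for the pair (1, 3) is zero up to numerical error, matching the zero in the corresponding position of the concentration matrix.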

Author(s)

R. Castelo and A. Roverato

References

Castelo, R. and Roverato, A. A robust procedure for Gaussian graphical model search from microarray data with p larger than n. J. Mach. Learn. Res., 7:2621-2650, 2006.

Castelo, R. and Roverato, A. Reverse engineering molecular regulatory networks from microarray data with qp-graphs. J. Comp. Biol., 16(2):213-227, 2009.

Hastie, T., Tibshirani, R. and Friedman, J.H. The Elements of Statistical Learning, Springer, 2009.

Roverato, A. and Whittaker, J. Standard errors for the parameters of graphical Gaussian models. Stat. Comput., 6:297-302, 1996.

Examples

require(qpgraph) ## provides qpRndGraph, qpG2Sigma, qpNrr, qpGraph and qpPAC
require(mvtnorm) ## provides rmvnorm

nVar <- 50  ## number of variables
maxCon <- 5 ## maximum connectivity per variable
nObs <- 30  ## number of observations to simulate

set.seed(123)

A <- qpRndGraph(p=nVar, d=maxCon)
Sigma <- qpG2Sigma(A, rho=0.5)
X <- rmvnorm(nObs, sigma=as.matrix(Sigma))

nrr.estimates <- qpNrr(X, verbose=FALSE)

qpg <- qpGraph(nrr.estimates, epsilon=0.5)
qpg$g

pac.estimates <- qpPAC(X, g=qpg, verbose=FALSE)

## distribution of the absolute values of the estimated
## partial correlation coefficients of the present edges
summary(abs(pac.estimates$R[upper.tri(pac.estimates$R) & A]))

## distribution of the absolute values of the estimated
## partial correlation coefficients of the missing edges
summary(abs(pac.estimates$R[upper.tri(pac.estimates$R) & !A]))

rcastelo/qpgraph documentation built on June 14, 2025, 6:39 p.m.