qpPAC: Estimation of partial correlation coefficients

qpPAC    R Documentation

Estimation of partial correlation coefficients

Description

Estimates partial correlation coefficients (PACs) for a Gaussian graphical model with undirected graph G, together with their p-values for the null hypothesis of zero partial correlation.

Usage

## S4 method for signature 'ExpressionSet'
qpPAC(X, g, return.K=FALSE, tol=0.001,
                                matrix.completion=c("HTF", "IPF"), verbose=TRUE,
                                R.code.only=FALSE)
## S4 method for signature 'data.frame'
qpPAC(X, g, return.K=FALSE, long.dim.are.variables=TRUE,
                             tol=0.001, matrix.completion=c("HTF", "IPF"),
                             verbose=TRUE, R.code.only=FALSE)
## S4 method for signature 'matrix'
qpPAC(X, g, return.K=FALSE, long.dim.are.variables=TRUE,
                         tol=0.001, matrix.completion=c("HTF", "IPF"),
                         verbose=TRUE, R.code.only=FALSE)

Arguments

X

data set from which to estimate the partial correlation coefficients. It can be an ExpressionSet object, a data frame or a matrix.

g

either a qpGraph object, or a graphNEL, graphAM or graphBAM object, or an adjacency matrix of an undirected graph.

return.K

logical; if TRUE this function also returns the concentration matrix K; if FALSE it does not return it (default).

long.dim.are.variables

logical; if TRUE it is assumed that, when X is a data frame or a matrix, the longer dimension is the one defining the random variables (default); if FALSE, the random variables are assumed to be in the columns of the data frame or matrix.

tol

maximum tolerance in the application of the IPF algorithm.

matrix.completion

algorithm used in the matrix completion operations that construct a positive definite matrix with the zero pattern specified in g; either "HTF" (default) or "IPF".

verbose

logical; if TRUE, progress on the calculations is shown (default).

R.code.only

logical; if FALSE then the faster C implementation is used (default); if TRUE then only R code is executed.
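The effect of long.dim.are.variables described above can be sketched in base R. The helper variablesInColumns() below is hypothetical and is not part of qpgraph; it only illustrates the stated rule, under the assumption that the data are then handled with the random variables in the columns.

```r
## Hypothetical helper (not part of qpgraph): return X with variables in columns
variablesInColumns <- function(X, long.dim.are.variables=TRUE) {
  X <- as.matrix(X)
  ## transpose only when the longer dimension is the rows
  if (long.dim.are.variables && nrow(X) > ncol(X))
    X <- t(X)
  X
}

## 50 variables in the rows, 30 observations in the columns
X <- matrix(rnorm(50 * 30), nrow=50, ncol=30)
dim(variablesInColumns(X))                                ## 30 50
dim(variablesInColumns(X, long.dim.are.variables=FALSE))  ## 50 30
```

With the default setting, a 30 x 50 matrix of 30 observations on 50 variables, such as the one simulated in the Examples section, is left unchanged because its longer dimension already lies along the columns.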

Details

In the context of maximum likelihood estimation (MLE) of PACs, a necessary condition for the existence of the MLEs is that the sample size n is larger than the clique number w(G) of the graph G. If the sample size n is larger than the maximum boundary bd(G) of the input graph, then the default matrix completion algorithm HTF by Hastie, Tibshirani and Friedman (2009) can be used (see the function qpHTF() for details), which has the advantage of being faster than IPF (see the function qpIPF() for details).
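As a rough illustration of the sample size condition for HTF, the maximum boundary can be computed from an adjacency matrix in base R, assuming that the boundary of a vertex is its set of neighbours, so that bd(G) is the largest vertex degree. The graph and the sample size below are made up for the sketch; the clique number w(G) is not computed here (see qpCliqueNumber).

```r
## Hypothetical undirected graph on 4 vertices, stored as a symmetric
## logical adjacency matrix (illustrative data only)
v <- c("a", "b", "c", "d")
A <- matrix(FALSE, nrow=4, ncol=4, dimnames=list(v, v))
edges <- rbind(c("a", "b"), c("b", "c"), c("c", "d"), c("b", "d"))
A[edges] <- TRUE
A <- A | t(A)  ## make the matrix symmetric

## size of the largest boundary, i.e., the largest number of neighbours
max.boundary <- max(rowSums(A))
max.boundary  ## 3, attained by vertex "b"

nObs <- 30           ## assumed sample size
nObs > max.boundary  ## TRUE: the condition stated above for HTF holds
```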

The PAC estimation is done by first obtaining an MLE of the covariance matrix using the qpIPF function. The p-values are then calculated from estimates of the standard errors (see Roverato and Whittaker, 1996) by performing Wald tests based on the asymptotic chi-squared distribution.
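The relation between a covariance matrix, its inverse K and the PACs, as well as the form of a Wald test, can be sketched in base R as follows. This is a generic illustration and not the qpPAC implementation: the covariance matrix is a toy example, the function waldPvalue() is hypothetical, and the standard error must be supplied by the user (the estimator of Roverato and Whittaker is not reproduced here).

```r
## Toy covariance matrix with an autoregressive pattern (illustrative values)
Sigma <- matrix(c(1.00, 0.50, 0.25,
                  0.50, 1.00, 0.50,
                  0.25, 0.50, 1.00), nrow=3, byrow=TRUE)

K <- solve(Sigma)  ## concentration (inverse covariance) matrix

## partial correlation between i and j given the rest:
## rho_ij = -k_ij / sqrt(k_ii * k_jj)
R <- -cov2cor(K)
diag(R) <- 1
round(R, 4)  ## entry [1,3] is 0: variables 1 and 3 are independent given 2

## Hypothetical helper: Wald test of H0 "parameter = 0" given an estimate
## and its standard error, using the chi-squared distribution with 1 df
waldPvalue <- function(estimate, se) {
  pchisq((estimate / se)^2, df=1, lower.tail=FALSE)
}
waldPvalue(1.959964, 1)  ## approximately 0.05
```

A zero in K corresponds to a zero partial correlation, which is why the zero pattern of the graph g is imposed on the estimated inverse covariance matrix.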

Value

A list with two matrices, one with the estimates of the PACs and the other with their p-values. If return.K=TRUE, the MLE of the inverse covariance (concentration) matrix K is also returned as part of the list.

Author(s)

R. Castelo and A. Roverato

References

Castelo, R. and Roverato, A. A robust procedure for Gaussian graphical model search from microarray data with p larger than n. J. Mach. Learn. Res., 7:2621-2650, 2006.

Castelo, R. and Roverato, A. Reverse engineering molecular regulatory networks from microarray data with qp-graphs. J. Comp. Biol., 16(2):213-227, 2009.

Hastie, T., Tibshirani, R. and Friedman, J.H. The Elements of Statistical Learning, Springer, 2009.

Roverato, A. and Whittaker, J. Standard errors for the parameters of graphical Gaussian models. Stat. Comput., 6:297-302, 1996.

See Also

qpGraph, qpCliqueNumber, qpClique, qpGetCliques, qpIPF

Examples

library(qpgraph)  ## provides qpRndGraph, qpG2Sigma, qpNrr, qpGraph and qpPAC
library(mvtnorm)  ## provides rmvnorm

nVar <- 50  ## number of variables
maxCon <- 5 ## maximum connectivity per variable
nObs <- 30  ## number of observations to simulate

set.seed(123)

A <- qpRndGraph(p=nVar, d=maxCon)
Sigma <- qpG2Sigma(A, rho=0.5)
X <- rmvnorm(nObs, sigma=as.matrix(Sigma))

nrr.estimates <- qpNrr(X, verbose=FALSE)

qpg <- qpGraph(nrr.estimates, epsilon=0.5)
qpg$g

pac.estimates <- qpPAC(X, g=qpg, verbose=FALSE)

## distribution of the absolute values of the estimated
## partial correlation coefficients of the present edges
summary(abs(pac.estimates$R[upper.tri(pac.estimates$R) & A]))

## distribution of the absolute values of the estimated
## partial correlation coefficients of the missing edges
summary(abs(pac.estimates$R[upper.tri(pac.estimates$R) & !A]))

rcastelo/qpgraph documentation built on Oct. 28, 2024, 5:15 a.m.