qpCItest: Conditional independence test
In qpgraph: Estimation of genetic and molecular regulatory networks from high-throughput genomics data


Performs a conditional independence test between two variables given a conditioning set.

## S4 method for signature 'ExpressionSet'
qpCItest(X, i=1, j=2, Q=c(), exact.test=TRUE, use=c("complete.obs", "em"),
         tol=0.01, R.code.only=FALSE)
## S4 method for signature 'cross'
qpCItest(X, i=1, j=2, Q=c(), exact.test=TRUE, use=c("complete.obs", "em"),
         tol=0.01, R.code.only=FALSE)
## S4 method for signature 'data.frame'
qpCItest(X, i=1, j=2, Q=c(), I=NULL, long.dim.are.variables=TRUE,
         exact.test=TRUE, use=c("complete.obs", "em"), tol=0.01,
         R.code.only=FALSE)
## S4 method for signature 'matrix'
qpCItest(X, i=1, j=2, Q=c(), I=NULL, long.dim.are.variables=TRUE,
         exact.test=TRUE, use=c("complete.obs", "em"), tol=0.01,
         R.code.only=FALSE)
## S4 method for signature 'SsdMatrix'
qpCItest(X, i=1, j=2, Q=c(), R.code.only=FALSE)

`X`	data set where the test should be performed. It can be either an `ExpressionSet` object, a `qtl::cross` object, a data frame, a matrix or an `SsdMatrix-class` object. In the latter case, the input matrix should correspond to a sample covariance matrix of data on which we want to test for conditional independence. The function `qpCov()` can be used to estimate such matrices.
`i`	index or name of one of the two variables in `X` to test.
`j`	index or name of the other variable in `X` to test.
`Q`	indexes or names of the variables in `X` forming the conditioning set.
`I`	indexes or names of the variables in `X` that are discrete. See details below regarding this argument.
`long.dim.are.variables`	logical; if `TRUE` (default), it is assumed that when data are in a data frame or a matrix, the longer dimension is the one defining the random variables; if `FALSE`, random variables are assumed to be in the columns of the data frame or matrix.
`exact.test`	logical; if `FALSE`, an asymptotic likelihood ratio test of conditional independence is employed with mixed (i.e., continuous and discrete) data; if `TRUE` (default), an exact likelihood ratio test of conditional independence with mixed data is employed. See details below regarding this argument.
`use`	a character string defining the way in which calculations are done in the presence of missing values. It can be either `"complete.obs"` (default) or `"em"`.
`tol`	maximum tolerance controlling the convergence of the EM algorithm employed when the argument `use="em"`.
`R.code.only`	logical; if FALSE then the faster C implementation is used (default); if TRUE then only R code is executed.
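
The orientation rule described for `long.dim.are.variables` can be sketched in base R as follows. The helper `varsInColumns()` is hypothetical and not part of qpgraph; it only illustrates which dimension ends up being treated as the set of random variables under the description above.

```r
## Hypothetical helper (NOT a qpgraph function): returns the data with
## random variables in columns, following the rule described for the
## 'long.dim.are.variables' argument.
varsInColumns <- function(X, long.dim.are.variables=TRUE) {
  X <- as.matrix(X)
  ## if the longer dimension defines the variables and that dimension
  ## is the rows, transpose so that variables end up in the columns
  if (long.dim.are.variables && nrow(X) > ncol(X))
    X <- t(X)
  X
}

X <- matrix(rnorm(400), nrow=100, ncol=4)  ## 100 observations, 4 variables

dim(varsInColumns(X, long.dim.are.variables=TRUE))   ## 4 100: rows taken as variables
dim(varsInColumns(X, long.dim.are.variables=FALSE))  ## 100 4: columns are variables
```

With more observations than variables, as in this toy matrix, the default `TRUE` would treat the 100 rows as variables, which is why such data call for `long.dim.are.variables=FALSE`.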

When variables in i, j and Q are continuous and I=NULL, this function performs a conditional independence test using a t-test for a zero partial regression coefficient (Lauritzen, 1996, pg. 150). Note that the size q of possible Q sets should be in the range 1 to min(p,n-3), where p is the number of variables and n the number of observations. The computational cost increases linearly with the number of variables in Q.
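
As an illustration of the statistic involved, the following base R sketch computes a t-test for a zero partial regression coefficient with `lm()`. It is not the qpgraph implementation; the simulated variables `x1` to `x4` and the choice of regressing `x3` on `x4` are assumptions made only for this example.

```r
set.seed(123)
n <- 100  ## number of observations

## simulated continuous data (illustrative only): x3 and x4 depend on
## x1 and x2 but not directly on each other
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
x3 <- 0.5 * x1 + 0.5 * x2 + rnorm(n)
x4 <- 0.5 * x1 + 0.5 * x2 + rnorm(n)

## regress variable i (x3) on variable j (x4) and on Q = {x1, x2}
fit <- lm(x3 ~ x4 + x1 + x2)
coefs <- summary(fit)$coefficients

q <- 2                                ## size of the conditioning set
betaHat <- coefs["x4", "Estimate"]    ## estimated partial regression coefficient
tStat <- coefs["x4", "t value"]       ## t-statistic for H0: coefficient is zero
dfT <- n - q - 2                      ## degrees of freedom of the t-statistic
pVal <- 2 * pt(abs(tStat), df=dfT, lower.tail=FALSE)  ## two-sided p-value
```

The residual degrees of freedom of the fitted model, `df.residual(fit)`, equal `n-q-2` because the regression estimates an intercept, one coefficient for the tested variable and q coefficients for the conditioning set.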

When variables in i, j and Q are continuous and discrete (mixed data), indicated with the I argument when X is a matrix, then mixed graphical model theory (Lauritzen and Wermuth, 1989) is employed and, concretely, it is assumed that data come from a homogeneous conditional Gaussian distribution. By default, with exact.test=TRUE, an exact likelihood ratio test for conditional independence is performed (Lauritzen, 1996, pg. 192-194; Tur, Roverato and Castelo, 2014); otherwise an asymptotic one is used.

In this setting, further restrictions on the maximum value of q apply; concretely, it cannot be smaller than p plus the number of levels of the discrete variables involved in the marginal distributions employed by the algorithm.
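
The difference between the exact and the asymptotic likelihood ratio tests can be sketched in base R for the simplest mixed case: one continuous tested variable, one discrete tested variable and a continuous conditioning set. This is a sketch under those assumptions, not the qpgraph implementation; the variable names and the use of `lm()` residual sums of squares to obtain the likelihood ratio are choices made for this example only.

```r
set.seed(123)
n <- 200
z <- rnorm(n)                         ## continuous conditioning variable (Q)
d <- factor(sample(1:3, n, replace=TRUE))  ## discrete tested variable
y <- 0.5 * z + rnorm(n)               ## continuous tested variable, independent of d given z

m0 <- lm(y ~ z)       ## null model: y depends on Q only
m1 <- lm(y ~ z + d)   ## alternative model: y also depends on d

q <- 1                ## number of continuous conditioning variables
nLev <- nlevels(d)    ## number of levels of the discrete variable

## likelihood ratio expressed as a ratio of residual sums of squares
Lambda <- sum(resid(m1)^2) / sum(resid(m0)^2)

## asymptotic test: -n log Lambda is referred to a chi-square distribution
statAsym <- -n * log(Lambda)
pAsym <- pchisq(statAsym, df=nLev - 1, lower.tail=FALSE)

## exact test: under the null, Lambda follows a beta distribution with
## parameters (a, b); small values of Lambda are evidence against the null
a <- (n - q - nLev) / 2
b <- (nLev - 1) / 2
pExact <- pbeta(Lambda, a, b)

## partial eta-squared: fraction of the variance of y explained by d
## after excluding the variance explained by z
etaSq <- 1 - Lambda
```

In this simple setting the exact p-value coincides with the one from the classical F-test comparing the two nested models, `anova(m0, m1)`, which gives a quick way to check the beta parameters used above.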

A list with class "htest" containing the following components:

`statistic`	in case of pure continuous data and `I=NULL`, the t-statistic for zero partial regression coefficient; when `I!=NULL`, the value `Lambda` of the likelihood ratio if `exact.test=TRUE` and `-n log Lambda` otherwise.
`parameter`	in case of pure continuous data and `I=NULL`, the degrees of freedom for the t-statistic (`n-q-2`); when `I!=NULL`, the degrees of freedom for `-n log Lambda` of a chi-square distribution under the null hypothesis if `exact.test=FALSE` and the `(a, b)` parameters of a beta distribution under the null if `exact.test=TRUE`.
`p.value`	the p-value for the test.
`estimate`	in case of pure continuous data (`I=NULL`), the estimated partial regression coefficient. In case of mixed continuous and discrete data with `I!=NULL`, the estimated partial eta-squared: the fraction of variance from `i` or `j` explained by the other tested variable after excluding the variance explained by the variables in `Q`. If one of the tested variables `i` or `j` is discrete, then the partial eta-squared is calculated on the tested continuous variable. If both `i` and `j` are continuous, then the partial eta-squared is calculated on variable `i`.
`alternative`	a character string describing the alternative hypothesis.
`method`	a character string indicating what type of conditional independence test was performed.
`data.name`	a character string giving the name(s) of the random variables involved in the conditional independence test.

R. Castelo and A. Roverato

Castelo, R. and Roverato, A. A robust procedure for Gaussian graphical model search from microarray data with p larger than n, J. Mach. Learn. Res., 7:2621-2650, 2006.

Lauritzen, S.L. Graphical models. Oxford University Press, 1996.

Lauritzen, S.L. and Wermuth, N. Graphical models for associations between variables, some of which are qualitative and some quantitative. Ann. Stat., 17(1):31-57, 1989.

Tur, I., Roverato, A. and Castelo, R. Mapping eQTL networks with mixed graphical Markov models. Genetics, 198:1377-1393, 2014.

`qpCov`, `qpNrr`, `qpEdgeNrr`

library(qpgraph) ## provides qpCItest() and qpG2Sigma()
library(mvtnorm) ## provides rmvnorm()

nObs <- 100 ## number of observations to simulate

## the following adjacency matrix describes an undirected graph
## where vertex 3 is conditionally independent of 4 given 1 AND 2
A <- matrix(c(FALSE,  TRUE,  TRUE,  TRUE,
              TRUE,  FALSE,  TRUE,  TRUE,
              TRUE,   TRUE, FALSE, FALSE,
              TRUE,   TRUE, FALSE, FALSE), nrow=4, ncol=4, byrow=TRUE)
Sigma <- qpG2Sigma(A, rho=0.5)

X <- rmvnorm(nObs, sigma=as.matrix(Sigma))

qpCItest(X, i=3, j=4, Q=1, long.dim.are.variables=FALSE)

qpCItest(X, i=3, j=4, Q=c(1,2), long.dim.are.variables=FALSE)