testOneBMN: One-sample testing of a binary Markov network
In jingmafdu/TestBMN: Testing for binary Markov networks

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/testOneBMN.R

testOneBMN tests whether a binary Markov network is empty, i.e. whether the corresponding partial correlation matrix Θ=0. In addition, testOneBMN simultaneously tests whether θ_{rt}=0 for each pair of edge (r,t) to recover the Markov network with false discovery rate control.

1	testOneBMN(X, lambda, alpha.global = 0.05, multiTest = TRUE, alpha.multi = 0.1)

`X`	The n x p data matrix.
`lambda`	The 1 x p vector of tuning parameters to be used in `BMN.logistic` for estimating an initial partial correlation matrix.
`alpha.global`	The significance level for global testing. Default is 0.05.
`multiTest`	Whether to conduct multiple testing of pairwise edges. Default is `TRUE`.
`alpha.multi`	The level of false discovery rate in multiple testing, necessary if `multiTest=TRUE`. Default is 0.10.

The function testOneBMN implements the one-sample testing of a binary Markov network. It calculates the standardized pairwise statistic W_{r,t} for each pair of edge (r,t) using the data X together with the tuning parameter lambda. The test statistic for the global test Θ=0 is

M_{n,p}=max_{1≤ r < t ≤ p} W_{r,t}^2.

The null Θ=0 is rejected if

M_{n,p}-4\log p+\log(\log p)≥ q_α,

where q_α is the 1-α quantile of the type I extreme value distribution with cumulative distribution function exp{-(8π)^{-1/2}e^{-z/2}}. To recover the underlying Markov network, testOneBMN considers the multiple testing problem

H_{0,r,t}: θ_{r,t}=0,1≤ r < t ≤ p.

The test statistic for each individual hypothesis H_{0,r,t} is W_{r,t}. The multiple testing procedure in testOneBMN rejects the null H_{0,r,t} if |W_{r,t}|>τ, where the threshold τ is carefully chosen to ensure false discovery rate control at the pre-specified level.

Note the tuning parameter lambda should be carefully chosen to ensure that the initial estimate of the partial correlation matrix is reasonably good. In particular, the optimal tuning parameters required in global testing and multiple testing may not be the same. See global.tune and multiple.tune.onesample for how to choose lambda. More details on the testing procedures are available in Cai et al. (2017+).

testOneBMN calls BMN.logistic to estimate an initial partial correlation matrix using nodewise \ell_1-regularized logistic regressions. As a result, testOneBMN returns a warning message if "one multinomial or binomial class has 1 or 0 observations" occurs in any of the p logisitc regressions. The user should double check the data matrix X to avoid such situations before applying testOneBMN again.

`statistic`	The test statistic.
`pvalue`	The p-value for testing the null Θ=0.
`reject`	Whether to reject the null Θ=0.
`W`	A p x p matrix consisting of the standardized pairwise statistics.
`thetaInit`	The initial estimated partial correlation matrix using `BMN.logistic`.
`theta`	The de-sparsified partial correlation matrix.
`network`	The estimated network with false discovery rate control at the pre-specified level `alpha.multi`.

Jing Ma

Cai, T. T., Li, H., Ma, J. & Xia, Y. (2017+). A testing framework for detecting differential micorbial networks.

testTwoBMN, global.tune, multiple.tune.onesample, BMN.logistic

library(glmnet)

set.seed(1)

p = 50    # number of variables
n = 100   # number of observations per replicate
n0 = 1000 # burn in tolerance
rho_high = 0.5  # signal strength 
rho_low = 0.1   # signal strength 
eps = 8/n       # tolerance for extreme proportioned observations
q = (p*(p - 1))/2

##---(1) Generate the network  
g_sf = sample_pa(p, directed=FALSE)
Amat = as.matrix(as_adjacency_matrix(g_sf, type="both"))

##---(2) Generate the Theta  
weights = matrix(0, p, p)
upperTriangle(weights) = runif(q, rho_low, rho_high) * (2*rbinom(q, 1, 0.5) - 1)
weights = weights + t(weights)
Theta = weights * Amat
dat = BMN.samples(Theta, n, n0, skip=1)
tmp = sapply(1:p, function(i) as.numeric(table(dat[,i]))[1]/n )
while(min(tmp)<eps || abs(1-max(tmp)<eps)){
  dat = BMN.samples(Theta, n, n0, skip=10)
  tmp = sapply(1:p, function(i) as.numeric(table(dat[,i]))[1]/n )
}

lambda = rep(0.05, p)
test1 = testOneBMN(dat, lambda)