TestBMN-package: Testing for binary Markov networks

Description Details Author(s) References Examples

Description

The TestBMN-package provides a testing framework for detecting differential binary Markov networks in the case of two-sample testing, or one binary Markov network in the case of one-sample testing.

Details

The DESCRIPTION file: This package was not yet installed at build time.
Index: This package was not yet installed at build time.

Binary Markov networks are commonly used to model the conditional independence relationships between the components of a random vector. Specifically, let X be a p-dimensional random vector. The pairwise relationships in X can be captured by an undirected graph G=(V,E), whose vertex set V={1,…,p} and edge set E corresponds to conditional dependence. The binary Markov network associated with G has the joint distribution

p(X) \propto \exp≤ft{∑_{(r,t)\in E} θ_{r,t} X_r X_t \right},

subject to a normalizing constant.

Given n_1 i.i.d. observations from one population and n_2 i.i.d. observations from another popultation, it is often of interest to test whether the two underlying Markov networks are the same. Let Θ_1 and Θ_2 be the two unknown partial correlation matrices from the two populations. It suffices to test

H_0: Θ_1=Θ_2 \mbox{ v.s. } H_1: Θ_1 \ne Θ_2.

If the global null H_0 is rejected, it becomes of interest to test

H_{0,r,t}: θ_{r,t,1}-θ_{r,t,2}=0 \mbox{ v.s. } H_{1,r,t}: θ_{r,t,1}-θ_{r,t,2}\ne 0, 1≤ r < t ≤ p,

so as to recover the differential network Θ_1 - Θ_2.

The TestBMN-package provides a rigorous framework for the above two-sample global and multiple testing problems via testTwoBMN. The global testing procedure in testTwoBMN is particularly powerful against alternatives where the differential network Θ_1 - Θ_2 is sparse. The multiple testing procedure testTwoBMN provides asymptotic control of the false discovery proportion and the false discovery rate under suitable conditions. In addition, one can similarly conduct one-sample global and multiple testing problems via testOneBMN, if the goal is to estimate a binary Markov network with false discovery rate control.

More details about the method implemented in the TestBMN-package are available in Cai et al. (2017+).

Author(s)

Jing Ma <crystal.jing.ma@gmail.com>

References

Cai, T. T., Li, H., Ma, J. & Xia, Y. (2018+). Differential Markov random fields analysis with applications to detecting differential micorbial communities.

Xia, Y., Cai, T., & Cai, T. T. (2015). Testing differential networks with applications to the detection of gene-gene interactions. Biometrika, 102(2), 247-266.

Ravikumar, P., Wainwright, M. J., & Lafferty, J. D. (2010). High-dimensional Ising model selection using l1-regularized logistic regression. The Annals of Statistics, 38(3), 1287-1319.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
library(glmnet)

set.seed(1)

p = 50    # number of variables
n = 100   # number of observations per replicate
n0 = 1000 # burn in tolerance
rho_high = 0.5  # signal strength 
rho_low = 0.1   # signal strength 
ncond = 2       # number of conditions to compare
eps = 8/n       # tolerance for extreme proportioned observations
q = (p*(p - 1))/2

##---(1) Generate the network  
g_sf = sample_pa(p, directed=FALSE)
Amat = as.matrix(as_adjacency_matrix(g_sf, type="both"))

##---(2) Generate the Theta 
Theta = vector("list", ncond)
weights = matrix(0, p, p)
upperTriangle(weights) = runif(q, rho_low, rho_high) * (2*rbinom(q, 1, 0.5) - 1)
weights = weights + t(weights)
Theta[[1]] = weights * Amat

##---(3) Generate the difference matrix
Delta = matrix(0, p, p)
upperTriangle(Delta) = runif(q, 1, 2)*sqrt(log(p)/n) * 
                   (match(1:q, sample(q, q*0.1), nomatch = 0)>0)*(2*rbinom(q, 1, 0.5) - 1)
Delta = Delta + t(Delta)

Theta[[2]] = Theta[[1]] + Delta
Theta[[1]] = Theta[[1]] - Delta

##---(4) Simulate data and choose tuning parameter
dat = vector("list", ncond)
lambda = vector("list", ncond)
for (k in 1:ncond){
  dat[[k]] = BMN.samples(Theta[[k]], n, n0, skip=1)
  tmp = sapply(1:p, function(i) as.numeric(table(dat[[k]][,i]))[1]/n )
  while(min(tmp)<eps || abs(1-max(tmp)<eps)){
    dat[[k]] = BMN.samples(Theta[[k]], n, n0, skip=10)
    tmp = sapply(1:p, function(i) as.numeric(table(dat[[k]][,i]))[1]/n )
  }
}

tune = multiple.tune.twosample(dat, 1:20, verbose = TRUE)
for (k in 1:ncond){
  empcov <- cov(dat[[k]])
  lambda[[k]] = (tune$b/20) * sqrt( diag(empcov) * log(p)/n )
}

##---(5) Two-sample testing
test = testTwoBMN(dat, lambda, alpha.multi = 0.20)

jingmafdu/TestBMN documentation built on Feb. 20, 2022, 5:24 p.m.