multiple.tune.twosample: Model selection for two-sample multiple testing


View source: R/multiple.tune.twosample.R

Description

The function multiple.tune.twosample selects the optimal tuning parameter for the multiple testing problem H_0: θ_{r,t,1}-θ_{r,t,2}=0 for all 1 ≤ r < t ≤ p.

Usage

multiple.tune.twosample(dat, B, verbose = FALSE)

Arguments

dat

A list of length 2, consisting of the two data matrices generated from the two populations. The data matrix from the k-th population should be of dimension n_k x p for k=1,2; a minimal sketch of the expected structure follows the argument list.

B

The set of candidate integer multipliers.

verbose

Whether to print out intermediate iteration steps. Default is FALSE.
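A minimal sketch of how the arguments fit together, assuming two binary data matrices X1 (of dimension n_1 x p) and X2 (of dimension n_2 x p) are already available (see the Examples for a full simulation):

dat <- list(X1, X2)   # element k holds the n_k x p matrix from population k
tune <- multiple.tune.twosample(dat, B = 1:20, verbose = TRUE)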

Details

The false discovery rate control in the multiple testing problem

H_0: θ_{r,t,1}-θ_{r,t,2}=0, 1≤ r < t ≤ p,

is based on approximating the number of false discoveries by {2-2Φ(t)}*p*(p-1)/2. The optimal tuning parameter is therefore chosen so that this approximation error is as small as possible. The candidate tuning parameters themselves are constructed in a data-driven way, so the user only needs to supply a candidate set of integer multipliers. Details on how the approximation error is defined are available in Xia, Cai and Cai (2015).
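As an illustration of the quantity being approximated (a sketch, not code from the package), the approximate number of false discoveries at a threshold t can be computed directly:

approx.fd <- function(t, p) (2 - 2*pnorm(t)) * p * (p - 1) / 2
approx.fd(t = 3, p = 50)   # about 3.3 expected false discoveries among 1225 pairs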

Value

error

The empirical version of the approximation error, evaluated over the range of integer multipliers in B.

b

The optimal integer multiplier.
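The two components are typically used together: the error curve can be inspected over the candidate set, and the selected multiplier is then used to scale the penalty parameters, as in the Examples below. A minimal sketch, assuming the candidate set B = 1:20 so that error has one entry per candidate:

tune <- multiple.tune.twosample(dat, 1:20)
plot(1:20, tune$error, type = "b",
     xlab = "integer multiplier", ylab = "empirical approximation error")
tune$b   # the multiplier with the smallest empirical approximation error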

Author(s)

Jing Ma

References

Xia, Y., Cai, T., & Cai, T. T. (2015). Testing differential networks with applications to the detection of gene-gene interactions. Biometrika, 102(2), 247-266.

See Also

testTwoBMN, multiple.tune.onesample

Examples

library(TestBMN)  # provides BMN.samples, multiple.tune.twosample, testTwoBMN
library(glmnet)
library(igraph)   # sample_pa(), as_adjacency_matrix()
library(gdata)    # upperTriangle()

set.seed(1)

p = 50    # number of variables
n = 100   # number of observations per replicate
n0 = 1000 # burn-in period for the sampler
rho_high = 0.5  # signal strength 
rho_low = 0.1   # signal strength 
ncond = 2       # number of conditions to compare
eps = 8/n       # tolerance for nearly constant (extremely imbalanced) variables
q = (p*(p - 1))/2

##---(1) Generate the network  
g_sf = sample_pa(p, directed=FALSE)
Amat = as.matrix(as_adjacency_matrix(g_sf, type="both"))

##---(2) Generate the Theta 
Theta = vector("list", ncond)
weights = matrix(0, p, p)
upperTriangle(weights) <- runif(q, rho_low, rho_high) * (2*rbinom(q, 1, 0.5) - 1)
weights = weights + t(weights)
Theta[[1]] = weights * Amat

##---(3) Generate the difference matrix
Delta <- matrix(0, p, p)
upperTriangle(Delta) <- runif(q, 1, 2) * sqrt(log(p)/n) *
                   (match(1:q, sample(q, q*0.1), nomatch = 0)>0)*(2*rbinom(q, 1, 0.5) - 1)
Delta <- Delta + t(Delta)

Theta[[2]] = Theta[[1]] + Delta
Theta[[1]] = Theta[[1]] - Delta

##---(4) Simulate data and choose tuning parameter
dat = vector("list", ncond)
lambda = vector("list", ncond)
for (k in 1:ncond){
  dat[[k]] = BMN.samples(Theta[[k]], n, n0, skip=1)
  tmp = sapply(1:p, function(i) as.numeric(table(dat[[k]][,i]))[1]/n )
  # Resample until no variable is nearly constant across observations.
  while(min(tmp) < eps || max(tmp) > 1 - eps){
    dat[[k]] = BMN.samples(Theta[[k]], n, n0, skip=10)
    tmp = sapply(1:p, function(i) as.numeric(table(dat[[k]][,i]))[1]/n )
  }
}

tune = multiple.tune.twosample(dat, 1:20, verbose = TRUE)
for (k in 1:ncond){
  empcov <- cov(dat[[k]])
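  # Scale the penalty parameters by the selected multiplier tune$b
  # (relative to the largest candidate multiplier, here 20).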
  lambda[[k]] = (tune$b/20) * sqrt( diag(empcov) * log(p)/n )
}

##---(5) Two-sample testing
test = testTwoBMN(dat, lambda, alpha.multi = 0.20)
