multiple.tune.twosample: Model selection for two-sample multiple testing


View source: R/multiple.tune.twosample.R

Description

The function multiple.tune.twosample selects the optimal tuning parameter for the multiple testing problem H_0: θ_{r,t,1}-θ_{r,t,2}=0 for all 1 ≤ r < t ≤ p.

Usage

multiple.tune.twosample(dat, B, verbose = FALSE)

Arguments

dat

A list of length 2, consisting of the two data matrices generated from the two populations. The data matrix from the k-th population should be of dimension n_k x p for k=1,2; a minimal sketch of the expected structure follows the argument list.

B

The set of candidate integer multipliers.

verbose

Whether to print out intermediate iteration steps. Default is FALSE.
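A minimal sketch of how the arguments fit together, assuming two binary data matrices X1 (of dimension n_1 x p) and X2 (of dimension n_2 x p) are already available (see the Examples for a full simulation):

dat <- list(X1, X2)   # element k holds the n_k x p matrix from population k
tune <- multiple.tune.twosample(dat, B = 1:20, verbose = TRUE)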

Details

The false discovery rate control in the multiple testing problem

H_0: θ_{r,t,1}-θ_{r,t,2}=0, 1≤ r < t ≤ p,

is based on approximating the number of false discoveries by {2-2Φ(t)}*p*(p-1)/2. The optimal tuning parameter is therefore chosen so that this approximation error is as small as possible. The candidate tuning parameters themselves are constructed in a data-driven way, so the user only needs to supply a candidate set of integer multipliers. Details on how the approximation error is defined are available in Xia, Cai and Cai (2015).
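As an illustration of the quantity being approximated (a sketch, not code from the package), the approximate number of false discoveries at a threshold t can be computed directly:

approx.fd <- function(t, p) (2 - 2*pnorm(t)) * p * (p - 1) / 2
approx.fd(t = 3, p = 50)   # about 3.3 expected false discoveries among 1225 pairs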

Value

error

The empirical version of the approximation error, evaluated over the range of integer multipliers in B.

b

The optimal integer multiplier.
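The two components are typically used together: the error curve can be inspected over the candidate set, and the selected multiplier is then used to scale the penalty parameters, as in the Examples below. A minimal sketch, assuming the candidate set B = 1:20 so that error has one entry per candidate:

tune <- multiple.tune.twosample(dat, 1:20)
plot(1:20, tune$error, type = "b",
     xlab = "integer multiplier", ylab = "empirical approximation error")
tune$b   # the multiplier with the smallest empirical approximation error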

Author(s)

Jing Ma

References

Xia, Y., Cai, T., & Cai, T. T. (2015). Testing differential networks with applications to the detection of gene-gene interactions. Biometrika, 102(2), 247-266.

See Also

testTwoBMN, multiple.tune.onesample

Examples

library(TestBMN)  # provides BMN.samples, multiple.tune.twosample, testTwoBMN
library(glmnet)
library(igraph)   # sample_pa(), as_adjacency_matrix()
library(gdata)    # upperTriangle()

set.seed(1)

p = 50    # number of variables
n = 100   # number of observations per replicate
n0 = 1000 # burn-in period for the sampler
rho_high = 0.5  # signal strength 
rho_low = 0.1   # signal strength 
ncond = 2       # number of conditions to compare
eps = 8/n       # tolerance for nearly constant (extremely imbalanced) variables
q = (p*(p - 1))/2

##---(1) Generate the network  
g_sf = sample_pa(p, directed=FALSE)
Amat = as.matrix(as_adjacency_matrix(g_sf, type="both"))

##---(2) Generate the Theta 
Theta = vector("list", ncond)
weights = matrix(0, p, p)
upperTriangle(weights) <- runif(q, rho_low, rho_high) * (2*rbinom(q, 1, 0.5) - 1)
weights = weights + t(weights)
Theta[[1]] = weights * Amat

##---(3) Generate the difference matrix
Delta <- matrix(0, p, p)
upperTriangle(Delta) <- runif(q, 1, 2) * sqrt(log(p)/n) *
                   (match(1:q, sample(q, q*0.1), nomatch = 0)>0)*(2*rbinom(q, 1, 0.5) - 1)
Delta <- Delta + t(Delta)

Theta[[2]] = Theta[[1]] + Delta
Theta[[1]] = Theta[[1]] - Delta

##---(4) Simulate data and choose tuning parameter
dat = vector("list", ncond)
lambda = vector("list", ncond)
for (k in 1:ncond){
  dat[[k]] = BMN.samples(Theta[[k]], n, n0, skip=1)
  tmp = sapply(1:p, function(i) as.numeric(table(dat[[k]][,i]))[1]/n )
  # Resample until no variable is nearly constant across observations.
  while(min(tmp) < eps || max(tmp) > 1 - eps){
    dat[[k]] = BMN.samples(Theta[[k]], n, n0, skip=10)
    tmp = sapply(1:p, function(i) as.numeric(table(dat[[k]][,i]))[1]/n )
  }
}

tune = multiple.tune.twosample(dat, 1:20, verbose = TRUE)
for (k in 1:ncond){
  empcov <- cov(dat[[k]])
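  # Scale the penalty parameters by the selected multiplier tune$b
  # (relative to the largest candidate multiplier, here 20).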
  lambda[[k]] = (tune$b/20) * sqrt( diag(empcov) * log(p)/n )
}

##---(5) Two-sample testing
test = testTwoBMN(dat, lambda, alpha.multi = 0.20)
