# R/Oracle.GBH.R In structSSI: Multiple Testing for Hypotheses with Hierarchical or Group Structure

#### Documented in Oracle.GBH

```  # This function performs the Group Benjamini-Hochberg procedure at level alpha.
#
# Input: 1) unadjp - [vector numerics between 0 and 1] - The unadjusted p-values that have been computed by testing
# multiple hypotheses tests. This is the same as what one would input into the
# multtest function mt.rawp2adjp(), for example.
# 2) group.index - [same as other groups vectors] - A vector that contains the group labeling for each hypothesis where
# the index of each hypothesis is the same as the index used for the unadjusted
# p-values above. For example, if the first 3 hypotheses were in group 1 and
# the last 2 were in group 2, then we would input (1, 1, 1, 2, 2) as the ``groups''
# vector.
# 3) pi.groups - [vector numerics between 0 and 1, length = # groups] - This is the
# vector of proportion of true null hypotheses that corresponds
# to each group. In the adaptive procedure, these proportions are estimated from
# the data. For example, if the hypotheses were divided into two groups, and
# we found that the proportion of true null hypotheses in the first group was 0.9
# and the proportion of true null hypotheses in the second group was 0.6, then we
# would input pi.groups = c(0.9, 0.6). This is the aspect of the Group-Benjamini
# Hochberg procedure that gives it more information to operate on than the
# standard BH procedure: We are able to direct attention to those hypotheses
# that are in groups where the proportion of true null hypotheses is estimated
# to be low.
#
# Output: 1) GBH.results - A list including (a) the sorted adjusted p-values and (b) the
# indices of the original hypotheses /uadjusted p-values that these adjusted p-values
# correspond to (c) the hypotheses that were rejected, labeled by their original
# unadj.p indexing, and (d) the hypotheses that were not rejected, also labeled
# by their original unadj.p indexing.

Oracle.GBH <- function(unadj.p, group.index, pi.groups, alpha = 0.05){
if(!all(group.index %in% names(pi.groups))) {
stop('Names of pi.groups vector must match
the elements of groups vector.')
}
stop('Length of p values vector does not match
group indexing vector.')
}

names.sort <- sort(names(pi.groups))
n_g <- table(group.index)
pi0 <- 1/N * sum(n_g[names.sort] * pi.groups[names.sort])

# The first part of the procedure involves weighting p-values. This is where the
# known group structure information is being explicitly accounted for.
pi.groups.match <- pi.groups[as.character(group.index)]
p.weighted <- unadj.p * (pi.groups.match / (1 - pi.groups.match))

# The second part of the procedure is exactly like Benjamini-Hochberg in that it
# is a step-up procedure where we compared ordered p-values to some constant factor
# times alpha, where the constant is determined by the position of the p-value in
# the ordered list.

sorting.weighted.p <- sort(p.weighted, index.return = TRUE)
p.weighted <- sorting.weighted.p\$x
p.weighted.index <- sorting.weighted.p\$ix

if(pi0 < 1) {
adjp.temp <- N * (1 - pi0) * p.weighted / 1 : N
} else {
}

'group' = group.index[p.weighted.index],