False overlapped-cluster rate (FOCR) control procedures
focr_initial(
data,
data_corr,
scale,
blocks,
nblocks = ncol(data),
mu = 0,
alpha = 0.05,
verbose = FALSE,
side = c("two", "left", "right"),
...
)
focr(
data,
block_size,
alpha = 0.05,
fdr_method = c("BH", "LAWS", "SABHA", "BY"),
bandwidth = if (missing(block_size)) { NA } else { block_size/2 },
initial_filter = 0.9,
dimension = NULL,
distance_measure = c("euclidean", "lmax", "manhattan"),
side = c("two", "left", "right"),
verbose = FALSE,
blocks,
...
)
data
an n-by-p numerical matrix (no missing values) with
data_corr
the correlation matrix of
scale
numerical vector of standard deviations by column; default is missing (use empirical standard deviation)
blocks
a list of indices or a function that returns indices
nblocks
the total number of blocks, used when
mu
the mean function value to compare with; see 'Details'
alpha
FOCR level for stage-I, and FDR level for stage-II
verbose
whether to print out information; default is false
side
test type,
...
passed to
block_size
block size of sliding window; used by
fdr_method
characters or function of post-selection FDR control procedures. Built-in choices are
bandwidth
used by
initial_filter
used by
dimension
the dimension information of input hypotheses. For
distance_measure
distance measure used to form blocks; see 'Details'.
The functions focr and focr_initial control the type-I error for multiple testing problems with topological constraints:
H_{0}(s): f(s) = \mu(s), \quad H_{1}(s): f(s) \neq \mu(s)
The type-I error control procedure has two stages. In the first stage, the FOCR is controlled at the block (overlapped-cluster) level. This step finds regions of interest that respect the topological constraints. The second stage further inspects the hypotheses rejected by the first stage. During this stage, conditional p-values are calculated in a post-selection fashion. FDR control methods are then applied to these conditional p-values to select significant hypotheses at the individual level.
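The second stage can be illustrated with a minimal base-R sketch. It assumes the conditional p-values and the set of hypotheses selected in stage-I are already available; the numbers below are made up, and the built-in fdr_method procedures in focr may work differently from the plain p.adjust call used here.

```r
# Hypothetical inputs (made up for illustration): p-values for 10 hypotheses
# and the indices that survived a stage-I screening.
cond_pvals <- c(0.001, 0.20, 0.004, 0.03, 0.60, 0.012, 0.45, 0.002, 0.9, 0.30)
stage1_rejected <- c(1, 3, 4, 6, 8, 10)
alpha <- 0.05

# Benjamini-Hochberg adjustment restricted to the selected hypotheses
adjusted <- p.adjust(cond_pvals[stage1_rejected], method = "BH")

# Final rejections: selected hypotheses whose adjusted p-value is <= alpha
final_rejected <- stage1_rejected[adjusted <= alpha]
final_rejected
```

Hypotheses that were not selected in the first stage never enter the adjustment, which is why the second stage operates on a smaller set of p-values.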
The function μ(s) is specified in mu. By default the alternative hypothesis is two-sided. For one-sided tests, set the parameter side to either "left" or "right".
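To make the role of mu and side concrete, here is a small base-R sketch of one-sample z-type statistics and their p-values. The helper sided_pvalue is hypothetical and not part of focr. The sketch assumes that "left" means the alternative f(s) < μ(s) and "right" means f(s) > μ(s); the statistic focr actually uses may differ.

```r
# Hypothetical helper (not part of focr): p-value of a z-statistic.
# Assumption: "left" tests f(s) < mu(s), "right" tests f(s) > mu(s).
sided_pvalue <- function(z, side = c("two", "left", "right")) {
  side <- match.arg(side)
  switch(side,
    two   = 2 * pnorm(-abs(z)),
    left  = pnorm(z),
    right = pnorm(z, lower.tail = FALSE)
  )
}

# Simulated n-by-p data: 50 observations of 4 hypotheses
set.seed(1)
x <- matrix(rnorm(50 * 4), nrow = 50)
mu <- 0

# Column-wise statistic comparing the empirical mean with mu
z <- (colMeans(x) - mu) / (apply(x, 2, sd) / sqrt(nrow(x)))

p_two   <- sided_pvalue(z, "two")
p_left  <- sided_pvalue(z, "left")
p_right <- sided_pvalue(z, "right")
```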
The function focr_initial controls the FOCR on the block level (stage-I), and calculates the conditional p-values. The function focr uses focr_initial, providing default block settings and built-in post-selection inference on conditional p-values.
By default, focr uses sliding windows as blocks. Each block is a ball whose radius (the distance between the center point and the boundary) is block_size/2. The distance measure is specified by distance_measure; the choices are "euclidean", "lmax", and "manhattan". These default settings should work in many spatial or temporal situations. To customize the blocks, specify blocks manually. The argument blocks can be either a list of hypothesis indices, or a function that returns such indices given the locations of hypotheses. See the vignette: vignette('false-overlapped-cluster-rate', package='focr').
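The ball-shaped blocks described above can be sketched in base R. The helper ball_block is hypothetical and is not the focr implementation. It assumes hypotheses sit on a regular grid whose size is given by dimension, that they are indexed in R's column-major order, and that the radius plays the role of block_size/2.

```r
# Hypothetical helper (not part of focr): indices of the ball-shaped block
# centred at hypothesis `index` on a regular grid of size `dimension`.
ball_block <- function(index, dimension, radius,
                       distance_measure = c("euclidean", "lmax", "manhattan")) {
  distance_measure <- match.arg(distance_measure)
  # Coordinates of every grid point, one row per hypothesis (column-major)
  coords <- as.matrix(expand.grid(lapply(dimension, seq_len)))
  # Absolute coordinate offsets from the centre point
  delta <- abs(sweep(coords, 2, coords[index, ]))
  dist <- switch(distance_measure,
    euclidean = sqrt(rowSums(delta^2)),
    lmax      = apply(delta, 1, max),
    manhattan = rowSums(delta)
  )
  unname(which(dist <= radius))
}

# 1D sliding window: 10 hypotheses, radius 2
ball_block(5, dimension = 10, radius = 2)

# The same blocks written as a list of hypothesis indices
block_list <- lapply(seq_len(10), ball_block, dimension = 10, radius = 2)

# 2D grid (5 x 5), centre point at index 13, i.e. row 3, column 3
ball_block(13, dimension = c(5, 5), radius = 1, distance_measure = "manhattan")
ball_block(13, dimension = c(5, 5), radius = 1, distance_measure = "lmax")
```

With radius 1 on the 2D grid, the "manhattan" ball is a plus-shaped block of five points, while the "lmax" ball is the full 3-by-3 square. This shows how the choice of distance measure changes which neighbours share a block.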
A list of results
method
method name
alpha
level of significance: FOCR in the stage-I and FDR in the stage-II
side
passed from input
blocks
function that returns indices of blocks
nblocks
number of total blocks
rej_blocks
blocks being rejected
rej_hypotheses
individual hypotheses rejected in the first stage
tau
p-value cutoff in the first stage
cond_pvals
conditional p-values in the stage-II
uncond_pvals
unconditional p-values
details
details of initial rejections
stats
block-level test statistics and p-values
The following additional items are returned by focr only.
post_selection
a list returned by FDR controlling methods, see also fdr-controls
fdr_method
function used to control the FDR in stage-II
block_size
block size if specified, passed from input
library(focr)
set.seed(100)
generator <- simulation_data_1D(n_points = 1000, mu_type = 'step',
cov_type = 'AR')
data <- generator$gen_data(snr = 0.34)
plot(generator, data = data, snr = 0.34)
# -------------------- Basic usage -------------------------
# FOCR-BH procedure
res <- focr(data = data, block_size = 41,
alpha = 0.05, fdr_method = 'BH')
# False discovery proportion
fdp <- fdp(res$post_selection$rejs, generator$support)
fdp
# Statistical power
power <- pwr(res$post_selection$rejs, generator$support)
power
# Visualize
plot(generator$mu, type = 'l', col = 'red', ylim = c(-.5,1.5),
main = sprintf('FOCR-BH, FDP=%.1f%%, Power=%.1f%%',
fdp*100, power * 100))
lines(res$cond_pvals, col = 'gray')
abseg(res$rej_hypotheses, y = -0.3, col = 'orange3', lwd = 2)
abseg(res$post_selection$rejs, y = -0.5, col = 'blue', lwd = 2)
legend('topleft', c("Underlying signal", "Conditional p-values",
"FOCR initial clusters", "FOCR-BH final rejections"),
col = c('red', 'gray', 'orange3', 'blue'), lty = 1, cex = 0.7)
# ------------------------- Change FDR methods --------------------
# FOCR-LAWS
res <- focr(data = data, block_size = 41,
alpha = 0.05, fdr_method = 'LAWS',
initial_filter = 0.5)
fdp <- fdp(res$post_selection$rejs, generator$support)
fdp
power <- pwr(res$post_selection$rejs, generator$support)
power
# Visualize
plot(generator$mu, type = 'l', col = 'red', ylim = c(-.5,1.5),
main = sprintf('FOCR-LAWS, FDP=%.1f%%, Power=%.1f%%',
fdp*100, power * 100))
lines(res$cond_pvals, col = 'gray')
abseg(res$rej_hypotheses, y = -0.3, col = 'orange3', lwd = 2)
abseg(res$post_selection$rejs, y = -0.5, col = 'blue', lwd = 2)
legend('topleft', c("Underlying signal", "Conditional p-values",
"FOCR initial clusters", "FOCR-LAWS final rejections"),
col = c('red', 'gray', 'orange3', 'blue'), lty = 1, cex = 0.7)
# ------------------------- Customized blocks --------------------
# The following example uses disjoint blocks; each block has length of 40
res <- focr(data = data, alpha = 0.05, fdr_method = 'LAWS',
initial_filter = 0.5, blocks = function(index){
# Disjoint blocks with size 40
floor((index -1)/40) * 40 + seq_len(40)
}, bandwidth = 20)
# Compared to overlapped blocks, disjoint blocks are less powerful.
# However, they might be useful provided the underlying topological
# structure is disjoint
fdp <- fdp(res$post_selection$rejs, generator$support)
fdp
power <- pwr(res$post_selection$rejs, generator$support)
power
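The false discovery proportion and statistical power reported in the examples can be sketched in base R as follows. The helpers fdp_sketch and power_sketch are hypothetical stand-ins, not the focr functions fdp and pwr, and they assume that both the rejections and the true signal support are given as vectors of hypothesis indices.

```r
# Hypothetical helpers (not the focr implementations of fdp() and pwr()).
# Assumption: `rejections` and `support` are vectors of hypothesis indices.
fdp_sketch <- function(rejections, support) {
  # Share of rejections that fall outside the true signal support
  sum(!rejections %in% support) / max(length(rejections), 1)
}

power_sketch <- function(rejections, support) {
  # Share of true signals that were rejected
  sum(support %in% rejections) / max(length(support), 1)
}

# Made-up example: 4 rejections, 4 true signals, 2 in common
rejections <- c(2, 3, 4, 9)
support <- c(3, 4, 5, 6)

fdp_value <- fdp_sketch(rejections, support)
power_value <- power_sketch(rejections, support)
```

Storing the results under names such as fdp_value avoids reusing the name of a function for its own result, which keeps later calls easier to read.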