This function runs the complete union knockoff procedure: it generates multiple knockoff matrices, computes the score statistics and the selection set for each knockoff run, and aggregates the selection sets by taking their union to obtain the final selection set.
X | n x p matrix or data frame of original variables.
y | response vector of length n.
knockoffs | function for the knockoff construction. It must take the n x p matrix as input and return an n x p knockoff matrix. Either choose a knockoff sampler of the
statistic | function that computes the score vector W of length p. It must take the data matrix, the knockoff matrix and the response vector as input and return a vector of computed scores. Either choose one score statistic from the
qk | sequence of nominal levels. Either choose
q | nominal level for the FDR control. Default: 0.2.
K | number of knockoff runs. Default: 5.
q_seq | manual sequence of nominal levels, which must match the number of knockoff runs in length.
offset | either 0 (knockoff) or 1 (knockoff+). Default: 1.
sets | logical; whether the K selection sets of the individual knockoff runs should be returned. Default:
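To make the roles of q and offset concrete, here is a minimal base-R sketch of the standard knockoff threshold rule. It assumes the usual definition: a single run selects the variables whose score W_j is at least the smallest t satisfying (offset + #{j: W_j <= -t}) / max(1, #{j: W_j >= t}) <= q. The helper name threshold_sketch and the toy score vector are hypothetical and are not part of this function's interface; the knockoff package ships its own knockoff.threshold for this purpose.

```r
# Hypothetical helper: threshold for knockoff (offset = 0) or knockoff+ (offset = 1)
threshold_sketch <- function(W, q = 0.2, offset = 1) {
  stopifnot(offset %in% c(0, 1))
  # Candidate thresholds: zero and the magnitudes of all scores
  ts <- sort(c(0, abs(W)))
  # Estimated false discovery proportion at each candidate threshold
  ratio <- vapply(
    ts,
    function(t) (offset + sum(W <= -t)) / max(1, sum(W >= t)),
    numeric(1)
  )
  ok <- which(ratio <= q)
  # Smallest admissible threshold, or Inf if none exists (nothing is selected)
  if (length(ok) > 0) ts[ok[1]] else Inf
}

# Made-up scores for ten variables
W_toy <- c(5, 4, 3, 2, 1, -0.5, 6, 7, 8, 9)

tau_plus <- threshold_sketch(W_toy, q = 0.2, offset = 1)  # knockoff+
tau_ko   <- threshold_sketch(W_toy, q = 0.2, offset = 0)  # knockoff

selected_plus <- which(W_toy >= tau_plus)
selected_ko   <- which(W_toy >= tau_ko)
```

With offset = 1 the estimated proportion is more conservative, so the threshold can only be equal to or larger than the one obtained with offset = 0.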
This function requires the installation of the knockoff package prior to its execution.
The default knockoff sampler create.second_order is the second-order Gaussian knockoff construction from the knockoff package.
The default score function stat.glmnet_coefdiff is from the knockoff package.
It fits a Lasso regression in which the regularization parameter λ is tuned by cross-validation.
The score for the jth variable is then computed as the difference
W_j = |Z_j| - |\tilde{Z}_j|,
where Z_j and \tilde{Z}_j are the coefficient estimates for the jth variable and its knockoff, respectively.
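The score definition can be illustrated with a short base-R sketch. It assumes a coefficient vector of length 2p in which the first p entries belong to the original variables and the last p entries to their knockoffs; the numbers are made up and do not come from an actual Lasso fit.

```r
p <- 4

# Hypothetical Lasso coefficients for the augmented design [X, X_knockoff]
coefs <- c(1.2, 0, -0.8, 0.1,    # original variables
           0.1, 0,  0,  -0.4)    # corresponding knockoffs

Z       <- abs(coefs[1:p])               # |Z_j|
Z_tilde <- abs(coefs[(p + 1):(2 * p)])   # |\tilde{Z}_j|

# Coefficient-difference score: a large positive value suggests that the
# original variable matters more than its knockoff copy
W <- Z - Z_tilde
```

A score of zero (second variable) means neither the variable nor its knockoff entered the model, and a negative score (fourth variable) means the knockoff received the larger coefficient.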
The user must specify either qk together with q, to apply one of the pre-defined sequences of nominal levels, or q_seq, to supply a custom sequence of nominal levels.
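The two ways of supplying nominal levels can be sketched as follows. This is only an illustration: the helper resolve_levels_sketch is hypothetical, and the equal split q / K is an assumed example of a level sequence, not necessarily one of the pre-defined qk options.

```r
# Hypothetical helper: return one nominal level per knockoff run
resolve_levels_sketch <- function(K, q = 0.2, q_seq = NULL) {
  if (is.null(q_seq)) {
    # Illustrative choice only: split the overall level q equally across runs
    return(rep(q / K, K))
  }
  # A custom sequence must provide exactly one level per run
  if (length(q_seq) != K) {
    stop("q_seq must have one nominal level per knockoff run")
  }
  q_seq
}

levels_default <- resolve_levels_sketch(K = 5, q = 0.2)
levels_custom  <- resolve_levels_sketch(K = 3, q_seq = c(0.1, 0.05, 0.05))
```

A sequence whose length differs from K is rejected, which mirrors the requirement that q_seq match the number of knockoff runs.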
A list containing the following components:
Shat | aggregated selection set.
K | number of knockoff runs.
FDRbound | theoretical FDR bound.
sets | if specified, individual selection sets of each knockoff run.
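The aggregation step and the shape of the returned list can be sketched in base R. The helper aggregate_union_sketch and the toy selection sets are hypothetical. Taking the bound as the sum of the per-run nominal levels is an assumption based on the union knockoff result of Xie and Lederer (2021), not a statement about this function's internals.

```r
# Hypothetical helper: union of K selection sets plus summary information
aggregate_union_sketch <- function(sets, q_seq, return_sets = FALSE) {
  stopifnot(length(sets) == length(q_seq))
  res <- list(
    Shat     = sort(Reduce(union, sets)),  # variables selected in at least one run
    K        = length(sets),               # number of knockoff runs
    FDRbound = sum(q_seq)                  # assumed bound: sum of the run levels
  )
  # Individual sets are attached only on request
  if (return_sets) res$sets <- sets
  res
}

# Made-up selection sets from three knockoff runs (the third run selects nothing)
toy_sets <- list(c(1, 4, 7), c(4, 9), integer(0))

res <- aggregate_union_sketch(toy_sets, q_seq = rep(0.2 / 3, 3), return_sets = TRUE)
```

Because the union keeps every variable selected in any run, a run that selects nothing leaves the aggregated set unchanged.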
Xie and Lederer (2021). Aggregating Knockoffs for False Discovery Rate Control with an Application to Gut Microbiome Data. Entropy 23(2), 230. https://www.mdpi.com/1099-4300/23/2/230/xml
# Load the knockoff package, which provides create.second_order
# (the package providing run.uKO must be attached as well)
library(knockoff)

# Simulate a sparse linear model with Toeplitz-correlated predictors
n <- 400; p <- 200; s_0 <- 30
amplitude <- 1; mu <- rep(0,p); rho <- 0.25
Sigma <- toeplitz(rho^(0:(p-1)))
X <- MASS::mvrnorm(n, mu, Sigma)
nonzero <- sample(p, s_0)
beta <- amplitude * (1:p %in% nonzero)
y <- X %*% beta + rnorm(n)

# Basic usage with default arguments
res.uKO <- run.uKO(X, y, sets = TRUE)
res.uKO

# Advanced usage with customized knockoff construction (equi-correlated)
equi.knock <- function(X) create.second_order(X, method = "equi")
res.uKO <- run.uKO(X, y, knockoffs = equi.knock, sets = TRUE)
res.uKO