multi.knockfilter: Multiple knockoff filter


View source: R/multiknockfilter.R

Description

This function runs the knockoff filter multiple times, once for each of the K supplied knockoff matrices.

Usage

multi.knockfilter(
  X,
  Xk,
  y,
  q = 0.2,
  offset = 1,
  statistic = stat.glmnet_coefdiff
)

Arguments

X

n x p matrix or data frame of original variables.

Xk

list with K elements containing the n x p knockoff matrices.

y

response vector of length n.

q

either a scalar or a vector of nominal levels, one per knockoff run. If a scalar is supplied, the same nominal level is used for every knockoff run. Default: 0.2.

offset

either 0 (knockoff) or 1 (knockoff+). Default: 1.

statistic

function that computes the score vector W of length p. It must take the data matrix, the knockoff matrix and the response vector as input and output a vector of computed scores. Either choose a score statistic from the knockoff package or define one manually. Default: stat.glmnet_coefdiff (see Details).
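As an illustration of a manually defined statistic, here is a minimal sketch that satisfies the stated contract (data matrix, knockoff matrix and response in, score vector of length p out). The name stat.marginal_cordiff and the choice of marginal correlations are assumptions made for this example and are not part of the package.

```r
# Hypothetical custom statistic: difference of absolute marginal correlations.
# A large positive W_j suggests the original variable is more strongly
# related to y than its knockoff copy.
stat.marginal_cordiff <- function(X, X_k, y) {
  X <- as.matrix(X)
  X_k <- as.matrix(X_k)
  y <- as.vector(y)
  abs(as.vector(cor(X, y))) - abs(as.vector(cor(X_k, y)))
}

# Toy check on simulated data. The row-permuted copy below is only a
# stand-in with the right dimensions, NOT a valid knockoff matrix.
set.seed(1)
n <- 100; p <- 5
X <- matrix(rnorm(n * p), n, p)
X_fake <- X[sample(n), ]
y <- X[, 1] + rnorm(n)
W <- stat.marginal_cordiff(X, X_fake, y)
```

Such a function could then presumably be supplied as statistic = stat.marginal_cordiff.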

Details

The knockoff package must be installed before this function is called.

The default score function stat.glmnet_coefdiff is from the knockoff package. It fits a Lasso regression in which the regularization parameter λ is tuned by cross-validation. The score for the jth variable is then computed as the difference of the absolute coefficient estimates:

W_j = |Z_j| - |\tilde{Z}_j|

where Z_j and \tilde{Z}_j are the coefficient estimates for the jth variable and its knockoff, respectively.
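To show how the scores W_j, the nominal level q and the offset interact, the following sketch computes the standard knockoff (offset = 0) or knockoff+ (offset = 1) data-dependent threshold in base R. The helper name knockoff_threshold_sketch is hypothetical, and whether multi.knockfilter computes the threshold in exactly this way internally is an assumption.

```r
# Sketch of the knockoff / knockoff+ threshold (helper name is hypothetical).
# Returns the smallest candidate t such that
#   (offset + #{j : W_j <= -t}) / max(1, #{j : W_j >= t}) <= q,
# or Inf if no candidate satisfies the bound.
knockoff_threshold_sketch <- function(W, q = 0.2, offset = 1) {
  stopifnot(offset %in% c(0, 1))
  ts <- sort(c(0, abs(W)))  # candidate thresholds
  ratio <- sapply(ts, function(t) {
    (offset + sum(W <= -t)) / max(1, sum(W >= t))
  })
  ok <- which(ratio <= q)
  if (length(ok) > 0) ts[ok[1]] else Inf
}

# Toy score vector: mostly positive scores, one negative.
W <- c(5, 4, 3, -1, 2, 0.5)
tau <- knockoff_threshold_sketch(W, q = 0.3, offset = 1)
Shat <- which(W >= tau)  # indices of selected variables
```

With a stricter level such as q = 0.1, no candidate threshold meets the bound for this toy vector, so the threshold is Inf and nothing is selected.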

This function is meant to be used together with multi.knockoffs, which constructs the list Xk (see Examples).

Value

A list containing the following components:

W.list

the K score vectors, one per knockoff run.

Shat.list

the K selection sets, one per knockoff run.

q

the nominal level of each knockoff run.
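One way to summarise such a result is to count how often each variable is selected across the K runs. The sketch below uses a hand-made mock list with the component names documented above; it assumes that each element of Shat.list is a vector of selected column indices, which the documentation does not state explicitly.

```r
# Mock result mimicking the documented structure (K = 3 runs, p = 6 variables).
# The values are invented purely for illustration.
p <- 6
mock.res <- list(
  W.list = list(
    c(2.1, 1.4, -0.2, 0.0, 1.8, -0.5),
    c(1.9, 0.3, -0.1, 0.0, 2.2, -0.4),
    c(2.5, 1.1, 0.9, -0.3, 0.2, 0.0)
  ),
  Shat.list = list(c(1, 2, 5), c(1, 5), c(1, 2, 3)),
  q = rep(0.2, 3)
)

# Fraction of runs in which each variable was selected.
K <- length(mock.res$Shat.list)
sel.freq <- tabulate(unlist(mock.res$Shat.list), nbins = p) / K

# Variables selected in more than half of the runs.
stable <- which(sel.freq > 0.5)
```

The majority rule used for stable is only one simple aggregation choice made for this example.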

References

Candès, Fan, Janson, and Lv (2018). Panning for gold: 'model-X' knockoffs for high dimensional controlled variable selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 80(3), 551-577.

Examples

n <- 400; p <- 200; s_0 <- 30
amplitude <- 1; mu <- rep(0,p); rho <- 0.25
Sigma <- toeplitz(rho^(0:(p-1)))

X <- MASS::mvrnorm(n, mu, Sigma)
nonzero <- sample(p, s_0)
beta <- amplitude * (1:p %in% nonzero)
y <- X %*% beta + rnorm(n)

# Construction of K knockoff matrices
Xk <- multi.knockoffs(X, K = 5)

# Basic usage with default arguments
multi.res <- multi.knockfilter(X, Xk, y)

cKarypidis/multiknockoffs documentation built on Dec. 19, 2021, 12:53 p.m.