The MFKnockoffs
package can also be used to perform controlled variable selection with a fixed design matrix, assuming a linear regression model for the response. In this sense, MFKnockoffs
is a superset of the original knockoffs package.
set.seed(1234)
# Problem parameters n = 1000 # number of observations p = 300 # number of variables k = 30 # number of variables with nonzero coefficients amplitude = 4.5 # signal amplitude (for noise level = 1) # Generate the variables from a multivariate normal distribution mu = rep(0,p); Sigma = diag(p) X = matrix(rnorm(n*p),n) # Generate the response from a linear model nonzero = sample(p, k) beta = amplitude * (1:p %in% nonzero) / sqrt(n) y.sample = function(X) X %*% beta + rnorm(n) y = y.sample(X)
In order to create fixed-design knockoffs, we call MFKnockoffs.filter
with the parameter statistic
equal to MFKnockoffs.stat.glmnet_lambda_difference
. Moreover, since not all statistics are valid with fixed-design knockoffs, we use MFKnockoffs.stat.glmnet_lambda_difference
instead of the default one (which is based on cross-validation).
library(MFKnockoffs) result = MFKnockoffs.filter(X, y, knockoffs = MFKnockoffs.create.fixed, statistic = MFKnockoffs.stat.glmnet_lambda_difference)
We can display the results with
print(result)
The default value for the target false discovery rate is 0.1. In this experiment the false discovery proportion is
fdp = function(selected) sum(beta[selected] == 0) / max(1, length(selected)) fdp(result$selected)
If you want to see some basic usage of the knockoff filter, see the introductory vignette. If you want to look inside the knockoff filter, see the advanced vignette.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.