This function is the main entry point for the package. It runs the knockoff procedure, which accommodates various covariate distributions and model-free associations by using gradient boosting, and returns the variables selected for the input dataset. Users can specify whether to screen variables before selection and how many variables to keep after screening.

```
knockoffs.varsel(X, y, X_k, family = Gaussian(),
                 q = 0.10, knockoff.method = c("sdp", "asdp", "svm"), knockoff.shrink = T,
                 stat = c("RRB", "LCD", "DRS"),
                 screen = T, screening.num = nrow(X), screening.knot = 10,
                 max.mstop = 100, baselearner = c("bbs", "bols", "btree"), cv.fold = 5,
                 threshold = c('knockoff', 'knockoff+'))
```
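
A call might look like the sketch below. It simulates a small Gaussian dataset in base R and passes only arguments that appear in the usage signature above. Two things are assumptions rather than documented behavior: that `X_k` may be omitted (the argument description says "if pre-specified") and that the package providing `knockoffs.varsel` is already attached. The call is guarded so the snippet still runs when the function is not available.

```r
# Simulate n = 200 observations of p = 20 predictors; only the first
# three predictors influence the response.
set.seed(123)
n <- 200
p <- 20
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(X[, 1:3] %*% c(2, -2, 1.5) + rnorm(n))

# Assumption: the package that provides knockoffs.varsel() is attached and
# X_k can be left out so that knockoffs are generated internally.
if (exists("knockoffs.varsel")) {
  selected <- knockoffs.varsel(X, y, q = 0.10,
                               knockoff.method = "sdp", stat = "RRB",
                               screen = FALSE, baselearner = "bols",
                               threshold = "knockoff+")
  print(selected)  # indices of the selected columns of X
}
```

Each multi-valued argument (`knockoff.method`, `stat`, `baselearner`, `threshold`) is given a single value here so the choice is explicit.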

| Argument | Description |
| --- | --- |
| `X` | matrix of predictors |
| `y` | response vector, or a survival object with two columns |
| `X_k` | knockoff variables (n by p), if pre-specified |
| `family` | `Binomial()`, `Binomial(link = "logit", type = "glm")`, `Gaussian()`, `Poisson()`, `CoxPH()`, `Cindex()`, `GammaReg()`, `NBinomial()`, `Weibull()`, `Loglog()`, `Lognormal()`, etc. See the mboost documentation for details. |
| `q` | target FDR (false discovery rate) |
| `knockoff.method` | method used to construct knockoffs. 'sdp' or 'asdp' samples second-order multivariate Gaussian knockoffs via SDP or approximate SDP, respectively. 'svm' constructs knockoffs by regression, specifically support vector regression. |
| `knockoff.shrink` | whether to shrink the estimated covariance matrix (default: TRUE) |
| `stat` | statistic measuring variable importance. 'RRB' is the risk reduction in boosting, 'LCD' is the Lasso coefficient difference, and 'DRS' is the difference in R-squared (R-squared values are obtained from boosting). |
| `screen` | whether to screen the variables (default: TRUE) |
| `screening.num` | number of variables left after screening (default: sample size) |
| `screening.knot` | parameter of the screening process (default: 10) |
| `max.mstop` | maximum number of boosting iterations |
| `baselearner` | base-learners used when fitting models with mboost. 'bols' means linear base-learners, 'bbs' means penalized regression splines with a B-spline basis, and 'btree' means tree base-learners (stumps). |
| `cv.fold` | number of cross-validation folds used to choose the number of boosting iterations |
| `threshold` | method used to calculate the knockoff threshold, either 'knockoff' or 'knockoff+' |
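
To illustrate how `q` and `threshold` interact, here is a minimal base-R sketch of the two thresholds as defined in Barber and Candes (2015). The function name `knockoff_threshold_sketch` and the example statistics `W` are hypothetical, and this is not necessarily how the package computes the threshold internally. The 'knockoff+' variant adds 1 to the numerator of the estimated false discovery proportion, which makes it more conservative.

```r
# W:    knockoff statistics, one per variable (large positive values
#       suggest a genuine signal)
# q:    target FDR
# plus: FALSE for the 'knockoff' threshold, TRUE for 'knockoff+'
knockoff_threshold_sketch <- function(W, q = 0.10, plus = FALSE) {
  offset <- if (plus) 1 else 0
  # Candidate thresholds are the distinct nonzero magnitudes of W.
  candidates <- sort(unique(abs(W[W != 0])))
  for (thr in candidates) {
    # Estimated false discovery proportion at this candidate threshold.
    fdp_hat <- (offset + sum(W <= -thr)) / max(1, sum(W >= thr))
    if (fdp_hat <= q) return(thr)
  }
  Inf  # no candidate meets the target, so nothing is selected
}

# Made-up statistics for 12 variables.
W <- c(5.2, 4.1, 3.3, 2.8, 2.5, 2.2, 1.9, 1.7, 1.4, 1.2, -0.5, 0.3)

t_ko   <- knockoff_threshold_sketch(W, q = 0.15, plus = FALSE)
t_plus <- knockoff_threshold_sketch(W, q = 0.15, plus = TRUE)

selected_ko   <- which(W >= t_ko)
selected_plus <- which(W >= t_plus)
```

With these made-up statistics, the 'knockoff' threshold is 0.3 and keeps variable 12, while the 'knockoff+' threshold is 1.2 and drops it.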

The default family is Gaussian() for a continuous response, Binomial() for a binary response, and CoxPH() for a survival response.
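
The mapping from response type to family can be sketched in base R as follows. The helper `suggest_family_name` is hypothetical and not part of the package. It only returns the name of the family as a string, and its rule for detecting a binary response (exactly two distinct values) is a simplifying assumption.

```r
# Hypothetical helper: suggest a family name from the type of the response.
suggest_family_name <- function(y) {
  if (inherits(y, "Surv")) {
    "CoxPH"      # survival object, e.g. created with survival::Surv()
  } else if (length(unique(y[!is.na(y)])) == 2) {
    "Binomial"   # exactly two distinct values: treat as binary
  } else {
    "Gaussian"   # otherwise treat as continuous
  }
}

suggest_family_name(c(1.2, 3.4, 5.6, 7.8))
suggest_family_name(factor(c("case", "control", "case")))
```

A count response that happens to take only two values would be labeled as binary by this rule, so the choice should still be checked against the data.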

The function returns a vector containing the indices of the selected covariates.
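
Because the result is a vector of column indices, it can be used directly to subset the predictor matrix. The sketch below uses a hypothetical return value (`selected`) in place of an actual call, and made-up column names.

```r
# Small predictor matrix with named columns.
set.seed(1)
X <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6,
            dimnames = list(NULL, paste0("V", 1:6)))

# Hypothetical return value: columns 2 and 5 were selected.
selected <- c(2L, 5L)

# drop = FALSE keeps the result a matrix even if only one column
# (or none) is selected.
X_selected <- X[, selected, drop = FALSE]
colnames(X_selected)
```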

Candes et al., Panning for Gold: Model-free Knockoffs for High-dimensional Controlled Variable Selection, arXiv:1610.02351 (2016). https://statweb.stanford.edu/~candes/MF_Knockoffs/index.html

Barber and Candes, Controlling the false discovery rate via knockoffs. Ann. Statist. 43 (2015), no. 5, 2055–2085. https://projecteuclid.org/euclid.aos/1438606853

Benjamin Hofner, Andreas Mayr, Nikolay Robinzonov and Matthias Schmid (2014). Model-based Boosting in R: A Hands-on Tutorial Using the R Package mboost. Computational Statistics, 29, 3–35. http://dx.doi.org/10.1007/s00180-012-0382-5 Available as vignette via: vignette(package = "mboost", "mboost_tutorial")
