power.estimate: Power Estimation by Generalized Model-Free Knockoffs Filter
In hanfu-bios/varsel: Model-Free Knockoffs Filter for Controlled Variable Selection


This function estimates the power and false discovery rate (FDR) of the knockoffs filter before data collection. Given the dimensions of the data (sample size and number of covariates) and the expected data structure and association type, it simulates data repeatedly and reports the expected power and FDR.

power.estimate(n, p, X.dist=c("Gaussian","Binary","Exponential"), X.mu=rep(0,p), X.cov=diag(p),
                           beta = NULL, numTrue = NULL, percentTrue = NULL, amplitude=1,
                           association = c("linear","power","exponential","cosine"), power.degree=2,
                           link = c("identity","logit","survival"), family = NULL,
                           surv.lambdaT=.002, surv.lambdaC=.004, surv.shape=1,
                           nIterations = 10, ...)

`n`	sample size
`p`	number of covariates, including null variables
`X.dist`	distribution of design matrix. Either "Gaussian", "Binary" or "Exponential"
`X.mu`	expected values for X, a vector of length p (default: zero vector of length p)
`X.cov`	variance-covariance matrix (p by p) for X (default: identity)
`beta`	coefficients for p variables if known, a vector of length p
`numTrue`	number of true signals among p variables
`percentTrue`	percentage of true signals among p variables
`amplitude`	signal amplitude
`association`	association between predictors and response (on the scale of the linear predictor). The linear predictor is Xbeta for "linear", X^power.degree beta for "power", exp(X)beta for "exponential", and cos(X)beta for "cosine".
`power.degree`	power degree when the "power" association is selected (default: 2)
`link`	link function between linear predictor and the response. "identity" for identity link and "logit" for logit link. If "survival" is selected, then survival response will be generated using the hazard function in Cox model.
`family`	an mboost family object, for example Binomial(), Binomial(link = "logit", type = "glm"), Gaussian(), Poisson(), CoxPH(), Cindex(), GammaReg(), NBinomial(), Weibull(), Loglog(), Lognormal(), etc. See mboost documentation for details.
`surv.lambdaT`	baseline hazard in survival response, default: 0.002
`surv.lambdaC`	hazard of censoring in survival response, default: 0.004
`surv.shape`	shape parameter of the Weibull distribution, default: 1
`nIterations`	number of runs to get the means / distributions of estimated power and FDR
`...`	further arguments passed to function selection
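
The `association`, `link`, and `surv.*` arguments describe how a simulated response relates to X. The base R sketch below illustrates one way such data could be generated. It is not the package's internal code: the helper `linear_predictor`, the unit-variance noise for the identity link, and the Weibull inverse-transform sampling of event and censoring times are all assumptions made for illustration.

```r
# Illustrative sketch only; not the package's internal implementation.
set.seed(1)
n <- 200
p <- 10
X <- matrix(rnorm(n * p), n, p)        # Gaussian X with zero mean, identity covariance
beta <- c(rep(1, 3), rep(0, p - 3))    # three true signals with amplitude 1

# Hypothetical helper: linear predictor for each documented association type
linear_predictor <- function(X, beta, association = "linear", power.degree = 2) {
  Z <- switch(association,
              linear      = X,
              power       = X^power.degree,
              exponential = exp(X),
              cosine      = cos(X),
              stop("unknown association: ", association))
  drop(Z %*% beta)
}

eta <- linear_predictor(X, beta, "power", power.degree = 2)

# Identity link: additive noise (unit variance is an assumption)
y_identity <- eta + rnorm(n)

# Logit link: Bernoulli response with success probability plogis(eta)
y_logit <- rbinom(n, 1, plogis(eta))

# Survival link (assumed form): Weibull baseline hazard with Cox-type
# covariate effect, sampled by inverse transform; independent Weibull censoring
surv.lambdaT <- 0.002
surv.lambdaC <- 0.004
surv.shape   <- 1
event_time  <- (-log(runif(n)) / (surv.lambdaT * exp(eta)))^(1 / surv.shape)
censor_time <- (-log(runif(n)) / surv.lambdaC)^(1 / surv.shape)
time   <- pmin(event_time, censor_time)          # observed follow-up time
status <- as.integer(event_time <= censor_time)  # 1 = event, 0 = censored
```

With `surv.shape = 1` the Weibull distribution reduces to the exponential, so `surv.lambdaT` and `surv.lambdaC` act as constant hazards in this sketch.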

At least one of the three arguments beta, numTrue, and percentTrue must be specified; otherwise an error is raised. Currently, the signal amplitude is identical for all true signals; this may be generalized in the future.
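
The rule above can be sketched as a small helper that builds a coefficient vector from whichever argument is supplied. The function `make_beta` is hypothetical, as is the order of precedence (beta, then numTrue, then percentTrue), the random placement of signals, and the reading of `percentTrue` as a proportion between 0 and 1; the package may handle these differently.

```r
# Hypothetical helper; precedence and the interpretation of percentTrue are assumptions.
make_beta <- function(p, beta = NULL, numTrue = NULL, percentTrue = NULL,
                      amplitude = 1) {
  if (!is.null(beta)) {
    stopifnot(length(beta) == p)       # user-supplied coefficients are used as-is
    return(beta)
  }
  if (is.null(numTrue)) {
    if (is.null(percentTrue)) {
      stop("specify at least one of beta, numTrue, percentTrue")
    }
    numTrue <- round(p * percentTrue)  # assumes a proportion in [0, 1]
  }
  stopifnot(numTrue >= 0, numTrue <= p)
  out <- numeric(p)
  out[sample.int(p, numTrue)] <- amplitude  # same amplitude for every true signal
  out
}

set.seed(2)
b <- make_beta(p = 20, numTrue = 5, amplitude = 2)
```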

A list containing the expected (mean) power, the power values from all iterations, the standard deviation of power, and the mean FDR achieved (expected to be close to the target value).
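
To make these quantities concrete, the sketch below computes power and the false discovery proportion (FDP) for a few made-up selection sets and then summarises them across iterations. The helper `fdp_power`, the example selections, and the element names of the summary list are hypothetical and are not the names returned by power.estimate.

```r
# Hypothetical helper: power and FDP for one simulated run
fdp_power <- function(selected, true_signals) {
  n_selected      <- length(selected)
  n_true_selected <- length(intersect(selected, true_signals))
  c(power = n_true_selected / max(1, length(true_signals)),
    fdp   = (n_selected - n_true_selected) / max(1, n_selected))
}

true_signals <- 1:5
# Made-up selections from three iterations (the last run selects nothing)
selections <- list(c(1, 2, 3, 7), c(1, 2, 3, 4, 5), integer(0))

res <- sapply(selections, fdp_power, true_signals = true_signals)

# Summary in the spirit of the returned list; element names are assumptions
summary_list <- list(
  power.mean = mean(res["power", ]),
  power.all  = res["power", ],
  power.sd   = sd(res["power", ]),
  fdr.mean   = mean(res["fdp", ])   # FDR estimated as the average FDP
)
```

Averaging the per-run FDP is what turns the proportion observed in each simulated data set into an estimate of the FDR.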

Candes et al., Panning for Gold: Model-free Knockoffs for High-dimensional Controlled Variable Selection, arXiv:1610.02351 (2016). https://statweb.stanford.edu/~candes/MF_Knockoffs/index.html

Barber and Candes, Controlling the false discovery rate via knockoffs. Ann. Statist. 43 (2015), no. 5, 2055–2085. https://projecteuclid.org/euclid.aos/1438606853

Benjamin Hofner, Andreas Mayr, Nikolay Robinzonov and Matthias Schmid (2014). Model-based Boosting in R: A Hands-on Tutorial Using the R Package mboost. Computational Statistics, 29, 3–35. http://dx.doi.org/10.1007/s00180-012-0382-5 Available as vignette via: vignette(package = "mboost", "mboost_tutorial")

hanfu-bios/varsel documentation built on May 27, 2019, 4:50 a.m.