
Fit a generalized linear model via penalized maximum likelihood and cross-validation. Then, compute the difference statistic

*W_j = |Z_j| - |\tilde{Z}_j|*

where *Z_j* and *\tilde{Z}_j* are the coefficient estimates for the
jth variable and its knockoff, respectively. The value of the regularization
parameter *λ* is selected by cross-validation and computed with glmnet.

```
MFKnockoffs.stat.glmnet_coef_difference(X, X_k, y, family = "gaussian",
  cores = 2, ...)
```

`X` |
original design matrix (size n-by-p) |

`X_k` |
knockoff matrix (size n-by-p) |

`y` |
response vector (length n). Quantitative for family="gaussian", or family="poisson" (non-negative counts). For family="binomial" should be either a factor with two levels, or a two-column matrix of counts or proportions (the second column is treated as the target class; for a factor, the last level in alphabetical order is the target class). For family="multinomial", can be a nc>=2 level factor, or a matrix with nc columns of counts or proportions. For either "binomial" or "multinomial", if y is presented as a vector, it will be coerced into a factor. For family="cox", y should be a two-column matrix with columns named 'time' and 'status'. The latter is a binary variable, with '1' indicating death, and '0' indicating right censored. The function Surv() in package survival produces such a matrix. For family="mgaussian", y is a matrix of quantitative responses. |

`family` |
Response type (see above) |

`cores` |
Number of cores used to compute the knockoff statistics by running cv.glmnet. Unless otherwise specified, the number of cores is set equal to two (if available). |

`...` |
additional arguments specific to 'cv.glmnet' (see Details) |
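As a rough illustration of the formats described for `y`, the following base R sketch builds one response per family. The variable names (`y_gaussian`, `y_cox`, and so on) are made up for this example and are not part of the package.

```r
set.seed(1)
n <- 6

# family = "gaussian": a quantitative vector of length n
y_gaussian <- rnorm(n)

# family = "poisson": non-negative counts
y_poisson <- rpois(n, lambda = 2)

# family = "binomial": a two-level factor; the last level in
# alphabetical order ("yes" here) is treated as the target class
y_binomial <- factor(c("no", "yes", "yes", "no", "yes", "no"))

# family = "cox": a two-column matrix with columns 'time' and 'status'
# (status 1 = death, 0 = right censored)
y_cox <- cbind(time = rexp(n), status = c(1, 0, 1, 1, 0, 1))

levels(y_binomial)
colnames(y_cox)
```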

This function uses the `glmnet` package to fit a generalized linear model
via penalized maximum likelihood.

The knockoff statistics *W_j* are constructed by taking the difference
between the absolute values of the coefficients of the j-th variable and its knockoff.

By default, the value of the regularization parameter is chosen by 10-fold cross-validation.
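The computation described above can be sketched directly with `cv.glmnet`. This is a simplified illustration, not the package's implementation: `coef_difference_sketch` is a hypothetical helper, the use of `lambda.min` as the cross-validated value is an assumption, and the package's own code may differ in other details.

```r
library(glmnet)

# Hypothetical helper, not part of MFKnockoffs: a simplified version of the
# coefficient-difference statistic for a single-response family.
coef_difference_sketch <- function(X, X_k, y, family = "gaussian", ...) {
  p <- ncol(X)
  # Fit the penalized model on the augmented n-by-2p design [X, X_k];
  # cv.glmnet uses 10 folds unless nfolds is changed
  cv_fit <- cv.glmnet(cbind(X, X_k), y, family = family, ...)
  # Coefficients at the cross-validated lambda (assumed here: lambda.min),
  # dropping the intercept in the first row
  Z <- unname(as.matrix(coef(cv_fit, s = "lambda.min"))[-1, 1])
  # W_j = |Z_j| - |Z~_j|
  abs(Z[1:p]) - abs(Z[(p + 1):(2 * p)])
}

set.seed(1)
n <- 100; p <- 10
X <- matrix(rnorm(n * p), n)
# Stand-in knockoffs: an independent copy, reasonable here only because
# the columns of X are i.i.d. N(0, 1)
X_k <- matrix(rnorm(n * p), n)
y <- 3 * X[, 1] + rnorm(n)

W <- coef_difference_sketch(X, X_k, y)
length(W)
```

A large positive *W_j* suggests that the original variable entered the model more strongly than its knockoff.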

The default response family is 'gaussian', for a linear regression model. Different response families (e.g. 'binomial') can be specified by passing an optional parameter 'family'.
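For instance, a binary response could be handled as sketched below. The simulation settings (dimensions, effect size, logistic model) are arbitrary choices for illustration, and `cores = 1` is set only to keep the run serial.

```r
library(MFKnockoffs)

set.seed(1)
p <- 50; n <- 300; k <- 8
mu <- rep(0, p); Sigma <- diag(p)
X <- matrix(rnorm(n * p), n)
beta <- 2 * (1:p %in% sample(p, k))

# Binary response drawn from a logistic model, stored as a two-level factor
y <- factor(rbinom(n, 1, plogis(X %*% beta)))

X_k <- MFKnockoffs.create.gaussian(X, mu, Sigma)
W <- MFKnockoffs.stat.glmnet_coef_difference(X, X_k, y,
                                             family = "binomial", cores = 1)
length(W)
```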

The optional `nlambda` parameter can be used to control the granularity of the
grid of *λ*'s. The default value of `nlambda` is `100`.

If the family is 'binomial' and a lambda sequence is not provided by the user, this function generates it on a log-linear scale before calling 'glmnet'.
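A log-linear scale means the grid is evenly spaced in log(*λ*). The base R sketch below shows one way to build such a grid. The helper `log_linear_lambda`, its `ratio` argument, and the endpoints are assumptions made for illustration; this page does not state which endpoints the function actually uses.

```r
# Hypothetical helper, not part of MFKnockoffs or glmnet
log_linear_lambda <- function(lambda_max, ratio = 1e-3, nlambda = 100) {
  # Equal steps in log(lambda), from lambda_max down to ratio * lambda_max
  exp(seq(log(lambda_max), log(lambda_max * ratio), length.out = nlambda))
}

lambda <- log_linear_lambda(lambda_max = 1, ratio = 1e-3, nlambda = 4)
lambda             # approximately 1, 0.1, 0.01, 0.001
diff(log(lambda))  # constant step in log(lambda)
```

A user-supplied sequence of this kind would be passed as the `lambda` argument of `cv.glmnet` through `...`.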

For a complete list of the available additional arguments, see cv.glmnet and glmnet.

A vector of statistics *W* (length p)

Other statistics for knockoffs: `MFKnockoffs.stat.forward_selection`, `MFKnockoffs.stat.glmnet_lambda_difference`, `MFKnockoffs.stat.lasso_coef_difference_bin`, `MFKnockoffs.stat.lasso_coef_difference`, `MFKnockoffs.stat.lasso_lambda_difference_bin`, `MFKnockoffs.stat.lasso_lambda_difference`, `MFKnockoffs.stat.random_forest`, `MFKnockoffs.stat.sqrt_lasso`, `MFKnockoffs.stat.stability_selection`

```
library(MFKnockoffs)

# Simulate a sparse linear model with k nonzero coefficients
p=100; n=200; k=15
mu = rep(0,p); Sigma = diag(p)
X = matrix(rnorm(n*p),n)
nonzero = sample(p, k)
beta = 3.5 * (1:p %in% nonzero)
y = X %*% beta + rnorm(n)
knockoffs = function(X) MFKnockoffs.create.gaussian(X, mu, Sigma)

# Basic usage with default arguments
result = MFKnockoffs.filter(X, y, knockoffs=knockoffs,
                            statistic=MFKnockoffs.stat.glmnet_coef_difference)
print(result$selected)

# Advanced usage with custom arguments
foo = MFKnockoffs.stat.glmnet_coef_difference
k_stat = function(X, X_k, y) foo(X, X_k, y, nlambda=200)
result = MFKnockoffs.filter(X, y, knockoffs=knockoffs, statistic=k_stat)
print(result$selected)
```
