Fit a linear model via a fast coordinate variational Bayes algorithm. The method is applicable to linear and logistic regression, and solves the problem either along a path of the spike (l0) parameter or at a fixed value determined by the dimensions of the data.
vbsr(y,
X,
ordering_mat=NULL,
eps=1e-6,
exclude=NULL,
add.intercept=TRUE,
maxit = 1e4,
n_orderings = 10,
family = "normal",
scaling = TRUE,
return_kl = TRUE,
estimation_type = "BMA",
bma_approximation = TRUE,
screen = 1.0,
post=0.95,
already_screened = 1.0,
kl = 0.99,
l0_path=NULL,
cleanSolution=FALSE)
y |
response variable. Normally distributed errors for |
X |
Design matrix, an n x m matrix, with rows as observations |
ordering_mat |
Optionally specified coordinate update ordering matrix. Must be
in matrix form with columns as permutation vectors of length m, and there must be
|
eps |
Tolerance used to determine convergence of the algorithm based on the lower bound. |
exclude |
An optional indicator vector of length m of 0's and 1's indicating whether each variable is penalized (0 = penalized, 1 = unpenalized) |
add.intercept |
A boolean variable indicating whether or not to include an unpenalized intercept variable. |
maxit |
The maximum number of iterations to run the algorithm for a given solution to a penalized regression problem. |
n_orderings |
The number of random starts used. |
family |
The type of error model used. Currently supported modes are |
scaling |
A boolean variable indicating whether or not to scale the columns of X to have mean zero and variance one. |
return_kl |
A boolean variable indicating whether or not to return an analysis of the null distributed features in the data-set as a function of the penalty parameter. |
estimation_type |
The type of estimation to perform based on the number of unique solutions identified
to the penalized regression problem. Valid values are |
.
bma_approximation |
A boolean variable indicating whether to compute a full correction to the
|
screen |
P-value threshold used for marginal screening. The default is no marginal prescreening (i.e., a marginal p-value of 1.0) |
post |
Choice of penalty parameter such that a feature will have a posterior probability of 0.95 if it passes a Bonferroni correction in the multivariate model. Default is |
already_screened |
If features are already screened, the marginal p-value used for screening. |
kl |
The inner percentiles of the distribution to compute the Kullback-Leibler overfitting statistic. Only works for analysis when directly specifying a path of penalization parameter (e.g. |
l0_path |
The path of penalty parameters to solve the spike regression problem. If |
cleanSolution |
This parameter determines whether a given solution is further filtered using an unpenalized model. If |
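As an illustration of the argument formats described above, the following base R sketch builds an ordering matrix and an exclusion vector. The sizes `m` and `n_orderings` are hypothetical, and two points are assumptions rather than documented behaviour: that one permutation column is supplied per ordering, and that the permutations use 1-based indices.

```r
set.seed(1)

m <- 6            # number of columns in X (hypothetical value)
n_orderings <- 4  # number of orderings to supply (hypothetical value)

# Each column is one random permutation of 1..m, giving an m x n_orderings matrix.
# Assumption: vbsr expects 1-based indices and one column per ordering.
ordering_mat <- replicate(n_orderings, sample.int(m))

# Indicator vector of length m: 0 = penalized, 1 = unpenalized.
# Here the first two variables are left unpenalized.
exclude <- rep(0, m)
exclude[c(1, 2)] <- 1

dim(ordering_mat)
exclude
```

Objects built this way could then be passed as `ordering_mat` and `exclude` in a call to `vbsr`.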
The solutions to the spike penalized regression model are fit with a coordinate variational Bayes algorithm based on the l0_path values of the spike hyper-parameter.
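A minimal sketch of a call on simulated data may help make this concrete. The simulation settings (`n`, `m`, `beta_true`) are hypothetical, the call uses only the default arguments shown in the usage, and the fit is skipped when the vbsr package is not installed. It is an illustration under those assumptions, not the package's own example.

```r
set.seed(42)

# Simulated data: n observations, m features, first 5 features truly associated.
n <- 100
m <- 50
X <- matrix(rnorm(n * m), nrow = n, ncol = m)
beta_true <- rep(0, m)
beta_true[1:5] <- 2
y <- as.numeric(X %*% beta_true + rnorm(n))

# Fit only if the package is available; l0_path is left at its default (NULL).
if (requireNamespace("vbsr", quietly = TRUE)) {
  fit <- vbsr::vbsr(y, X, family = "normal")
  # Features ranked by posterior probability (the `post` element of the result).
  print(head(order(fit$post, decreasing = TRUE)))
}
```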
A list with all the results of the vbsr analysis.
beta |
The expected value of the penalized regression coefficients. |
alpha |
The estimated value of the unpenalized regression coefficients. |
z |
The Z-statistic for each penalized regression coefficient |
pval |
The p-values based on the asymptotic normal assumption of the Z-statistics |
post |
The posterior probabilities of each of the regression coefficients |
l0 |
The penalty parameters used to solve the penalized regression problem |
modelEntropy |
The entropy of the identified approximate posterior probability distribution over model space. |
modelProb |
The approximate posterior probability distribution over the identified model space. |
kl_index |
If a path solution was run with the KL diagnostic statistic, the points in the path where the KL statistic is nearest the min, the mean, the min + 1 s.e., and the mean + 1 s.e. |
kl |
The KL statistic computed across the path |
kl_min |
The minimum KL statistic identified along the path |
kl_mean |
The expected KL statistic given the number of features identified |
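To show how two of the returned quantities relate to each other, the sketch below computes p-values from Z-statistics and the entropy of a model probability vector in base R. The input values are hypothetical stand-ins for `z` and `modelProb`. The two-sided normal p-value and the natural-log Shannon entropy are assumptions about the definitions, not confirmed details of the package.

```r
# Hypothetical Z-statistics; in practice these would be the `z` element of the result.
z <- c(0, 1.96, -3.5)

# Two-sided p-values under a standard normal reference distribution (assumed form).
pval <- 2 * pnorm(-abs(z))

# Hypothetical model probabilities; in practice the `modelProb` element.
model_prob <- c(0.7, 0.2, 0.1)

# Shannon entropy in nats (assumed definition), skipping zero-probability models.
p <- model_prob[model_prob > 0]
model_entropy <- -sum(p * log(p))

pval
model_entropy
```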
Benjamin A. Logsdon
Logsdon, B.A., G.E. Hoffman, and J.G. Mezey (2010). A variational Bayes algorithm for fast and accurate multiple locus genome-wide association analysis, http://www.biomedcentral.com/1471-2105/11/58, BMC Bioinformatics, Vol. 11(1), 58
Logsdon, B.A., G.E. Hoffman, and J.G. Mezey (2012). Mouse obesity network reconstruction with a variational Bayes algorithm to employ aggressive false positive control, http://www.biomedcentral.com/1471-2105/13/53/, BMC Bioinformatics, Vol. 13(1), 53
Logsdon, B.A., C.L. Carty, A.P. Reiner, J.Y. Dai, and C. Kooperberg (2012). A novel variational Bayes multiple locus Z-statistic for genome-wide association studies with Bayesian model averaging. Bioinformatics, Vol. 28(13), 1738-1744