Description Usage Arguments Details Value Author(s) Examples
This function takes a Dataset object and fits the specified glmnet model on each taxon. righn now only alpha = 0 (ie. ridge regularization) is supported. The function uses 10-fold cross-validation to select lambda, and permutation to obtain p-values.
1 2 3 4 |
Dat |
A Dataset object. |
X |
A numeric matrix to be used as desing matrix for the model. If missing, the script will use the formula interface with the overparametrized_model.matrix function to obtain a valid design matrix. |
formula |
Right hand side of the formula to be used for the model. Variable names must correspond to header names in Map portion of the Dataset object. Ignored if X is passed. Note that offset is not supported. |
family |
The model family to be used. See glmnet for more help. |
alpha |
The elastic net parameter. 1 corresponds ot Lasso and 0 to Ridge, with intermediate values cirrespondig to mixtures between the two. This function only supports 0 for the moment and will use that value regardless of what the user supplies. |
glmnet.intercept |
Logical indicating whether to add an intercept. See intercept option in glmnet |
glmnet.offset |
Either a character indicating the name of the variable to be used as offset; or a numeric vector of length equal to the number of samples to be used directly as offset, see offset option in glmnet. Note that family multinomial takes a matrix as offset, while passing a vector of variable names should work, note that this has not been tested and might produce incorrect results. The function will print a warning if you try to do this. |
method |
Either "permuation" or "bootstrap". The default and preferred method is permutation. |
nperm |
Number of permutations to perform. |
nboot |
Number of bootstrap pseudo replicates to perform. |
plot |
Currently not implemented. The funciton contains code for plotting null distributions of parameters and observed values. Might eventually be an independent function. |
theme |
ggplot2 theme for plots. |
verbose |
Logical indicating if progress should be printed |
Uses cv.glmnet with 10-fold cross-validation to pick the optiomal lambda value and then uses glmnet to fit each permuted dataset.
Returns a list with the following elements
X |
Design matrix |
coeffcients |
Matrix of coefficients per taxon |
lambda |
Vector of lambda values per taxon |
method |
Method used for the replicates |
family |
Model family |
nreps |
Number of replicates performed |
call |
Function call |
formula |
model formula |
samples |
List of coefficiente matrices per OTU and replicate |
Sur from Dangl Lab
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | # Load data
data(Rhizo)
data(Rhizo.map)
# Create dataset with the first 10 OTUS
Dat <- create_dataset(Rhizo[1:10,],Rhizo.map)
# Always set seed before using random numbers (like in permuations)
set.seed(743)
m1 <- matrix_glmnet(Dat = Dat, formula = ~ fraction + accession,
nperm = 100,family = "poisson")
# Use summary function to obtain statistics
m1.sum <- summary(m1)
# See distribution of pvalues behaves as expectee
hist(m1.sum$coefficients$p.value)
# Use bootstrap instead
m1 <- matrix_glmnet(Dat = Dat, formula = ~ fraction + accession,
nperm = 100,family = "poisson", method ="bootstrap")
# Use summary function to obtain statistics
m1.sum <- summary(m1)
# Visualizing coefficientw with ggplot requires a bit of rearrangement because
# the percent symbol in the column names causes trouble
dat <- subset(m1.sum$coefficients, Taxon == "OTU_14834")
dat$lower <- dat[,"2.5%"]
dat$upper <- dat[,"97.5%"]
p1 <- ggplot(dat,aes(x = lower, y = Variable)) +
geom_segment(aes(xend = upper,yend = Variable)) +
geom_vline(xintercept = 0, col = 'red') +
theme_blackbox
p1
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.