matrix_glmnet: Fit a glmnet model on Dataset object.

Description Usage Arguments Details Value Author(s) Examples

Description

This function takes a Dataset object and fits the specified glmnet model on each taxon. righn now only alpha = 0 (ie. ridge regularization) is supported. The function uses 10-fold cross-validation to select lambda, and permutation to obtain p-values.

Usage

1
2
3
4
matrix_glmnet(Dat, X = NULL, formula = NULL, family = "binomial",
  alpha = 0, glmnet.intercept = TRUE, glmnet.offset = NULL,
  method = "permutation", nperm = 1000, nboot = nperm, plot = FALSE,
  theme = theme_blackbox, verbose = TRUE)

Arguments

Dat

A Dataset object.

X

A numeric matrix to be used as desing matrix for the model. If missing, the script will use the formula interface with the overparametrized_model.matrix function to obtain a valid design matrix.

formula

Right hand side of the formula to be used for the model. Variable names must correspond to header names in Map portion of the Dataset object. Ignored if X is passed. Note that offset is not supported.

family

The model family to be used. See glmnet for more help.

alpha

The elastic net parameter. 1 corresponds ot Lasso and 0 to Ridge, with intermediate values cirrespondig to mixtures between the two. This function only supports 0 for the moment and will use that value regardless of what the user supplies.

glmnet.intercept

Logical indicating whether to add an intercept. See intercept option in glmnet

glmnet.offset

Either a character indicating the name of the variable to be used as offset; or a numeric vector of length equal to the number of samples to be used directly as offset, see offset option in glmnet. Note that family multinomial takes a matrix as offset, while passing a vector of variable names should work, note that this has not been tested and might produce incorrect results. The function will print a warning if you try to do this.

method

Either "permuation" or "bootstrap". The default and preferred method is permutation.

nperm

Number of permutations to perform.

nboot

Number of bootstrap pseudo replicates to perform.

plot

Currently not implemented. The funciton contains code for plotting null distributions of parameters and observed values. Might eventually be an independent function.

theme

ggplot2 theme for plots.

verbose

Logical indicating if progress should be printed

Details

Uses cv.glmnet with 10-fold cross-validation to pick the optiomal lambda value and then uses glmnet to fit each permuted dataset.

Value

Returns a list with the following elements

X

Design matrix

coeffcients

Matrix of coefficients per taxon

lambda

Vector of lambda values per taxon

method

Method used for the replicates

family

Model family

nreps

Number of replicates performed

call

Function call

formula

model formula

samples

List of coefficiente matrices per OTU and replicate

Author(s)

Sur from Dangl Lab

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
# Load data
data(Rhizo)
data(Rhizo.map)

# Create dataset with the first 10 OTUS
Dat <- create_dataset(Rhizo[1:10,],Rhizo.map)

# Always set seed before using random numbers (like in permuations)
set.seed(743)
m1 <- matrix_glmnet(Dat = Dat, formula = ~ fraction + accession,
                    nperm = 100,family = "poisson")

# Use summary function to obtain statistics
m1.sum <- summary(m1)
# See distribution of pvalues behaves as expectee
hist(m1.sum$coefficients$p.value)

# Use bootstrap instead
m1 <- matrix_glmnet(Dat = Dat, formula = ~ fraction + accession,
                    nperm = 100,family = "poisson", method ="bootstrap")

# Use summary function to obtain statistics
m1.sum <- summary(m1)

# Visualizing coefficientw with ggplot requires a bit of rearrangement because
# the percent symbol in the column names causes trouble
dat <- subset(m1.sum$coefficients, Taxon == "OTU_14834")
dat$lower <- dat[,"2.5%"]
dat$upper <- dat[,"97.5%"]
p1 <- ggplot(dat,aes(x = lower, y = Variable)) +
  geom_segment(aes(xend = upper,yend = Variable)) +
  geom_vline(xintercept = 0, col = 'red') +
  theme_blackbox
p1

surh/AMORglmnet documentation built on May 30, 2019, 8:41 p.m.