GLMMMiRKAT | R Documentation |
GLMMMiRKAT utilizes a generalized linear mixed model to allow dependence among samples.
GLMMMiRKAT( y, X = NULL, Ks, id = NULL, time.pt = NULL, model, method = "perm", slope = FALSE, omnibus = "perm", nperm = 5000 )
y |
A numeric vector of Gaussian (e.g., body mass index), Binomial (e.g., disease status, treatment/placebo) or Poisson (e.g., number of tumors/treatments) traits. |
X |
A vector or matrix of numeric covariates, if applicable (default = NULL). |
Ks |
A list of n-by-n OTU kernel matrices or one singular n-by-n OTU kernel matrix, where n is sample size. |
id |
A vector of cluster (e.g., family or subject including repeated measurements) IDs. Defaults to NULL since it is unnecessary for the CSKAT call. |
time.pt |
A vector of time points for the longitudinal studies. 'time.pt' is not required (i.e., 'time.pt = NULL') for the random intercept model. Default is time.pt = NULL. |
model |
A string declaring which model ("gaussian", "binomial" or "poisson") is to be used; should align with whether a Gaussian, Binomial, or Poisson trait is being inputted for the y argument. |
method |
A string declaring which method ("perm" or "davies) will be used to calculate the p-value. Davies is only available for Gaussian traits. Defaults to "perm". |
slope |
An indicator to include random slopes in the model (slope = TRUE) or not (slope = FALSE). 'slope = FALSE' is for the random intercept model. 'slope = TRUE' is for the random slope model. For the random slope model (slope = TRUE), 'time.pt' is required. |
omnibus |
A string equal to either "Cauchy" or "permutation" (or nonambiguous abbreviations thereof), specifying whether to use the Cauchy combination test or residual permutation to generate the omnibus p-value. |
nperm |
The number of permutations used to calculate the p-values and omnibus p-value. Defaults to 5000. |
Missing data is not permitted. Please remove all individuals with missing y, X, and Ks prior to input for analysis.
y and X (if not NULL) should be numerical matrices or vectors with the same number of rows.
Ks should either be a list of n by n kernel matrices (where n is sample size) or a single kernel matrix. If you have distance matrices from metagenomic data, each kernel can be constructed through function D2K. Each kernel can also be constructed through other mathematical approaches.
If model="gaussian" and method="davies", CSKAT is called. CSKAT utilizes the same omnibus test as GLMMMiRKAT. See ?CSKAT for more details.
The "method" argument only determines kernel-specific p-values are generated. When Ks is a list of multiple kernels, an omnibus p-value is computed via permutation.
Returns a p-value for each inputted kernel matrix, as well as an overall omnibus p-value if more than one kernel matrix is inputted
p_values |
p-value for each individual kernel matrix |
omnibus_p |
overall omnibus p-value calculated by permutation for the adaptive GLMMMiRKAT analysis |
Hyunwook Koh
Koh H, Li Y, Zhan X, Chen J, Zhao N. (2019) A distance-based kernel association test based on the generalized linear mixed model for correlated microbiome studies. Front. Genet. 458(10), 1-14.
## Example with Gaussian (e.g., body mass index) traits ## For non-Gaussian traits, see vignette. # Import example microbiome data with Gaussian traits data(nordata) otu.tab <- nordata$nor.otu.tab meta <- nordata$nor.meta # Create kernel matrices and run analysis if (requireNamespace("vegan")) { library(vegan) D_BC = as.matrix(vegdist(otu.tab, 'bray')) K_BC = D2K(D_BC) GLMMMiRKAT(y = meta$y, X = cbind(meta$x1, meta$x2), id = meta$id, Ks = K_BC, model = "gaussian", nperm = 500) } else { # Computation time is longer for phylogenetic kernels tree <- nordata$nor.tree unifracs <- GUniFrac::GUniFrac(otu.tab, tree, alpha=c(1))$unifracs D_W <- unifracs[,,"d_1"] K_W = D2K(D_W) GLMMMiRKAT(y = meta$y, X = cbind(meta$x1, meta$x2), id = meta$id, Ks = K_W, model = "gaussian", nperm = 500) }
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.