fit_null_glmmkin: Fit generalized linear mixed model with known relationship...

View source: R/fit_null_glmmkin.R

fit_null_glmmkin    R Documentation

Fit generalized linear mixed model with known relationship matrices under the null hypothesis for related samples.

Description

The fit_null_glmmkin function is a wrapper around the glmmkin function from the GMMAT package. It fits a regression model under the null hypothesis for related samples, which is the preliminary step for subsequent variant-set tests in whole-genome sequencing data analysis. See glmmkin for more details.

Usage

fit_null_glmmkin(
  fixed,
  data = parent.frame(),
  kins,
  use_sparse = NULL,
  kins_cutoff = 0.022,
  id,
  random.slope = NULL,
  groups = NULL,
  family = binomial(link = "logit"),
  method = "REML",
  method.optim = "AI",
  maxiter = 500,
  tol = 1e-05,
  taumin = 1e-05,
  taumax = 1e+05,
  tauregion = 10,
  verbose = FALSE,
  ...
)

Arguments

fixed

an object of class formula (or one that can be coerced to that class): a symbolic description of the fixed effects model to be fitted.

data

a data frame or list (or object coercible by as.data.frame to a data frame) containing the variables in the model.

kins

a known positive semi-definite relationship matrix (e.g. kinship matrix in genetic association studies) or a list of known positive semi-definite relationship matrices. The rownames and colnames of these matrices must at least include all samples as specified in the id column of the data frame data.
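The naming requirement on kins can be checked before fitting. The following is a minimal sketch in base R; the objects pheno and kins and the sample names are illustrative and not part of the package.

```r
# Toy phenotype data with an id column (all names here are illustrative).
pheno <- data.frame(id = c("s1", "s2", "s3"), y = c(0, 1, 0))

# Toy 4 x 4 relationship matrix; it may contain samples that are not in the data.
ids_kin <- c("s1", "s2", "s3", "s4")
kins <- diag(4)
dimnames(kins) <- list(ids_kin, ids_kin)
kins["s1", "s2"] <- kins["s2", "s1"] <- 0.25

# Every id in the data must appear in both the row names and the column names.
named_in_kins <- intersect(rownames(kins), colnames(kins))
missing_ids <- setdiff(unique(as.character(pheno$id)), named_in_kins)
ids_covered <- length(missing_ids) == 0
```

If ids_covered is FALSE, missing_ids lists the samples that would need to be added to kins or dropped from the data.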

use_sparse

a logical switch indicating whether a dense kins matrix should be converted to a sparse matrix (default = NULL).

kins_cutoff

the cutoff value for clustering samples to make the output matrix sparse block-diagonal (default = 0.022).
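To illustrate what clustering by a cutoff means, the sketch below groups samples into connected components of the graph that links pairs whose relationship value exceeds the cutoff, then zeroes every entry between different clusters. The helper sparsify_by_cutoff is hypothetical and written in base R for illustration only; it is not the routine the package uses internally, and the use of a strict ">" comparison is an assumption.

```r
# Hypothetical helper (not part of STAAR): keep entries within clusters only.
# Clusters are connected components of the graph linking pairs with
# kins > cutoff (strict comparison is an assumption of this sketch).
sparsify_by_cutoff <- function(kins, cutoff = 0.022) {
  n <- nrow(kins)
  linked <- kins > cutoff
  diag(linked) <- TRUE
  cluster <- rep(NA_integer_, n)
  next_label <- 0L
  for (start in seq_len(n)) {
    if (!is.na(cluster[start])) next
    next_label <- next_label + 1L
    frontier <- start
    while (length(frontier) > 0) {
      cluster[frontier] <- next_label
      # Unlabelled samples linked to any sample in the current frontier.
      frontier <- which(colSums(linked[frontier, , drop = FALSE]) > 0 &
                          is.na(cluster))
    }
  }
  # Zero out every entry whose row and column fall in different clusters.
  kins * outer(cluster, cluster, "==")
}

ids <- paste0("s", 1:4)
kins <- diag(4)
dimnames(kins) <- list(ids, ids)
kins["s1", "s2"] <- kins["s2", "s1"] <- 0.25    # above the cutoff
kins["s2", "s3"] <- kins["s3", "s2"] <- 0.125   # above the cutoff
kins["s1", "s3"] <- kins["s3", "s1"] <- 0.01    # below, but same cluster
kins["s1", "s4"] <- kins["s4", "s1"] <- 0.01    # below, different cluster

kins_block <- sparsify_by_cutoff(kins, cutoff = 0.022)
```

In this toy input, s1, s2 and s3 form one block and s4 forms its own, so the s1-s3 entry is kept while the s1-s4 entry is set to zero. This is the difference between a block-diagonal result and simple element-wise thresholding.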

id

a column in the data frame data, indicating the id of samples. When there are duplicates in id, the data is assumed to be longitudinal with repeated measures.

random.slope

an optional column indicating the random slope for time effect used in a mixed effects model for longitudinal data. It must be included in the names of data. There must be duplicates in id and method.optim must be "AI" (default = NULL).

groups

an optional categorical variable indicating the groups used in a heteroscedastic linear mixed model (allowing residual variances to differ between groups). This variable must be included in the names of data; in addition, family must be "gaussian" and method.optim must be "AI" (default = NULL).
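The constraints stated for random.slope and groups can be restated as a pre-flight check. The sketch below is base R only; check_optional_args is a hypothetical helper, not a function in the package, and it assumes that random.slope and groups are given as column names (character strings).

```r
# Toy longitudinal data: two visits per sample, so id contains duplicates.
pheno_long <- data.frame(
  id    = rep(c("s1", "s2", "s3"), each = 2),
  time  = rep(c(0, 1), times = 3),
  group = rep(c("a", "b", "a"), each = 2),
  y     = c(1.2, 1.5, 0.7, 0.9, 2.1, 2.4)
)

# Hypothetical helper restating the documented constraints; not part of STAAR.
check_optional_args <- function(data, id, random.slope = NULL, groups = NULL,
                                family = gaussian(), method.optim = "AI") {
  problems <- character(0)
  if (!is.null(random.slope)) {
    if (!(random.slope %in% names(data)))
      problems <- c(problems, "random.slope is not a column of data")
    if (anyDuplicated(data[[id]]) == 0)
      problems <- c(problems, "random.slope needs duplicates in id")
  }
  if (!is.null(groups)) {
    if (!(groups %in% names(data)))
      problems <- c(problems, "groups is not a column of data")
    if (family$family != "gaussian")
      problems <- c(problems, "groups needs a gaussian family")
  }
  if ((!is.null(random.slope) || !is.null(groups)) && method.optim != "AI")
    problems <- c(problems, "method.optim must be \"AI\"")
  problems
}

ok  <- check_optional_args(pheno_long, id = "id",
                           random.slope = "time", groups = "group")
bad <- check_optional_args(pheno_long, id = "id", groups = "group",
                           family = binomial(), method.optim = "Brent")
```

Here ok is empty, while bad reports the non-gaussian family and the unsupported optimizer.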

family

a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions).

method

method of fitting the generalized linear mixed model. Either "REML" or "ML" (default = "REML").

method.optim

optimization method of fitting the generalized linear mixed model. Either "AI", "Brent" or "Nelder-Mead" (default = "AI").

maxiter

a positive integer specifying the maximum number of iterations when fitting the generalized linear mixed model (default = 500).

tol

a positive number specifying tolerance, the difference threshold for parameter estimates below which iterations should be stopped (default = 1e-5).

taumin

the lower bound of the search space for the variance component parameter τ (default = 1e-5), used when method.optim = "Brent". See Details.

taumax

the upper bound of the search space for the variance component parameter τ (default = 1e5), used when method.optim = "Brent". See Details.

tauregion

the number of search intervals for the REML or ML estimate of the variance component parameter τ (default = 10), used when method.optim = "Brent". See Details.
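The way taumin, taumax and tauregion might interact can be sketched with a toy one-dimensional search. The code below assumes that the interval endpoints are evenly spaced on the log scale, which is not stated on this page, and it uses a made-up objective in place of the actual likelihood. It relies on stats::optimize, which is a Brent-type minimizer, to search each interval and then keeps the best result.

```r
taumin <- 1e-5
taumax <- 1e5
tauregion <- 10

# Assumption of this sketch: endpoints are evenly spaced on the log scale.
breaks <- exp(seq(log(taumin), log(taumax), length.out = tauregion + 1))

# Toy objective standing in for a negative (restricted) log-likelihood;
# its minimum is at tau = 2.5 by construction.
toy_objective <- function(tau) (log(tau) - log(2.5))^2

# Minimize within each interval, then keep the interval with the lowest value.
fits <- lapply(seq_len(tauregion), function(i) {
  optimize(toy_objective, lower = breaks[i], upper = breaks[i + 1])
})
objective_values <- vapply(fits, function(f) f$objective, numeric(1))
tau_hat <- fits[[which.min(objective_values)]]$minimum
```

Searching several intervals separately guards against an objective with more than one local optimum across a range as wide as 1e-5 to 1e5.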

verbose

a logical switch for printing detailed information (parameter estimates in each iteration) for testing and debugging purposes (default = FALSE).

...

additional arguments that could be passed to glm.

Value

The function returns an object of the model fit from glmmkin (obj_nullmodel), with additional elements indicating the samples are related (obj_nullmodel$relatedness = TRUE), and whether the kins matrix is sparse when fitting the null model. See glmmkin for more details.
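A minimal end-to-end sketch follows. The data are simulated, all object names are illustrative, and passing id as a column name (a character string) is an assumption based on the GMMAT convention. The call runs only if the STAAR package is installed, and use_sparse is set explicitly to FALSE so that the dense matrix is used as supplied.

```r
set.seed(1)

# Simulate 100 sibling pairs (200 samples); all names below are illustrative.
n_pairs <- 100
n <- 2 * n_pairs
ids <- paste0("s", seq_len(n))

# Block-diagonal relationship matrix: 1 on the diagonal, 0.5 within a pair.
kins <- kronecker(diag(n_pairs), matrix(c(1, 0.5, 0.5, 1), 2, 2))
dimnames(kins) <- list(ids, ids)

pheno <- data.frame(
  id = ids,
  X1 = rnorm(n),
  X2 = rbinom(n, 1, 0.5),
  stringsAsFactors = FALSE
)

# Random effect with variance 1 and within-pair covariance 0.5, matching kins.
shared <- rep(rnorm(n_pairs), each = 2)
random_effect <- sqrt(0.5) * shared + sqrt(0.5) * rnorm(n)
pheno$Y <- 0.5 * pheno$X1 + 0.5 * pheno$X2 + random_effect + rnorm(n)

# Fit the null model only when STAAR is available.
if (requireNamespace("STAAR", quietly = TRUE)) {
  obj_nullmodel <- STAAR::fit_null_glmmkin(
    Y ~ X1 + X2,
    data = pheno,
    kins = kins,
    use_sparse = FALSE,
    id = "id",
    family = gaussian(link = "identity")
  )
  # Element documented above: flags that the samples are related.
  print(obj_nullmodel$relatedness)
}
```

The returned obj_nullmodel is the object that would then be passed to the downstream variant-set tests.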

References

Chen, H., et al. (2016). Control for population structure and relatedness for binary traits in genetic association studies via logistic mixed models. The American Journal of Human Genetics, 98(4), 653-666.

Chen, H., et al. (2019). Efficient variant set mixed model association tests for continuous and binary traits in large-scale whole-genome sequencing studies. The American Journal of Human Genetics, 104(2), 260-274.

Chen, H. (2023). GMMAT: Generalized linear Mixed Model Association Tests Version 1.4.2.


xihaoli/STAAR documentation built on Nov. 3, 2024, 9:34 p.m.