ewaff.sites: Test associations at CpG sites


ewaff.sites R Documentation

Test associations at CpG sites

Description

Fit a generalized linear model (GLM) to the methylation levels at each CpG site.

Usage

ewaff.sites(
  formula,
  variable.of.interest,
  methylation,
  data,
  family = gaussian,
  method = "glm",
  generate.confounders = NULL,
  n.confounders = NULL,
  most.variable = NULL,
  random.subset = 0.05,
  ...,
  debug = F
)

Arguments

formula

An object of class formula: a symbolic description of the model to be fitted. DNA methylation is referred to by the variable name 'methylation'.

variable.of.interest

Name of variable(s) in the model formula for which to save summary statistics. If it is the dependent variable in the formula, then it must be numeric or binary; otherwise, it may be any type of variable or even a vector of variables. The value is ignored if method == "coxph".
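To illustrate the two roles methylation can play in the model, the sketch below builds two formula objects in base R. The covariate names smoking and age are hypothetical and are not part of the package; pairing a binary dependent variable with family = binomial is likewise an assumption made for illustration.

```r
# Hypothetical covariates: 'smoking' (binary) and 'age' (numeric).

# Methylation as the outcome; 'smoking' would be the variable of interest.
f.outcome <- methylation ~ smoking + age

# Methylation as a predictor; the binary 'smoking' is now the dependent
# variable (assumed here to be modelled with family = binomial).
f.predictor <- smoking ~ methylation + age

# In both cases DNA methylation is referred to by the name 'methylation'.
vars.outcome <- all.vars(f.outcome)
vars.predictor <- all.vars(f.predictor)
```

In either form, variable.of.interest would be set to "smoking" so that summary statistics are saved for that variable.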

methylation

DNA methylation matrix, one row per CpG site, one column per sample.

data

Data frame of variables to include in the model.

family

See description for glm.

method

Method for regressions: "glm", "rlm", "limma" or "coxph" (Default: "glm").

generate.confounders

Generate variables from the methylation data to adjust for unknown confounders. May be NULL for none (default), "sva" or "smartsva" to generate surrogate variables, or "pca" to generate principal components. If method == "coxph", then generate.confounders must be NULL or "pca".

n.confounders

Number of unknown confounders to generate. A value of NULL lets sva select the number automatically or, for principal components, generates the maximum number (Default: NULL).

most.variable

Generate confounders from the given number of most variable CpG sites rather than the whole matrix (Default: NULL).

random.subset

Generate surrogate variables from the given proportion of randomly selected CpG sites rather than the whole matrix (Default: 0.05, i.e. 5 percent).
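The idea behind generating principal components from a subset of the most variable sites can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the simulated matrix, the choice of 50 sites and 3 components, and the use of prcomp are all assumptions.

```r
set.seed(1)

# Simulated methylation matrix: one row per CpG site, one column per sample.
n.sites <- 200
n.samples <- 20
meth <- matrix(runif(n.sites * n.samples), nrow = n.sites,
               dimnames = list(paste0("cg", seq_len(n.sites)),
                               paste0("s", seq_len(n.samples))))

# Keep the 50 most variable sites (similar in spirit to most.variable = 50).
site.var <- apply(meth, 1, var)
top.idx <- order(site.var, decreasing = TRUE)[1:50]
meth.top <- meth[top.idx, , drop = FALSE]

# prcomp expects observations (samples) in rows, so transpose first.
pca <- prcomp(t(meth.top), center = TRUE, scale. = FALSE)

# Take the first few components as candidate confounder variables,
# one row per sample (similar in spirit to n.confounders = 3).
n.pcs <- 3
pcs <- pca$x[, seq_len(n.pcs), drop = FALSE]
```

Each column of pcs could then be added as a covariate alongside the variables in data.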

...

Arguments to glm, rlm, limma::eBayes, or survival::coxph.

Details

Note: mclapply is used to fit regression models using multiple cores.
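As a rough sketch of what fitting per-site models with mclapply can look like, the example below fits one Gaussian GLM per row of a simulated matrix using base R and the parallel package. It is not the package's code; the variable names x and age, the simulated data and the core count are assumptions.

```r
library(parallel)

set.seed(1)

# Simulated inputs: 50 sites by 30 samples, plus two hypothetical covariates.
n.sites <- 50
n.samples <- 30
meth <- matrix(rnorm(n.sites * n.samples), nrow = n.sites)
pheno <- data.frame(x = rnorm(n.samples),
                    age = rnorm(n.samples, mean = 50, sd = 10))

# mclapply relies on forking, which is unavailable on Windows,
# where only mc.cores = 1 is accepted.
n.cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Fit one GLM per CpG site and keep the coefficient row for 'x'.
fits <- mclapply(seq_len(n.sites), function(i) {
  d <- cbind(pheno, methylation = meth[i, ])
  fit <- glm(methylation ~ x + age, data = d, family = gaussian)
  coef(summary(fit))["x", ]
}, mc.cores = n.cores)

# One row of statistics per site.
stats <- do.call(rbind, fits)
```

The default for mc.cores in mclapply is getOption("mc.cores", 2L), so setting options(mc.cores = ...) before the call may control the number of cores used, assuming the package does not override that default.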

Value

List containing a table of association statistics ('table') and the model design matrix ('design').
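A minimal usage sketch, based only on the signature shown under Usage, might look like the following. The simulated data, the covariate names smoking and age, and the assumption that the rows of data follow the same sample order as the columns of methylation are all illustrative. The call is guarded so the script still runs where ewaff is not installed.

```r
set.seed(1)

# Simulated methylation matrix: one row per CpG site, one column per sample.
n.sites <- 100
n.samples <- 40
meth <- matrix(runif(n.sites * n.samples), nrow = n.sites,
               dimnames = list(paste0("cg", seq_len(n.sites)),
                               paste0("s", seq_len(n.samples))))

# Hypothetical covariates, one row per sample, in the same order as the
# columns of 'meth' (an assumption; the help page does not state this).
pheno <- data.frame(smoking = rbinom(n.samples, 1, 0.5),
                    age = rnorm(n.samples, mean = 50, sd = 10),
                    row.names = colnames(meth))

if (requireNamespace("ewaff", quietly = TRUE)) {
  ret <- ewaff::ewaff.sites(methylation ~ smoking + age,
                            variable.of.interest = "smoking",
                            methylation = meth,
                            data = pheno,
                            method = "glm")
  # The returned list holds the association statistics and design matrix.
  print(head(ret$table))
  print(dim(ret$design))
}
```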


perishky/ewaff documentation built on Nov. 10, 2024, 4:53 p.m.