DMLfit.multiFactor: Fit a linear model for BS-seq data from general experimental...

View source: R/DML.multiFactor.R

DMLfit.multiFactor    R Documentation

Fit a linear model for BS-seq data from general experimental design

Description

This function takes a BSseq object, a data frame for the experimental design, and a model formula, and then fits a linear model.

Usage

DMLfit.multiFactor(BSobj, design, formula, smoothing=FALSE, smoothing.span=500)

Arguments

BSobj

An object of BSseq class for the BS-seq data.

design

A data frame for experimental design. Number of rows must match the number of columns of the counts in BSobj.

formula

A formula for the linear model.

smoothing

A flag indicating whether to apply smoothing. When TRUE, the counts are smoothed by a simple moving average method.

smoothing.span

The size of the smoothing window, in base pairs. Default is 500.
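To make the smoothing arguments concrete, here is a minimal base-R sketch of a moving average over a window measured in base pairs. It is not the DSS implementation: the helper name moving_average_bp, the centring of the window on each site, and the toy positions and counts are all assumptions made for illustration.

```r
# Hypothetical helper, not part of DSS: for each site, average the values of
# all sites whose position lies within span/2 bp of it. Assumes all sites are
# on one chromosome and sorted by position.
moving_average_bp <- function(pos, x, span = 500) {
  stopifnot(length(pos) == length(x), !is.unsorted(pos))
  vapply(seq_along(pos), function(i) {
    in_window <- abs(pos - pos[i]) <= span / 2
    mean(x[in_window])
  }, numeric(1))
}

pos  <- c(100, 150, 300, 1000, 1200)  # toy CpG coordinates (bp)
meth <- c(8, 6, 7, 1, 3)              # toy methylated read counts
smoothed <- moving_average_bp(pos, meth, span = 500)
smoothed
```

In this toy input the first three sites fall in one window and the last two in another, so each group is replaced by its own mean.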

Details

The linear model is fitted by ordinary least squares on the arcsine-transformed methylation percentages. The estimated standard errors take into account the distribution of the count data and the transformation. The function is computationally efficient: fitting takes around 20 minutes for 4 million CpG sites.
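The idea of ordinary least squares on arcsine-transformed methylation levels can be sketched for a single CpG site in base R. This is only an illustration: the exact form of the transformation used by DSS is not stated here, so asin(2 * Y / N - 1) is an assumption, the data are simulated, and the count-aware standard errors described above are not reproduced.

```r
set.seed(1)
# Toy data for one CpG site: 8 samples in two groups (all values are made up)
design <- data.frame(case = factor(rep(c("ctrl", "trt"), each = 4)))
N <- rep(30, 8)                                        # total reads per sample
Y <- rbinom(8, N, prob = rep(c(0.3, 0.7), each = 4))   # methylated reads

# Assumed transformation: arcsine of the proportion rescaled to [-1, 1]
z <- asin(2 * Y / N - 1)

# Ordinary least squares via the normal equations
X <- model.matrix(~ case, design)
beta <- solve(crossprod(X), crossprod(X, z))
beta
```

The same estimates come out of lm(z ~ case, data = design); the normal equations are written out only to show what is computed for each site.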

Value

A list with the following components:

gr

An object of 'GRanges' for locations of the CpG sites.

design

The input data frame for experimental design.

formula

The input formula for the model.

X

The design matrix used in regression. It is created based on design and formula.

fit

The model fitting results. This is itself a list with three components: 'beta', the estimated coefficients; 'var.beta', the estimated variance/covariance matrices for beta; and 'phi', the estimated beta-binomial dispersion parameters. Note that var.beta for a CpG site should be an ncol(X) x ncol(X) matrix, but it is flattened to a vector so that the matrices for all CpG sites can be saved as a single matrix.
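The sketch below shows how a design matrix follows from a design data frame and a formula, and how a flattened variance/covariance vector can be reshaped back into a square matrix. It uses base R only. The factor levels, the simulated var.beta, and the one-row-per-site layout are assumptions for illustration, not output from DSS.

```r
# Hypothetical 2 x 2 design; the factor levels are made up
design <- data.frame(
  case = factor(rep(c("ctrl", "trt"), each = 4)),
  cell = factor(rep(c("A", "B"), times = 4))
)
X <- model.matrix(~ case + cell + case:cell, design)
colnames(X)   # the column order gives the coefficient indices

p <- ncol(X)
n_sites <- 3
# Stand-in for fit$var.beta: assumed one row per CpG site, p * p values per row
var.beta <- t(replicate(n_sites, as.vector(diag(runif(p)))))

# Recover the p x p variance/covariance matrix for the first site
V1 <- matrix(var.beta[1, ], nrow = p, ncol = p)
se1 <- sqrt(diag(V1))   # standard errors of the coefficients at that site
```

Because a variance/covariance matrix is symmetric, reshaping gives the same result whether the vector was flattened by row or by column.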

Author(s)

Hao Wu <hao.wu@emory.edu>

See Also

DMLtest.multiFactor, DMLtest

Examples

## Not run: 
library(DSS)
## loads the BSseq object 'RRBS' and the data frame 'design'
data(RRBS)
## model fitting
DMLfit = DMLfit.multiFactor(RRBS, design, ~case+cell+case:cell)

## with smoothing:
DMLfit.sm = DMLfit.multiFactor(RRBS, design, ~case+cell+case:cell, smoothing=TRUE)

## hypothesis testing
DMLtest.cell = DMLtest.multiFactor(DMLfit, coef=3)

## look at distributions of test statistics and p-values
par(mfrow=c(1,2))
hist(DMLtest.cell$stat, 100, main="test statistics")
hist(DMLtest.cell$pvals, 100, main="P values")

## End(Not run)

haowulab/DSS documentation built on Oct. 28, 2023, 6:59 p.m.