getLimmaInput.DF: Fit a linear model to each nucleotide

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/getLimmaInput.DF.R

Description

Fits the linear model log2(count+scalefac) = beta0 + beta1*group + beta2*library.size + [optional confounders] to each nucleotide. From these models, this function constructs an object which can be directly passed to getTstats to obtain limma's moderated t statistics for each nucleotide, which we use as a measure of strength of association between group and count (expression). Reads coverage from a DataFrame which can be generated using makeDF and relies heavily on the limma package, using lmFit as the main workhorse.

Usage

1
2
3
getLimmaInput.DF(DF, comparison = c("twogroup", "multigroup", "expression"),
  group, chunksize = 1e+05, adjustvars = NULL, colsubset = NULL,
  scalefac = 32, nonzero = FALSE)

Arguments

DF

A list containing a DF DataFrame with the coverage data and a logical Rle with the positions that passed the cutoff. This object is generated using makeDF.

comparison

Either twogroup, multigroup or expression. multigroup will use the F-statistic and expression tests the intercept-only model.

group

a 0/1 vector grouping the samples (columns) in the database.

chunksize

How many rows of the merged table should be processed at a time?

adjustvars

Optional matrix of adjustment variables (e.g. measured confounders, output from SVA, etc.) to use in fitting linear models to each nucleotide.

colsubset

Optional vector of column indices of the input file that denote samples you wish to include in analysis.

scalefac

A log transformation is used on the count tables, so zero counts present a problem. What number should we add to the entire matrix before running the models? Defaults to 32.

nonzero

If TRUE, use the median of only the nonzero counts as the library size adjustment.

Details

Note adjustvars must have the same number of rows as group has entries. Larger values of chunksize require more memory; smaller values of chunksize require more computation time.

Value

A list with elements

ebobject

A list of five vectors (coefficients, stdev.unscaled, sigma, df.residual, and Amean), mimicking the MArrayLM class in limma. Here, coefficients and stdev.unscaled are only returned for beta1, the coefficient for group, as it is assumed this is the only covariate of interest.

pos

A vector of the same length as those contained in ebobject, giving the genomic positions of each linear model.

Author(s)

Alyssa Frazee, Leonardo Collado-Torres

References

Smyth G (2004). "Linear models and empirical Bayes methods for assessing differential expression in microarray experiments." Statistical Applications in Genetics and Molecular Biology 3(1): Article 3.

See Also

getTstats, makeDF

Examples

1
## add example here when we have a vignette

leekgroup/derfinder documentation built on May 20, 2019, 11:30 p.m.