biocbox: Construct a Bioconductor-classed object from an analysis.

biocbox.FacileLinearModelDefinition    R Documentation

Construct a Bioconductor-classed object from an analysis.

Description

Construct a Bioconductor-classed object from an analysis.

Usage

## S3 method for class 'FacileLinearModelDefinition'
biocbox(
  x,
  assay_name = NULL,
  method = NULL,
  features = NULL,
  filter = "default",
  filter_universe = NULL,
  filter_require = NULL,
  with_sample_weights = FALSE,
  weights = NULL,
  block = NULL,
  prior_count = 0.1,
  ...
)

## S3 method for class 'FacileDgeAnalysisResult'
biocbox(x, cached = TRUE, ...)

Arguments

assay_name

the name of the assay to pull data for

method

the name of the dge method that will be used. This will dictate the post-processing of the data

filter

A filtering policy to remove uninteresting genes. If "default", then edgeR::filterByExpr() is used if we are materializing a DGEList, otherwise lowly expressed features are removed in a similarly "naive" manner. Alternatively, this can be a character vector that holds the names of the features that should be kept. Default value: "default".

with_sample_weights

Some methods that leverage the limma pipeline, like "voom", "limma", and "limma-trend", can use sample (array) quality weights to downweight outlier samples. In the case of method == "voom", we use limma::voomWithQualityWeights(), while the rest use limma::arrayWeights(). The choice of method determines which sample weighting function to use. Defaults to FALSE.

prior_count

The pseudo-count to add to count data. Used primarily when running the limma-trend method on count (RNA-seq) data.

...

passed down to internal modeling and filtering functions.

sample_info

a facile_frame that enumerates the samples to fetch data for, as well as the covariates used in downstream analysis
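The two weighting functions named under with_sample_weights above can be sketched directly with limma. This is only an illustration on simulated counts, not the package's internal code: the count matrix, group labels, design, and the log2(counts + 0.5) transform are all assumptions made for the example.

```r
library(limma)

set.seed(1)
# Simulated counts (an assumption for illustration): 500 genes x 6 samples.
counts <- matrix(
  rnbinom(500 * 6, mu = 50, size = 5),
  nrow = 500,
  dimnames = list(paste0("g", 1:500), paste0("s", 1:6))
)
group <- factor(rep(c("normal", "tumor"), each = 3))
design <- model.matrix(~ group)

# method == "voom": sample quality weights are estimated together with
# the voom precision weights. The result is an EList.
v <- voomWithQualityWeights(counts, design)

# Other limma-based methods: estimate one weight per sample from
# log-scale expression values (hypothetical transform for this sketch).
log_expr <- log2(counts + 0.5)
aw <- arrayWeights(log_expr, design)
```

Here aw holds one positive weight per sample, where lower values mark samples that fit the linear model less well.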

Value

a DGEList or EList with assay data in the correct place, and all of the covariates in the $samples or $targets data.frame that are required to test the model in mdef.

Linear Model Definitions

This function accepts a model defined using flm_def() and creates the appropriate Bioconductor assay container to test the model given the assay_name and dge method specified by the user.

This function currently supports retrieving data and whipping it into a DGEList (for count-like data) or an EList (for data that can be analyzed with one form of limma or another).

Assumptions on different assay_type values include:

  • rnaseq: assumed to be "vanilla" bulk rnaseq gene counts

  • umi: UMI-based counts from bulk rnaseq, like quantseq

  • tpm: TPM values. These will be log2(TPM + prior_count) transformed, then differentially tested using the limma-trend pipeline
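As a minimal sketch of the tpm transformation described above: the matrix and its values are made up for illustration, and this is plain base R rather than the package's internal code.

```r
# Hypothetical TPM matrix: 3 features (rows) x 2 samples (columns).
tpm <- matrix(
  c(0, 1, 10, 2, 4, 100),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2"))
)

# Same value as the default prior_count shown in the Usage section.
prior_count <- 0.1

# log2(TPM + prior_count): the pseudo-count keeps zero TPMs finite.
log_tpm <- log2(tpm + prior_count)
```

Without the pseudo-count, the zero entry for g1 in s1 would become -Inf after the log transform.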

TODO: support affymrna, affymirna, etc. assay types

The "filter" parameters are described in the fdge() function for now.
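The two styles of filter described in the Arguments section can be sketched with edgeR alone. This is an illustration on simulated data, not the code biocbox() runs: the counts, group labels, and feature names are assumptions for the example.

```r
library(edgeR)

set.seed(1)
# Simulated counts (an assumption for illustration): 200 genes x 6 samples.
counts <- matrix(
  rnbinom(200 * 6, mu = 20, size = 2),
  nrow = 200,
  dimnames = list(paste0("g", 1:200), paste0("s", 1:6))
)
# Make the first 50 genes lowly expressed so there is something to filter.
counts[1:50, ] <- rpois(50 * 6, lambda = 0.2)

group <- factor(rep(c("normal", "tumor"), each = 3))
y <- DGEList(counts = counts, group = group)

# "default"-style policy for a DGEList: keep what filterByExpr() keeps.
keep <- filterByExpr(y, group = group)
y_default <- y[keep, , keep.lib.sizes = FALSE]

# Character-vector policy: keep only the named features
# (hypothetical feature names).
keep_ids <- c("g51", "g52", "g53")
y_named <- y[rownames(y) %in% keep_ids, , keep.lib.sizes = FALSE]
```

Setting keep.lib.sizes = FALSE recomputes library sizes from the retained features, which is the usual choice after filtering.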

FacileDgeAnalysisResult

Given a FacileDgeAnalysisResult, we can re-materialize the Bioconductor assay container that was used in the differential testing pipeline run by fdge(). Currently we have limited our analysis framework to work over either DGEList (edgeR) or EList (limma) containers.
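A hedged usage sketch of that round trip follows. It assumes the FacileData and FacileAnalysis packages are installed and that the example dataset and the covariate names used here ("indication", "sample_type", and their levels) are available as in the package's own examples. Treat those names as assumptions rather than guarantees.

```r
# Guarded so the sketch is a no-op when the packages are not installed.
if (requireNamespace("FacileData", quietly = TRUE) &&
    requireNamespace("FacileAnalysis", quietly = TRUE)) {
  library(FacileData)
  library(FacileAnalysis)

  # Example dataset and sample subset (assumed to exist as named here).
  efds <- exampleFacileDataSet()
  samples <- filter_samples(efds, indication == "BLCA")

  # Define the model, then run the differential expression analysis.
  mdef <- flm_def(samples, covariate = "sample_type",
                  numer = "tumor", denom = "normal")
  dge <- fdge(mdef, method = "voom")

  # Re-materialize the container used inside fdge().
  y_cached <- biocbox(dge)
}
```

The returned object should be either a DGEList or an EList, depending on the method that was passed to fdge().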


facilebio/FacileAnalysis documentation built on Sept. 26, 2024, 5:13 a.m.