FacileBiocDataStore-class: Immerse bioconductor assay containers into the facile.bio...

FacileBiocDataStore-classR Documentation

Immerse bioconductor assay containers into the facile.bio ecosystem.

Description

Bioconductor assay containers, like a DGEList, DESeqDataSet, SummarizedExperiment, etc. can be used within the facie.bio ecosystem by invoking the facilitate() method on them. This will return a ⁠Facile*⁠ subclass of the container itself.

Usage

## S3 method for class 'DESeqDataSet'
facilitate(
  x,
  assay_type = "rnaseq",
  feature_type = "infer",
  organism = "unknown",
  ...,
  run_vst = NULL,
  blind = TRUE,
  nsub = 1000,
  fitType = "parametric",
  verbose = FALSE
)

## S3 method for class 'DGEList'
facilitate(
  x,
  assay_type = "rnaseq",
  feature_type = "infer",
  organism = "unknown",
  ...
)

Arguments

assay_type

A string that indicates the type of assay stored in the primary assay of the container. For some assay containers, like DESeqDataSet, DGEList, and SingleCellExperiment, we can assume the default value for this to be "rnaseq". For the rest, we assume it's "lognorm".

feature_type

A string that indicates the type of features identifiers the assay containers is using. Default is "infer" to try to guess, but this is not the most accurate.

organism

the organism the dataset is for (Homo sapiens, Mus musculus, etc.)

run_vst

should we re-run the vst transformation for a DESeqDataSet. If the DESeqDataSet already has a "vst" assay, then we'll just take that, otherwise if this isn't set to FALSE it will be run.

blind, nsub, fitType

parameters to send to DESeq2::vst() to tweak how it is run internally

verbose

make some noise

Details

For instance, facilitate(DGEList) will return a FacileDGEList, which can be used a "normal" DGEList in all the same ways, but is also wrapped with the facile api api and can be used by methods withing the FacileAnalysis, for instance.

These classes are also all subclass of the abstract FacileBiocDataStore virtual class.

DESeqDataSet

When a DESeqDataSet object is facilitate()'d, a normalized count matrix will be calculated using DESeq2::counts(x, normalized = TRUE) and stored as a matrix named "normcounts" in its assays() list. These are the values that are returned by (fetch|with)_assay_data when normalized = TRUE, which differs from the edgeR:cpm normalized count data which is usually returned from most every other expression container.

By default, these normalized counts will be log2 transformed when returned to conform to the expectation in the facilebio ecosystem. To get the deault DESeq2 behaviour, the user would use fetch_assay_data(.., normalized = TRUE, log = FALSE).

This function will also look for variance stabilized versions of the data in the "vst" and "rlog" assay matrices. If no "vst" assay is present, it will be run and stored there, unless the ⁠facilitate,run_vst⁠ parameter is set to FALSE. This data can be returned using assay_name = "vst"

DGEList

We assume the DGEList holds "rnaseq" assay data. Set the assay_type parameter if that's not the case.

Examples

# DESeq2 --------------------------------------------------------------------
dds <- DESeq2::makeExampleDESeqDataSet(n=2000, m=20)
fd <- facilitate(dds)
fetch_assay_data(samples(fd), c("gene1", "gene20"))
fetch_assay_data(samples(fd), c("gene1", "gene20"), normalized = TRUE)

samples(fd) |>
  with_assay_data(c("gene1", "gene20"), normalized = TRUE)

# Retrieiving different flavors of normalized expression data
dat <- samples(fd) |>
  with_assay_data("gene1", normalized = TRUE) |>
  with_assay_data("gene1", assay_name = "vst") |>
  select(-(1:2))
colnames(dat) <- c("normcounts", "vst")
pairs(dat)

## Not run: 
dpca <- FacileAnalysis::fpca(fd, assay_name = "vst")
FacileAnalysis::shine(dpca)

## End(Not run)
# edgeR ---------------------------------------------------------------------
y <- example_bioc_data(class = "DGEList")
yf <- facilitate(y)
FacileAnalysis::fpca(yf)

facilebio/FacileBiocData documentation built on Sept. 21, 2023, 2:18 p.m.