bcbioSingleCell: 'bcbioSingleCell' Object and Constructor

Description Usage Arguments Value Remote Data Note Author(s) See Also Examples

Description

bcbioSingleCell is an S4 class that extends SingleCellExperiment, and is designed to store a bcbio single-cell RNA-seq analysis. This class contains read counts saved as a sparse matrix (dgCMatrix), sample metadata, and cell quality control metrics.

Usage

1
2
3
4
bcbioSingleCell(uploadDir, organism = NULL, sampleMetadataFile = NULL,
  interestingGroups = "sampleName", ensemblRelease = NULL,
  genomeBuild = NULL, transgeneNames = NULL, spikeNames = NULL,
  gffFile = NULL, ...)

Arguments

uploadDir

Path to final upload directory. This path is set when running "bcbio_nextgen -w template".

organism

Organism name. Use the full latin name (e.g. "Homo sapiens"), since this will be input downstream to AnnotationHub and ensembldb, unless gffFile is set.

sampleMetadataFile

Sample barcode metadata file. Optional for runs with demultiplixed index barcodes (e.g. SureCell), but otherwise required for runs with multipliexed FASTQs containing multiple index barcodes (e.g. inDrop).

interestingGroups

Character vector of interesting groups. Must be formatted in camel case and intersect with sampleData() colnames.

ensemblRelease

Optional. Ensembl release version. If NULL, defaults to current release, and does not typically need to be user-defined. Passed to AnnotationHub for EnsDb annotation matching, unless gffFile is set.

genomeBuild

Optional. Ensembl genome build name (e.g. "GRCh38"). This will be passed to AnnotationHub for EnsDb annotation matching, unless gffFile is set.

transgeneNames

character vector indicating which assay() rows denote transgenes (e.g. EGFP, TDTOMATO).

spikeNames

character vector indicating which assay() rows denote spike-in sequences (e.g. ERCCs).

gffFile

Optional, not recommended. By default, we recommend leaving this NULL for genomes that are supported on Ensembl. In this case, the row annotations (rowRanges()) will be obtained automatically from Ensembl by passing the organism, genomeBuild, and ensemblRelease arguments to AnnotationHub and ensembldb. For a genome that is not supported on Ensembl and/or AnnotationHub, a GFF/GTF (General Feature Format) file is required. Generally, we recommend using a GTF (GFFv2) file here over a GFF3 file if possible, although all GFF formats are supported. The function will internally generate a TxDb containing transcript-to-gene mappings and construct a GRanges object containing the genomic ranges (rowRanges()).

...

Additional arguments, to be stashed in the metadata() slot.

Value

bcbioSingleCell.

Remote Data

When working in RStudio, we recommend connecting to the bcbio-nextgen run directory as a remote connection over sshfs.

Note

bcbioSingleCell extended SummarizedExperiment prior to v0.1.0, where we migrated to SingleCellExperiment.

Author(s)

Michael Steinbaugh, Rory Kirchner

See Also

Other S4 Object: coerce, extract, show, updateObject

Examples

1
2
3
4
5
6
7
8
9
uploadDir <- system.file("extdata/indrops", package = "bcbioSingleCell")
x <- bcbioSingleCell(
    uploadDir = uploadDir,
    organism = "Homo sapiens",
    sampleMetadataFile = file.path(uploadDir, "metadata.csv"),
    ensemblRelease = 87L
)
show(x)
validObject(x)

WeiSong-bio/roryk-bcbioSinglecell documentation built on July 6, 2019, 12:03 a.m.