readCellRanger: Read 10X Genomics Cell Ranger Data


View source: R/readCellRanger.R

Description

Read 10x Genomics Chromium cell counts from barcodes.tsv, genes.tsv, and matrix.mtx files.
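To make the three-file format concrete, here is a minimal sketch that writes a toy dataset and reads it back with the Matrix package. It is not the code readCellRanger() runs internally; the gene IDs, barcodes, and the directory name "mtx_demo" are made up for illustration, and the two-column genes.tsv layout (gene ID, gene symbol) is an assumption about the Cell Ranger output.

```r
library(Matrix)

# Write a tiny three-file dataset to a temporary directory (toy values).
demoDir <- file.path(tempdir(), "mtx_demo")
dir.create(demoDir, showWarnings = FALSE)
counts <- Matrix(c(0, 2, 1, 0, 0, 3), nrow = 2, ncol = 3, sparse = TRUE)
invisible(writeMM(counts, file.path(demoDir, "matrix.mtx")))
# genes.tsv: assumed to hold gene ID and symbol, one row per matrix row.
writeLines(
    c("ENSG01\tGENE1", "ENSG02\tGENE2"),
    file.path(demoDir, "genes.tsv")
)
# barcodes.tsv: one cellular barcode per matrix column.
writeLines(
    c("AAAC-1", "AAAG-1", "AAAT-1"),
    file.path(demoDir, "barcodes.tsv")
)

# Read the three files back and attach the dimnames.
mat <- readMM(file.path(demoDir, "matrix.mtx"))
genes <- read.delim(
    file.path(demoDir, "genes.tsv"),
    header = FALSE,
    stringsAsFactors = FALSE
)
barcodes <- readLines(file.path(demoDir, "barcodes.tsv"))
rownames(mat) <- genes[[1]]
colnames(mat) <- barcodes
# Convert from triplet to column-compressed storage for efficient subsetting.
mat <- as(mat, "CsparseMatrix")
```

The result is a sparse genes-by-cells matrix, which is the shape of data a SingleCellExperiment stores in assay().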

Usage

readCellRanger(uploadDir, sampleMetadataFile = NULL, refdataDir = NULL,
  interestingGroups = "sampleName", transgeneNames = NULL,
  spikeNames = NULL, ...)

Arguments

uploadDir

Path to Cell Ranger output directory. This directory path must contain filtered_gene_bc_matrices* as a child directory.

sampleMetadataFile

Sample barcode metadata file. Optional for runs with demultiplexed index barcodes (e.g. SureCell), but otherwise required for runs with multiplexed FASTQs containing multiple index barcodes (e.g. inDrop).

refdataDir

Directory path to Cell Ranger reference annotation data.

interestingGroups

Character vector of interesting groups. Must be formatted in camel case and intersect with sampleData() colnames.

transgeneNames

Character vector indicating which assay() rows denote transgenes (e.g. EGFP, TDTOMATO).

spikeNames

Character vector indicating which assay() rows denote spike-in sequences (e.g. ERCCs).

...

Additional arguments, to be stashed in the metadata() slot.

Value

SingleCellExperiment.

Directory structure

Cell Ranger can vary in its output directory structure, but we require a single, consistent structure for datasets containing multiple samples. Note that Cell Ranger output does not always contain per-sample subdirectories or the "outs" subdirectory. We may make this more flexible in the future, but for now the layout is strict to ensure reproducibility.

file.path(
    "<uploadDir>",
    "<sampleName>",
    "outs",
    "filtered_gene_bc_matrices*",
    "<genomeBuild>",
    "matrix.mtx"
)
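As a rough way to check a directory before importing it, the sketch below globs for matrix files, assuming the layout <uploadDir>/<sampleName>/outs/filtered_gene_bc_matrices*/<genomeBuild>/matrix.mtx. The helper findMatrixFiles() is hypothetical and not part of bcbioSingleCell; the sample name "sample1" and genome build "GRCh38" are placeholder values used to build a throwaway tree.

```r
# Hypothetical helper (not part of bcbioSingleCell): list the matrix.mtx
# files that match the assumed layout.
findMatrixFiles <- function(uploadDir) {
    pattern <- file.path(
        uploadDir,
        "*",                            # <sampleName>
        "outs",
        "filtered_gene_bc_matrices*",
        "*",                            # <genomeBuild>
        "matrix.mtx"
    )
    Sys.glob(pattern)
}

# Build a throwaway directory tree to try the helper on.
uploadDir <- file.path(tempdir(), "cellranger_layout_demo")
matrixDir <- file.path(
    uploadDir, "sample1", "outs", "filtered_gene_bc_matrices", "GRCh38"
)
dir.create(matrixDir, recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(matrixDir, "matrix.mtx")))

files <- findMatrixFiles(uploadDir)
# The sample directory sits four levels above matrix.mtx.
sampleNames <- basename(dirname(dirname(dirname(dirname(files)))))
```

If files comes back empty for a real run, the directory most likely lacks the per-sample or "outs" level and needs to be rearranged first.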

Sample metadata

A user-supplied sample metadata file defined by sampleMetadataFile is required for multiplexed datasets. Otherwise this can be left NULL, and minimal sample data will be used, based on the directory names.

Reference data

We strongly recommend supplying the reference data used by Cell Ranger via the refdataDir argument. When set, the function detects the organism, ensemblRelease, and genomeBuild automatically from the 10X refdataDir YAML metadata. It also converts the gene annotations defined in the GTF file into a GRanges object, which is slotted into rowRanges(). Otherwise, the function attempts to use the most current annotations available from Ensembl, and some gene IDs may not match because they have been deprecated in the current Ensembl release.
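A call that supplies the reference directory might look like the sketch below. The refdataDir path is a hypothetical placeholder, and the block is guarded so that it only runs when the package is installed and the directory exists; it uses only the arguments listed under Usage.

```r
# Hypothetical location of the Cell Ranger reference directory; replace it
# with the reference that was used for the run.
refdataDir <- file.path("path", "to", "refdata")

if (
    requireNamespace("bcbioSingleCell", quietly = TRUE) &&
    dir.exists(refdataDir)
) {
    uploadDir <- system.file(
        "extdata/cellranger",
        package = "bcbioSingleCell"
    )
    x <- bcbioSingleCell::readCellRanger(
        uploadDir = uploadDir,
        refdataDir = refdataDir
    )
    # Gene annotations derived from the reference GTF.
    print(SummarizedExperiment::rowRanges(x))
}
```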

Author(s)

Michael Steinbaugh

See Also

Other Read Functions: readCellTypeMarkers

Examples

uploadDir <- system.file("extdata/cellranger", package = "bcbioSingleCell")
x <- readCellRanger(uploadDir)
show(x)

roryk/bcbioSinglecell documentation built on May 27, 2019, 10:44 p.m.