Process raw 10X scRNA-seq data and generate UMI counts for each gene in each cell.
cellCounts(
    # input data
    index,
    sample.index,
    input.mode = "BCL",
    cell.barcode = NULL,

    # specify the aligner used for read mapping
    aligner = "align",

    # parameters used by featureCounts for assigning and counting UMIs
    annot.inbuilt = "mm10",
    annot.ext = NULL,
    isGTFAnnotationFile = FALSE,
    GTF.featureType = "exon",
    GTF.attrType = "gene_id",
    useMetaFeatures = TRUE,

    # number of threads
    nthreads = 10,

    # other parameters passed to align, subjunc and featureCounts functions
    ...)
index
A character string providing the base name of index files created for a reference genome by the buildindex function.
sample.index
A data frame containing the index set name for each sample and other sample-related information. The data frame must contain four columns with column headers named InputDirectory, Lane, SampleName and IndexSetName.
input.mode
Specify the input mode. Currently only the BCL-format input is supported ("BCL").
cell.barcode
A character string giving the name of a text file (can be gzipped) that contains the set of cell barcodes used in sample preparation. The default is NULL.
aligner
Specify the name of the aligner used for read mapping. Currently it has only one possible value, "align".
annot.inbuilt
Specify an inbuilt annotation for UMI counting. See featureCounts.
annot.ext
Specify an external annotation for UMI counting. See featureCounts.
isGTFAnnotationFile
See featureCounts.
GTF.featureType
See featureCounts.
GTF.attrType
See featureCounts.
useMetaFeatures
Specify if UMI counting should be carried out at the meta-feature level (e.g. gene level). See featureCounts.
nthreads
A numeric value giving the number of threads used for read mapping and counting.
...
Other parameters passed to the align, subjunc and featureCounts functions.
The cellCounts function takes as input raw scRNA-seq read data generated from the 10X Genomics platform.
It utilizes the read mapping and counting functions included in the Rsubread package to process the scRNA-seq data.
It calls the align function to map reads to a reference genome and calls the featureCounts function to assign reads to genes.
It performs sample demultiplexing, cell barcode demultiplexing and read deduplication before producing UMI counts for each gene in each cell.
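The workflow described above can be sketched as a thin wrapper around a single cellCounts call. This is only a sketch: it assumes the argument names shown in the usage section of this page (other Rsubread releases may use a different signature), and the wrapper name run_cellcounts, its arguments, the index base name and the file paths are hypothetical placeholders.

```r
# Sketch only: run_cellcounts and its argument names are hypothetical helpers,
# not part of Rsubread. The cellCounts arguments follow the usage on this page.
run_cellcounts <- function(index.base, sample.sheet, threads = 4) {
  if (!requireNamespace("Rsubread", quietly = TRUE)) {
    stop("The Rsubread package is required for this sketch.")
  }
  Rsubread::cellCounts(
    index         = index.base,    # base name of an index built by buildindex
    sample.index  = sample.sheet,  # data.frame describing the samples
    input.mode    = "BCL",         # only mode listed as supported on this page
    aligner       = "align",
    annot.inbuilt = "mm10",
    nthreads      = threads
  )
}

# Example call (not run; it needs a genome index and raw BCL data on disk,
# and the paths below are placeholders):
# Rsubread::buildindex(basename = "mm10_index", reference = "/path/to/mm10.fa")
# res <- run_cellcounts("mm10_index", sample.sheet)
```

Defining the call inside a function keeps the sketch loadable on machines without the input data; nothing is mapped or counted until the function is actually invoked.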
The cellCounts function can process multiple datasets stored in different directories at the same time.
Sample-related information should be provided via the sample.index parameter.
This includes the name of the index set used for each sample, the sample name, the flowcell lane on which each sample was sequenced, and the location where the sample data were saved.
All of this information should be stored in a data.frame object, which is then passed to the sample.index parameter.
Below is an example of the data.frame object provided to sample.index:
InputDirectory      Lane   SampleName   IndexSetName
/path/to/dataset1   1      Sample1      SI-GA-E1
/path/to/dataset1   1      Sample2      SI-GA-E2
/path/to/dataset1   2      Sample1      SI-GA-E1
/path/to/dataset1   2      Sample2      SI-GA-E2
/path/to/dataset2   1      Sample3      SI-GA-E3
/path/to/dataset2   1      Sample4      SI-GA-E4
/path/to/dataset2   2      Sample3      SI-GA-E3
/path/to/dataset2   2      Sample4      SI-GA-E4
...
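A table like the one above could be built in base R roughly as follows. This is a minimal sketch that reproduces only the eight rows shown; the directory paths are the placeholders from the example, and storing Lane as a plain numeric column is an assumption, since the expected column type is not stated here.

```r
# Build the example sample sheet; paths are placeholders from the table above.
sample.index <- data.frame(
  InputDirectory = rep(c("/path/to/dataset1", "/path/to/dataset2"), each = 4),
  Lane           = rep(c(1, 1, 2, 2), times = 2),  # assumed numeric
  SampleName     = c("Sample1", "Sample2", "Sample1", "Sample2",
                     "Sample3", "Sample4", "Sample3", "Sample4"),
  IndexSetName   = c("SI-GA-E1", "SI-GA-E2", "SI-GA-E1", "SI-GA-E2",
                     "SI-GA-E3", "SI-GA-E4", "SI-GA-E3", "SI-GA-E4"),
  stringsAsFactors = FALSE
)

# Check that the four required column headers are present before use.
required.columns <- c("InputDirectory", "Lane", "SampleName", "IndexSetName")
stopifnot(all(required.columns %in% colnames(sample.index)))

print(sample.index)
```

Each row pairs one sample with one lane of one dataset, so a sample sequenced on two lanes appears twice with the same index set name.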
The cellCounts function returns a List object to R, and it also outputs three gzipped FASTQ files and one BAM file for each sample.
The three gzipped FASTQ files include cell barcode and UMI sequences (R1), sample index sequences (I1) and the actual genomic sequences of the reads (R2), respectively.
The BAM file includes location-sorted read mapping results.
The returned List object contains the following components:
counts |
a |
annotation |
a |
sample.info |
a |
cell.confidence |
a |
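Because the returned object is a list with the four named components above, a quick sanity check on a result might look like the sketch below. The helper missing_components and the stand-in object mock.result are hypothetical and exist only for illustration; the NULL placeholders say nothing about what the real components contain.

```r
# Component names as listed in the Value section above.
expected.components <- c("counts", "annotation", "sample.info", "cell.confidence")

# Hypothetical helper: return the expected component names absent from `res`.
missing_components <- function(res) {
  setdiff(expected.components, names(res))
}

# Stand-in object for illustration only; a real result comes from cellCounts().
mock.result <- list(counts = NULL, annotation = NULL,
                    sample.info = NULL, cell.confidence = NULL)

missing_components(mock.result)          # character(0)
missing_components(list(counts = NULL))  # "annotation" "sample.info" "cell.confidence"
```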
Yang Liao and Wei Shi
buildindex, align, featureCounts