ReprocessedAllenData | R Documentation |
Obtain the legacy count matrices for three publicly available single-cell RNA-seq datasets. Raw sequencing data were downloaded from NCBI's SRA or from EBI's ArrayExpress, aligned to the relevant genome build and used to quantify gene expression.
ReprocessedAllenData(
assays = NULL,
ensembl = FALSE,
location = TRUE,
legacy = FALSE
)
ReprocessedTh2Data(
assays = NULL,
ensembl = FALSE,
location = TRUE,
legacy = FALSE
)
ReprocessedFluidigmData(
assays = NULL,
ensembl = FALSE,
location = TRUE,
legacy = FALSE
)
assays |
Character vector specifying one or more assays to return.
Choices are |
ensembl |
Logical scalar indicating whether the output row names should contain Ensembl identifiers. |
location |
Logical scalar indicating whether genomic coordinates should be returned. |
legacy |
Logical scalar indicating whether to pull data from ExperimentHub. By default, we use data from the gypsum backend. |
ReprocessedFluidigmData
returns a dataset of 65 human neural cells from Pollen et al. (2014),
each sequenced at high and low coverage (SRA accession SRP041736).
ReprocessedTh2Data
returns a dataset of 96 mouse T helper cells from Mahata et al. (2014),
obtained from ArrayExpress accession E-MTAB-2512.
Spike-in counts are stored in the "ERCC"
entry of the altExps
.
ReprocessedAllenData
return a dataset of 379 mouse brain cells from Tasic et al. (2016).
This is a re-processed subset of the data from TasicBrainData
,
and contains spike-in information stored as in the altExps
.
In each dataset, the first columns of the colData
are sample quality metrics from FastQC and Picard.
The remaining fields were obtained from the original study in their GEO/SRA submission
and/or as Supplementary files in the associated publication.
These two categories of colData
are distinguished by a which_qc
element in the metadata
,
which contains the names of the quality-related columns in each object.
If ensembl=TRUE
, the gene symbols are converted to Ensembl IDs in the row names of the output object.
Rows with missing Ensembl IDs are discarded, and only the first occurrence of duplicated IDs is retained.
If location=TRUE
, the coordinates of the Ensembl gene models are stored in the rowRanges
of the output.
Note that this is only performed if ensembl=TRUE
.
All data are downloaded from ExperimentHub and cached for local re-use.
Specific resources can be retrieved by searching for scRNAseq/legacy-allen
,
scRNAseq/legacy-fluidigm
or scRNAseq/legacy-th2
.
A SingleCellExperiment object containing one or more expression matrices of counts and/or TPMs,
depending on assays
.
FASTQ files were either obtained directly from ArrayExpress, or converted from SRA files (downloaded from the Sequence Read Archive) using the SRA Toolkit.
Reads were aligned with TopHat (v. 2.0.11) to the appropriate reference genome (GRCh38 for human samples, GRCm38 for mouse). RefSeq mouse gene annotation (GCF_000001635.23_GRCm38.p3) was downloaded from NCBI on Dec. 28, 2014. RefSeq human gene annotation (GCF_000001405.28) was downloaded from NCBI on Jun. 22, 2015.
featureCounts (v. 1.4.6-p3) was used to compute gene-level read counts. Cufflinks (v. 2.2.0) was used to compute gene-leve FPKMs. Reads were also mapped to the transcriptome using RSEM (v. 1.2.19) to compute read counts and TPM's.
FastQC (v. 0.10.1) and Picard (v. 1.128) were used to compute sample quality control (QC) metrics. However, no filtering on the QC metrics has been performed for any dataset.
Pollen AA et al. (2014). Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex. Nat. Biotechnol. 32(10), 1053-8.
Mahata B et al. (2014). Single-cell RNA sequencing reveals T helper cells synthesizing steroids de novo to contribute to immune homeostasis. Cell Rep, 7(4), 1130-42.
Tasic A et al. (2016). Adult mouse cortical cell taxonomy revealed by single cell transcriptomics. Nat. Neurosci. 19(2), 335-46.
sce <- ReprocessedAllenData()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.