library(BiocStyle) knitr::opts_chunk$set(error=FALSE, message=FALSE, warning=FALSE)
We obtain a single-cell RNA sequencing dataset of the mouse retina from Macosko et al. (2015).
Counts for endogenous genes and spike-in transcripts are available from the Gene Expression Omnibus
using the accession number GSE63472.
We download and cache them using the r Biocpkg("BiocFileCache")
package.
library(BiocFileCache) bfc <- BiocFileCache("raw_data", ask = FALSE) base.url <- "ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE63nnn/GSE63472/suppl/" count.file <- bfcrpath(bfc, paste0(base.url, "GSE63472_P14Retina_merged_digital_expression.txt.gz"))
We load them into memory.
library(scuttle) counts <- readSparseCounts(count.file) dim(counts)
We also download a file containing the metadata for each cell.
meta.url <- "http://mccarrolllab.com/wp-content/uploads/2015/05/retina_clusteridentities.txt" meta.file <- bfcrpath(bfc, meta.url) coldata <- read.delim(meta.file, stringsAsFactors=FALSE, header=FALSE) colnames(coldata) <- c("cell.id", "cluster") library(S4Vectors) coldata <- as(coldata, "DataFrame") coldata
We match the metadata to the columns.
m <- match(colnames(counts), coldata$cell.id) coldata <- coldata[m,] rownames(coldata) <- colnames(counts) coldata$cell.id <- NULL # redundant and unnecessary summary(is.na(m))
We now assemble the components into a SingleCellExperiment
, with some polishing to save disk space.
library(scRNAseq) sce <- SingleCellExperiment(list(counts=counts), colData=coldata) sce <- polishDataset(sce) sce
This is saved in preparation for upload:
meta <- list( title="Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets", description="Cells, the basic units of biological structure and function, vary broadly in type and state. Single-cell genomics can characterize cell identity and function, but limitations of ease and scale have prevented its broad application. Here we describe Drop-seq, a strategy for quickly profiling thousands of individual cells by separating them into nanoliter-sized aqueous droplets, associating a different barcode with each cell's RNAs, and sequencing them all together. Drop-seq analyzes mRNA transcripts from thousands of individual cells simultaneously while remembering transcripts' cell of origin. We analyzed transcriptomes from 44,808 mouse retinal cells and identified 39 transcriptionally distinct cell populations, creating a molecular atlas of gene expression for known retinal cell classes and novel candidate cell subtypes. Drop-seq will accelerate biological discovery by enabling routine transcriptional profiling at single-cell resolution.", taxonomy_id="10090", genome="GRCm38", sources=list( list(provider="GEO", id="GSE63472"), list(provider="PubMed", id="26000488"), list(provider="URL", id=meta.url, version=as.character(Sys.Date())) ), maintainer_name="Aaron Lun", maintainer_email="infinite.monkeys.with.keyboards@gmail.com" ) saveDataset(sce, "2023-12-19_output", meta)
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.