library(BiocStyle)
knitr::opts_chunk$set(error=FALSE, message=FALSE, warning=FALSE)

Download the data

We obtain a single-nucleus RNA sequencing dataset of mouse brains from @hu2017dissecting. Counts for endogenous genes and antibody-derived tags are available from the Gene Expression Omnibus using the accession number GSE106678.

library(BiocFileCache)
bfc <- BiocFileCache("raw_data", ask = FALSE)
tarred <- bfcrpath(bfc, "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE106678&format=file")

temp <- tempfile()
dir.create(temp)
untar(tarred, exdir=temp)
list.files(temp)

Processing the data

We load in each of the files.

library(scuttle)
counts <- list()
for (x in list.files(temp, full.names=TRUE)) {
    prefix <- sub("^[^_]*_", "", x)
    prefix <- sub("_.*", "", prefix)
    counts[[prefix]] <- readSparseCounts(x)
}
do.call(rbind, lapply(counts, dim))

For some unknown reason, each matrix has its own set of features! Crazy. At least the intersection is of a reasonable size.

length(Reduce(intersect, lapply(counts, rownames)))

Save for upload

We now save all of the relevant components to file for upload to r Biocpkg("ExperimentHub").

repath <- file.path("scRNAseq", "hu-cortex", "2.4.0")
dir.create(repath, showWarnings=FALSE, recursive=TRUE)
for (x in names(counts)) {
    saveRDS(counts[[x]], file=file.path(repath, sprintf("counts-%s.rds", x)))
}

Session info

sessionInfo()

References



drisso/scRNAseq documentation built on Feb. 16, 2021, 1:18 a.m.