library(BiocStyle)
knitr::opts_chunk$set(error=FALSE, message=FALSE, warning=FALSE)
We obtain a single-cell RNA sequencing dataset of human pancreas from Xin et al. (2016).
A matrix of RPKMs is provided in the Gene Expression Omnibus
under the accession GSE81608.
We download it using the BiocFileCache package to cache the results:
library(BiocFileCache)
bfc <- BiocFileCache("raw_data", ask=FALSE)
rpkm.txt <- bfcrpath(bfc,
    "ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE81nnn/GSE81608/suppl/GSE81608_human_islets_rpkm.txt.gz")
We read the RPKMs into memory as a sparse matrix.
library(scuttle)
mat <- readSparseCounts(rpkm.txt)
dim(mat)
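To illustrate why a sparse representation suits expression matrices with many zeros, here is a minimal sketch using the Matrix package. The toy values and the names `dense` and `sparse` are made up for illustration and are not part of the actual dataset.

```r
library(Matrix)

# Toy matrix with mostly zero entries (hypothetical values, not the real RPKMs).
dense <- matrix(0, nrow=4, ncol=3,
    dimnames=list(paste0("gene", 1:4), paste0("cell", 1:3)))
dense["gene1", "cell2"] <- 5.2
dense["gene3", "cell1"] <- 1.7

# Convert to a sparse matrix, which stores only the non-zero values.
sparse <- Matrix(dense, sparse=TRUE)
class(sparse)

# Fraction of entries that are non-zero.
nnzero(sparse) / prod(dim(sparse))
```

The lower this fraction, the more memory a sparse representation is likely to save relative to an ordinary dense matrix.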
We download the metadata, which was supplied by the authors to Vladimir Kiselev, Tallulah Andrews and Martin Hemberg. Annoyingly, our original source of this file is no longer available, so we'll have to load the copy from ExperimentHub.
library(ExperimentHub)
ehub <- ExperimentHub()
coldata <- ehub[["EH2700"]]
transformed <- sub("_", " ", colnames(mat))
stopifnot(identical(transformed, coldata$Sample.name)) # check consistency
coldata$Sample.name <- NULL # mostly redundant and removed.
coldata
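The consistency check above relies on `sub()`, which replaces only the first match in each string, and on `identical()`, which is sensitive to order. A minimal sketch with made-up sample names (the `ids` vector is hypothetical and does not reflect the real column names):

```r
# Hypothetical column names, for illustration only.
ids <- c("Sample_1", "Sample_2_rep")

# sub() replaces only the first underscore in each string.
first.only <- sub("_", " ", ids)

# gsub() would replace every underscore instead.
all.matches <- gsub("_", " ", ids)

# identical() requires the same values in the same order.
expected <- c("Sample 1", "Sample 2_rep")
stopifnot(identical(first.only, expected))
identical(rev(first.only), expected) # FALSE: order matters
```

Because `identical()` fails on any reordering, passing the check suggests that the metadata rows are already aligned with the matrix columns and no explicit matching step is needed.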
We do the same for the row metadata.
rowdata <- ehub[["EH2699"]]
stopifnot(identical(rownames(mat), as.character(rowdata[,1])))
rowdata <- rowdata[,-1,drop=FALSE] # redundant and removed.
rowdata
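The `drop=FALSE` argument matters when removing a column leaves only one behind. A minimal sketch with a hypothetical base R `data.frame` (the object returned by ExperimentHub may be of a different class, so this is only an analogy):

```r
# Hypothetical annotation table; the first column duplicates the feature names.
df <- data.frame(id=c("g1", "g2"), symbol=c("A", "B"))

# With drop=FALSE, the result remains a one-column data frame.
kept <- df[,-1,drop=FALSE]
is.data.frame(kept)

# Without it, the single remaining column collapses to a plain vector.
dropped <- df[,-1]
is.data.frame(dropped)
```

Keeping the table structure is what allows the result to be supplied as row metadata later on.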
Slapping everything together into a SingleCellExperiment:
library(SingleCellExperiment)
sce <- SingleCellExperiment(list(rpkm=mat), colData=coldata, rowData=rowdata)
Adding some polish to optimize for disk space:
library(scRNAseq)
sce <- polishDataset(sce)
sce
Now saving it to disk:
meta <- list(
    title="RNA Sequencing of Single Human Islet Cells Reveals Type 2 Diabetes Genes",
    description="Pancreatic islet cells are critical for maintaining normal blood glucose levels, and their malfunction underlies diabetes development and progression. We used single-cell RNA sequencing to determine the transcriptomes of 1,492 human pancreatic α, β, δ, and PP cells from non-diabetic and type 2 diabetes organ donors. We identified cell-type-specific genes and pathways as well as 245 genes with disturbed expression in type 2 diabetes. Importantly, 92% of the genes have not previously been associated with islet cell function or growth. Comparison of gene profiles in mouse and human α and β cells revealed species-specific expression. All data are available for online browsing and download and will hopefully serve as a resource for the islet research community.",
    taxonomy_id="9606",
    genome="GRCh37",
    sources=list(
        list(provider="GEO", id="GSE81608"),
        list(provider="PubMed", id="27667665"),
        list(provider="ExperimentHub", id="EH2700"),
        list(provider="ExperimentHub", id="EH2699")
    ),
    maintainer_name="Aaron Lun",
    maintainer_email="infinite.monkeys.with.keyboards@gmail.com"
)
saveDataset(sce, "2023-12-19_output", meta)
sessionInfo()