title: "tenXplore: ontology for scRNA-seq, applied to 10x 1.3 million neurons"
author: "Vincent J. Carey, stvjc at channing.harvard.edu"
date: "r format(Sys.time(), '%B %d, %Y')
"
vignette: >
%\VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{tenXplore: ontology for scRNA-seq, applied to 10x 1.3 million neurons}
%\VignetteEncoding{UTF-8}
output:
BiocStyle::html_document:
highlight: pygments
number_sections: yes
theme: united
toc: yes
```{r setupp,echo=FALSE,results="hide"} suppressWarnings({ suppressPackageStartupMessages({ library(tenXplore) library(org.Mm.eg.db) library(org.Hs.eg.db) }) })
# Introduction/Executive summary
The tenXplore package includes prototypical code to facilitate
the coding of an ontology-driven visualizer of transcriptomic
patterns in single-cell RNA-seq studies.
![dashsnap](dashboard.png)
This package is intended to illustrate the role of formal
ontology in the analysis of single-cell RNA-seq experiments.
We anticipate that both unsupervised and supervised methods
of statistical learning
will be useful.
- The [10x 1.3 million neuron dataset](https://community.10xgenomics.com/t5/10x-Blog/Our-1-3-million-single-cell-dataset-is-ready-to-download/ba-p/276) primarily
invites unsupervised analysis, as no anatomical or other
structural details are provided along with single-cell expression
measures.
- Other studies that include details on anatomic source
or other non-transcriptomic features of cellular identity
would be amenable to supervised analysis, which will
benefit from standardization of metadata about experimental
factors and anatomic structures.
## A challenge: finding expression signatures of anatomic structures or cell types
Gene Ontology and Gene Ontology Annotation are the primary
resources at present for enumerating gene signatures for
cell types. The `r Biocpkg("ontoProc")` package
assists in providing some mappings, but much work is needed.
```{r doont}
library(ontoProc)
data(allGOterms)
cellTypeToGO("serotonergic neuron", gotab=allGOterms)
cellTypeToGenes("serotonergic neuron", gotab=allGOterms, orgDb=org.Mm.eg.db)
cellTypeToGenes("serotonergic neuron", gotab=allGOterms, orgDb=org.Hs.eg.db)
At this point the API for selecting cell types, bridging to gene sets, and acquiring expression data, is not well-modularized, but extensive activity in these areas in new Bioconductor packages will foster enhancements for this application.
In brief, we often fail to find GO terms that approximately match, as strings, Cell Ontology terms corresponding to cell types and subtypes. Thus the cell type to gene mapping is very spotty and the app has nothing to show.
On the other hand, if we match on cell types, we get very large numbers of matches, which will need to be filtered. We will introduce tools for generating additional type to gene harvesting/filtering in real time.
The r Biocpkg("ontoProc")
package includes facilities
for working with a variety of ontologies, and tenXplore will
evolve to improve flexibility of selections and visualizations
afforded by the r CRANpkg("ontologyIndex")
and r CRANpkg("ontologyPlot")
packages.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.