A dataset containing alternative and reference allele read counts per cell and heterozygous variant, derived from a single-cell RNA-seq dataset of fibroblast and liver cells from crossed CAST/EiJ x C57BL/6J mouse strains. The dataset has been subsetted to 300 genes to restrict its size.
An acset with four elements, including featdata, refcount and altcount. See
new_acset for a description of these elements.
The acset contains data for 3313 variants and 336 single cells.
Allele counts were generated by alignment of the RNA-seq data to each of the
genomes of the two mouse strains and subsequently running samtools mpileup
using variants that were homozygous within each strain and differed
between the strain-genomes. Variants were filtered on being within RefSeq
genes. Variants were further filtered using the allele count data to not
monoallelically express the same allele across cells (see
filter_homovars) and on having imbalanced allelic expression in
at least 3 cells (see
filter_var_gt). Features were filtered on
having at least two such variants.
For additional filters and further details on the generation of the allele
count data see the Supplemental Data in Edsgard et al, scphaser: Haplotype
Inference Using Single-Cell RNA-Seq Data, Bioinformatics, 2016.
RNA-seq data can be found at http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75659
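The allele-count filters described above can be illustrated with a small base R sketch on toy matrices. This is not the scphaser implementation of filter_homovars or filter_var_gt: the count and allele-fraction thresholds (min_count, low_frac, high_frac), the toy data and the variant-to-gene mapping are all assumptions made for illustration. Only the "at least 3 cells" and "at least two variants" cut-offs come from the description.

```r
# Toy allele counts: rows are variants, columns are cells (made-up values).
refcount <- matrix(c(10,  0,  9,  0,
                      8,  9,  7, 10,
                      0, 12,  0,  1,
                      5,  4,  0,  6),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(paste0("var", 1:4), paste0("cell", 1:4)))
altcount <- matrix(c( 0, 11,  1,  8,
                      0,  0,  1,  0,
                      9,  0, 10,  7,
                      5,  6,  9,  4),
                   nrow = 4, byrow = TRUE, dimnames = dimnames(refcount))

# Hypothetical mapping of variants to features (genes).
feat <- c(var1 = "geneA", var2 = "geneA", var3 = "geneA", var4 = "geneB")

# Assumed thresholds for calling a cell "imbalanced" at a variant.
min_count <- 3     # minimum total reads in a cell (assumption)
low_frac  <- 0.2   # alt fraction at or below this: reference-biased (assumption)
high_frac <- 0.8   # alt fraction at or above this: alternative-biased (assumption)
min_cells <- 3     # "at least 3 cells", from the description
min_vars  <- 2     # "at least two such variants", from the description

total    <- refcount + altcount
alt_frac <- altcount / total
covered  <- total >= min_count
ref_biased <- covered & alt_frac <= low_frac
alt_biased <- covered & alt_frac >= high_frac

# Number of cells with imbalanced allelic expression, per variant.
n_imbalanced <- rowSums(ref_biased | alt_biased)

# Drop variants where every imbalanced cell favours the same allele.
both_alleles <- rowSums(ref_biased) > 0 & rowSums(alt_biased) > 0

keep_var  <- both_alleles & n_imbalanced >= min_cells
kept_vars <- rownames(refcount)[keep_var]

# Keep features that retain at least min_vars variants.
counts     <- table(feat[kept_vars])
kept_feats <- names(counts)[as.vector(counts) >= min_vars]

print(kept_vars)   # "var1" "var3"
print(kept_feats)  # "geneA"
```

In this toy example var2 is dropped because all of its imbalanced cells express the reference allele, and var4 is dropped because only one cell is imbalanced.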