knitr::opts_chunk$set(eval = FALSE) knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
In this tutorial, we are going to utilize 5' scRNA-seq data on epigenetically de-repressed cancer cell lines to quantify transposable element (TE) expression levels at single-cell and locus resolution. Following the workflow, you'll learn the specifics of Repsc to adapt it to your single-cell dataset.
We start the workflow by loading Repsc and the human hg38 BSgenome object into our R environment:
Sys.time() devtools::load_all('/net/mraid14/export/tgdata/users/davidbr/src/Repsc/') devtools::load_all('/net/mraid14/export/tgdata/users/davidbr/src/Reputils/') devtools::load_all('/net/mraid14/export/tgdata/users/davidbr/src/Repdata/') library(BSgenome.Mmusculus.UCSC.mm10)
We then import our gene and TE annotation files as GRanges objects followed by Repsc-specific curation and formatting using the curateGenes
and curateTEs
functions.
# path to Gencode gtf file (provided) gene_path <- system.file(package = 'Repdata', 'extdata', 'mm10', 'genes', 'gencode.vM22.annotation.gtf.gz') # path to RepeatMasker mm10 repeat annotation (provided) rmsk_path <- system.file(package = 'Repdata', 'extdata', 'mm10', 'tes', 'mm10.fa.out.gz') # creating the scSet sc <- createScSet(genome = Mmusculus, protocol = 'fiveprime', tes = rmsk_path, genes = gene_path)
After we have imported and curated the data, we can procede to generate the actual read/UMI count matrix using the compCounts
function.
# path to bam files containing mapped reads bam_paths <- dir('/net/mraid14/export/data/users/davidbr/proj/eseb/data/', recursive = TRUE, pattern = '_deduplicated_chr[0-9, X, Y, MT]+.bam', full.names = TRUE) hdf5_paths <- dir('/net/mraid14/export/data/users/davidbr/proj/eseb/data/', recursive = TRUE, pattern = 'filtered_gene_bc_matrices_h5.h5', full.names = TRUE) # create a data.frame specifying import parameters input_df <- data.frame(paths = bam_paths, hdf5 = rep(hdf5_paths, each = 22), paired = TRUE, # use FALSE for single-end libraries mate = 'first', # only imports the first mate of properly aligned read pairs, set to NA when using single-end libraries barcode = 'CB', # 10x barcode included in BAM flag chunk = ntile(bam_paths, 40), meta = gsub('/', '', substring(bam_paths, 56, 61)), stringsAsFactors = FALSE) head(input_df)
sc <- addCounts(sc, bams = input_df, bin_size = 25, msa_dir = NULL, use_gcluster = TRUE)
To distinguish real cells from empty droplets, we utilize the emptyDrops
function from the DropletUtils package[1].
plotCells(sc)
plotMapping(sc)
sc <- selectPeaks(sc) plotPeaks(sc)
sc <- selectFeatures(sc) plotFeatures(sc)
[1]
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.