knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

UMIcountR

Molecular Spikes

For information on obtaining molecular spikes, reference fasta and more, please visit the molecular spikes GitHub repo.

Installation

You can install UMIcountR from GitHub with:

# install.packages("devtools")
devtools::install_github("cziegenhain/UMIcountR")

Example

This is a basic example which shows you how to load and analyse some molecular spikes data:

library(UMIcountR)
## basic example code
#load reads from the provided example bam file (Smart-seq3 data)
bam_path <- system.file("extdata", "Smartseq3.TTACCTGCCAGATTCG.bam", package = "UMIcountR", mustWork = TRUE)

#in the case of the simple v1 molecular spike
spikedat <- extract_spike_dat(bam_path, match_seq_before_UMI = "GAGCCTGGGGGAACAGGTAGG", match_seq_after_UMI = "CTCGGAGGAGAAA")

#in the case of the complex molecular spikes set
data("molspike_barcodes_infos_fivep_final")
#spikedat <- extract_complex_spike_dat(bam_path, bc_df = spike_info, max_pattern_dist = 3)

After loading the data, we can see the data structure:

str(spikedat)

Next, we can run the filtering for overrespresented spUMIs:

overrep <- get_overrepresented_spikes(spikedat, readcutoff = 75)
overrep$plots[[1]]

You could also apply directional-adjacency error correction to the Smart-seq3 UMI in the test data:

spikedat[, UB_directional := return_corrected_umi(UX, editham = 1, collapse_mode = "adjacency_directional"), by = BC]

To downsample the copy number of the molecular spikes in your data to a relevant "expression level" of interest, you can use the following function:

spikedat_mean100 <- subsample_recompute(spikedat, mu_nSpikeUMI = 100, threads = 4)

Reference

Molecular spikes: a gold standard for single-cell RNA counting https://www.nature.com/articles/s41592-022-01446-x



cziegenhain/UMIcountR documentation built on May 30, 2022, 5:38 p.m.