PAC_saturation: Sequence diversity/saturation analysis of a PAC object

View source: R/PAC_saturation.R

PAC_saturation    R Documentation

Sequence diversity/saturation analysis of a PAC object

Description

PAC_saturation performs a sequence diversity/saturation analysis on a PAC object.

Usage

PAC_saturation(PAC, resample = 10, steps = 10, thresh = c(1, 10), threads = 1)

Arguments

PAC

PAC-list object containing a Counts data.frame with sequences as row names and samples as column names.

resample

Integer setting the number of permutations at each percentage step (default=10).

steps

Integer defining the number of percentage steps between 0-100% of the original dataset (default=10).

thresh

Integer vector containing mean count thresholds that will be targeted. Default is set to c(1,10), where each new occurrence reaching 1 count (>=1) and each new occurrence reaching 10 counts (>=10) will be analyzed.

threads

Number of cores to be used for performing the permutations.
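
As a rough illustration of how resample, steps and thresh relate to each other, the base R sketch below downsamples a toy count vector at evenly spaced percentages and counts how many sequences reach each threshold. The data, the variable names and the way the percentages are spaced are assumptions made for this example; it is not the code used inside PAC_saturation and it does not use threads.

```r
# Toy counts per sequence (hypothetical data, not taken from seqpac)
set.seed(1)
counts <- c(seqA = 50, seqB = 20, seqC = 5, seqD = 2, seqE = 1, seqF = 1)

# Expand to one entry per read, labelled by its sequence
reads <- rep(names(counts), times = counts)

steps    <- 4         # number of percentage steps
resample <- 3         # permutations at each step
thresh   <- c(1, 10)  # count thresholds to track

# Percentages of the original depth (assumed evenly spaced, 0 dropped)
perc <- seq(0, 100, length.out = steps + 1)[-1]

# One row per combination of step, permutation and threshold
res <- expand.grid(perc = perc, rep = seq_len(resample), thresh = thresh)

# Draw reads without replacement and count sequences reaching the threshold
res$n_seqs <- mapply(function(p, r, t) {
  n   <- round(length(reads) * p / 100)
  sub <- sample(reads, size = n, replace = FALSE)
  sum(table(sub) >= t)
}, res$perc, res$rep, res$thresh)

# Mean number of detected sequences per step and threshold
aggregate(n_seqs ~ perc + thresh, data = res, FUN = mean)
```

At 100% every read is drawn, so that step always recovers all six toy sequences at >=1 and the two most abundant ones at >=10. The lower percentages vary between permutations, which is why several resamples are averaged.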

Details

Given a PAC object, the function performs a sequence saturation analysis. This is done by downsampling the original dataset by permutation at different percentages of the original dataset. The closer the curve at the original sequence depth (100%) is to the plateau, the more saturated is the diversity of sequences for the original dataset. Approaching the plateau usually means that the sequencing depth of the library has sampled the full population of sequences available in the sample. Here we use a non-linear least squares (nls) model with a self-starter for asymptotic regression (SSasymp) to describe the rate at which the library approaches the plateau.
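
The asymptotic regression mentioned above can be sketched with the stats functions nls() and SSasymp() on a toy saturation curve. The data are simulated and the column names x and y are assumptions for illustration only; how PAC_saturation prepares its data and reports the fit may differ.

```r
# Toy saturation curve (simulated values, not seqpac output):
# x = percent of original sequencing depth,
# y = number of sequences detected at that depth
set.seed(42)
x <- seq(10, 100, by = 10)
y <- 1000 - 950 * exp(-0.04 * x) + rnorm(length(x), sd = 5)
dat <- data.frame(x = x, y = y)

# SSasymp models y = Asym + (R0 - Asym) * exp(-exp(lrc) * x)
# and supplies its own starting values (self-starter)
fit <- nls(y ~ SSasymp(x, Asym, R0, lrc), data = dat)
coef(fit)

# Estimated plateau and the fitted value at the original depth (100%)
asym    <- coef(fit)[["Asym"]]
pred100 <- predict(fit, newdata = data.frame(x = 100))

# Fraction of the estimated plateau reached at full depth
pred100 / asym
```

A ratio close to 1 indicates that the curve has nearly reached its plateau at the original depth. A clearly lower ratio suggests that deeper sequencing would still uncover new sequences.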

Value

A list of ggplot2 graph objects: the 1st graph shows the saturation/diversity result at the 1st threshold, the 2nd graph shows the saturation/diversity result at the 2nd threshold, etc.

See Also

https://github.com/Danis102 for updates on the current package.

Other PAC analysis: PAC_covplot(), PAC_deseq(), PAC_filter(), PAC_filtsep(), PAC_gtf(), PAC_jitter(), PAC_mapper(), PAC_nbias(), PAC_norm(), PAC_pca(), PAC_pie(), PAC_sizedist(), PAC_stackbar(), PAC_summary(), PAC_trna(), as.PAC(), filtsep_bin(), map_rangetype(), tRNA_class()

Examples



# Note: the example below uses already down-sampled data. Still, sequence
# diversity is fairly saturated at >=1 occurrence, meaning that most sequences
# in the samples have been caught. Nonetheless, sequences reaching >=2
# occurrences have not plateaued.

load(system.file("extdata", "drosophila_sRNA_pac_filt_anno.Rdata", 
                  package = "seqpac", mustWork = TRUE))

plot_lst  <- PAC_saturation(pac, resample=10, steps=10, 
                            thresh=c(1,2), threads=1)
names(plot_lst)
cowplot::plot_grid(plotlist=plot_lst)


Danis102/seqpac documentation built on Aug. 26, 2023, 10:15 a.m.