PAC_filter: Filter a PAC object on sequence size and coverage

View source: R/PAC_filter.R

PAC_filter    R Documentation

Filter a PAC object on sequence size and coverage

Description

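PAC_filter filters PAC objects on sequence size and coverage, and can subset them by sample (Pheno) and annotation (Anno) targets.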

Usage

PAC_filter(
  PAC,
  size = NULL,
  threshold = 0,
  coverage = 0,
  norm = "counts",
  subset_only = FALSE,
  stat = FALSE,
  pheno_target = NULL,
  anno_target = NULL
)

Arguments

PAC

PAC-list object containing an Anno data.frame with sequences as row names and a Counts table with raw counts or counts per million (cpm).

size

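Integer vector giving the size interval, as c(min,max), of the sequences that should be kept. The default is NULL (see Usage); the original note "default=c(min,max)" suggests that the full size range of the input is then retained.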

threshold

Integer giving the threshold in counts or normalized counts that needs to be reached for a sequence to be included (default=0).

coverage

Integer giving the percent of independent samples that need to reach the threshold for a sequence to be included (default=0).

norm

Character specifying if filtering should be done using "counts", "cpm" or another normalized data table in PAC$norm (default="counts").

subset_only

Logical specifying whether only subsetting by pheno_target and/or anno_target should be done. If subset_only=FALSE (default), both subsetting and the other filters are applied.

stat

(optional) Logical specifying if a coverage graph should be generated and if users should be prompted prior to proceeding. (default=FALSE).

pheno_target

(optional) List with two objects: the 1st is a character vector naming the target column in Pheno, the 2nd is a character vector of the target group(s) in that column. (default=NULL)

anno_target

(optional) List with two objects: the 1st is a character vector naming the target column in Anno, the 2nd is a character vector of the target type(s)/biotype(s) in that column. (default=NULL)
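The two-element list structure of pheno_target and anno_target can be illustrated with a minimal base R sketch. The toy tables, sample names and sequence names below are hypothetical stand-ins for the Pheno and Anno tables of a PAC object, and the subsetting shown only illustrates the selection such lists describe; it is not the seqpac implementation.

```r
# Toy stand-ins for the Pheno and Anno tables (hypothetical data).
pheno <- data.frame(batch = c("Batch1", "Batch2", "Batch3"),
                    row.names = c("S1", "S2", "S3"))
anno <- data.frame(Size = c(22L, 22L, 30L),
                   row.names = c("seqA", "seqB", "seqC"))  # placeholder names

# Target lists: 1st object is the column name, 2nd the values to keep.
pheno_target <- list("batch", c("Batch1", "Batch2"))
anno_target  <- list("Size", "22")

# Base R illustration of the selection these lists describe
# (an assumption about the intent, not the package's own code).
keep_samples <- rownames(pheno)[pheno[[pheno_target[[1]]]] %in% pheno_target[[2]]]
keep_seqs <- rownames(anno)[
  as.character(anno[[anno_target[[1]]]]) %in% anno_target[[2]]
]

print(keep_samples)
print(keep_seqs)
```

Here the samples in Batch1 and Batch2 and the sequences annotated with Size 22 would be retained.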

Details

Given a PAC object, the function extracts the sequences that fall within the given size interval and that reach the threshold in at least the given percentage (coverage) of independent samples.
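The combined size, threshold and coverage criteria can be sketched in base R on a toy counts table. This is a minimal illustration under stated assumptions: the counts matrix and its sequences are hypothetical, and "reach the threshold" is taken to mean a count greater than or equal to the threshold. It does not reproduce the internal code of PAC_filter.

```r
# Toy counts table: rows are sequences, columns are samples (hypothetical data).
counts <- matrix(c(10, 0, 7, 6,
                    1, 2, 0, 3,
                    5, 5, 5, 5),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("ACGTACGTACGT", "ACGT", "ACGTACGTACGTACGT"),
                                 paste0("S", 1:4)))

size      <- c(10, 80)  # keep sequences of 10-80 nt
threshold <- 5          # minimum count a sample must reach
coverage  <- 50         # percent of samples that must reach the threshold

# Size filter: sequence length is taken from the row names.
seq_len_nt <- nchar(rownames(counts))
size_ok <- seq_len_nt >= size[1] & seq_len_nt <= size[2]

# Coverage filter: percent of samples at or above the threshold
# (">=" is an assumption about how "reached" is defined).
pct_reached <- rowMeans(counts >= threshold) * 100
cov_ok <- pct_reached >= coverage

# Keep only the rows that pass both filters.
filtered <- counts[size_ok & cov_ok, , drop = FALSE]
print(filtered)
```

In this toy case the 4 nt sequence is dropped by the size filter, while the other two pass both the size interval and the 50% coverage requirement.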

Value

A list of objects: the PAC object with filtered data and, optionally, a coverage plot.

See Also

https://github.com/Danis102 for updates on the current package.

Other PAC analysis: PAC_covplot(), PAC_deseq(), PAC_filtsep(), PAC_gtf(), PAC_jitter(), PAC_mapper(), PAC_nbias(), PAC_norm(), PAC_pca(), PAC_pie(), PAC_saturation(), PAC_sizedist(), PAC_stackbar(), PAC_summary(), PAC_trna(), as.PAC(), filtsep_bin(), map_rangetype(), tRNA_class()

Examples

load(system.file("extdata", "drosophila_sRNA_pac_filt_anno.Rdata", 
                 package = "seqpac", mustWork = TRUE))

###--------------------------------------------------------------------- 
## Extracts all sequences between 10-80 nt in length with at 
## least 5 counts in at least 20% of all samples.
pac_lowfilt <- PAC_filter(pac, size=c(10,80), threshold=5,
                         coverage=20, norm = "counts",
                         pheno_target=NULL, anno_target=NULL)

###--------------------------------------------------------------------- 
## Extracts sequences with 22 nt size and the samples in Batch1 and Batch2.
pac_subset <- PAC_filter(pac, subset_only = TRUE,
                        pheno_target=list("batch", c("Batch1", "Batch2")), 
                        anno_target=list("Size", "22"))

###--------------------------------------------------------------------- 
## Extracts all sequences with >=5 counts in 100% of samples within a stage
filtsep <- PAC_filtsep(pac, norm="counts", threshold=5, 
                      coverage=100, pheno_target= list("stage"))

pac_filt <- PAC_filter(pac, subset_only = TRUE,
                     anno_target= unique(do.call("c", as.list(filtsep))))




Danis102/seqpac documentation built on Aug. 26, 2023, 10:15 a.m.