preprocessStrandTable – remove low quality libraries and contigs before attempting to build a genome

Description

preprocessStrandTable – remove low quality libraries and contigs before attempting to build a genome

Usage

## S4 method for signature 'StrandFreqMatrix'
preprocessStrandTable(strandTable,
  strandTableThreshold = 0.8, filterThreshold = 0.8,
  orderMethod = "libsAndConc", lowQualThreshold = 0.9, verbose = TRUE,
  minLib = 10)

Arguments

strandTable

data.frame containing the strand table to use as input

strandTableThreshold

threshold at which to call a contig WW or CC rather than WC

filterThreshold

maximum number of libraries a contig can be NA or WC in

orderMethod

the method used to order contigs. Currently "libsAndConc" is the only option. Set to FALSE to not order contigs based on library quality

lowQualThreshold

background threshold at which to toss an entire library. If NULL, function will not make an overall assessment of library quality. Very chimeric assemblies can appear low quality across all libraries.

verbose

if TRUE, messages are written to the terminal

minLib

minimum number of libraries a contig must be present in to be included in the output
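
The roles of strandTableThreshold and minLib described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: it assumes strand frequencies lie between -1 and 1, that positive values indicate Watson reads (the sign convention is an assumption), and that NA marks a contig with no usable reads in a library. The helper callStrandState and the toy matrix are hypothetical and exist only for illustration.

```r
# Toy strand-frequency matrix: rows are contigs, columns are libraries.
# Assumption: values lie in [-1, 1]; NA means no usable reads for that contig/library.
freq <- matrix(
  c(0.95, -0.90,  0.10,
    0.85,  0.05, -0.92,
      NA,    NA,  0.88),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("contigA", "contigB", "contigC"),
                  c("lib1", "lib2", "lib3"))
)

# Hypothetical helper (not part of the package): call each cell WW, CC or WC.
# Frequencies at or beyond +/- threshold are called homozygous; the rest WC.
callStrandState <- function(freq, threshold = 0.8) {
  calls <- matrix(NA_character_, nrow(freq), ncol(freq),
                  dimnames = dimnames(freq))
  calls[!is.na(freq) & freq >=  threshold] <- "WW"
  calls[!is.na(freq) & freq <= -threshold] <- "CC"
  calls[!is.na(freq) & abs(freq) < threshold] <- "WC"
  calls
}

calls <- callStrandState(freq, threshold = 0.8)

# minLib-style filter (illustrative): keep contigs with a call in at least
# minLib libraries. A tiny value is used here because the toy data has 3 libraries.
minLib <- 2
keep <- rowSums(!is.na(calls)) >= minLib
filtered <- calls[keep, , drop = FALSE]

print(filtered)
```

In this toy data contigC has a call in only one library, so it is dropped by the minLib-style filter, while contigA and contigB are retained.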

Value

A list of one matrix and three quality data.frames: 1: a matrix of WW/WC/CC calls for all contigs; 3: the quality of libraries used (based on frequencies outside expected ranges); 4: a data.frame of libraries that are of low quality and therefore excluded from analysis; 5: contigs that are present as WC in more libraries than expected. These contigs are excluded from the strandStateMatrix, but are potentially worth investigating for chimerism.

Examples

data("exampleStrandFreq")

strandStates <- preprocessStrandTable(exampleStrandFreq, lowQualThreshold=0.8)

show(strandStates[[1]]) # WW-WC-CC matrix