SelectSingularBlastALN: Filter a blast result table for alignment overlapping other...

View source: R/SelectSingularBlastALN.R

Description

Filter a blast result table imported with readBlast to remove alignments that largely overlap other alignments (e.g. over more than 50% of their length).

Usage

SelectSingularBlastALN(aln, rl, threshold = 0.5)

Arguments

aln

A tibble obtained with the readBlast function (containing strand info)

rl

A ReadLength table, i.e. a data frame with 2 columns: ReadName and ReadLength

threshold

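Number in the half-open interval ]0,1], i.e. greater than 0 and at most 1 (default: 0.5). It is a fraction, not a percentage: alignments that overlap other alignments over more than this fraction of their length are removed.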
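
To make the meaning of threshold concrete, here is a minimal base-R sketch of one way an overlap fraction could be computed. It is not the NanoBAC implementation: the toy coordinates, the single-read setting and the pairwise definition (the largest share of an alignment covered by any one other alignment) are all assumptions made for illustration.

```r
# Toy alignments on a single read (hypothetical coordinates, not NanoBAC data)
aln <- data.frame(
  start = c(1, 150, 400, 900),
  end   = c(200, 220, 600, 1000)
)
threshold <- 0.5

# Width of each alignment (inclusive coordinates)
width <- aln$end - aln$start + 1

# Pairwise overlap widths: ovl[i, j] = number of positions shared by i and j
ovl <- outer(aln$end, aln$end, pmin) - outer(aln$start, aln$start, pmax) + 1
ovl[ovl < 0] <- 0   # negative values mean the two alignments do not overlap
diag(ovl) <- 0      # ignore the overlap of an alignment with itself

# Row i is divided by width[i]: fraction of alignment i covered by alignment j
frac <- ovl / width

# Largest fraction of each alignment covered by any single other alignment
max_frac <- apply(frac, 1, max)

# Keep alignments whose overlap fraction does not exceed the threshold
keep <- as.character(which(max_frac <= threshold))
keep
```

In this toy case the second alignment shares 51 of its 71 positions with the first one (about 72%) and is dropped, while the first alignment shares only 51 of its 200 positions (about 26%) and is kept.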

Value

A character vector with the row numbers of the alignments to keep (i.e. those that do not overlap other alignments over more than threshold of their length).
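
Because the row numbers are returned as character strings, it can be convenient to convert them to integers, for instance to retrieve the removed alignments as well. The sketch below uses a hypothetical data frame and a hypothetical keep vector standing in for the function output; it only assumes that the strings are row positions.

```r
# Hypothetical table and hypothetical result (not produced by NanoBAC)
tab <- data.frame(
  ReadName = c("r1", "r1", "r2"),
  Score    = c(10, 20, 30),
  stringsAsFactors = FALSE
)
keep <- c("1", "3")

# Convert the character row numbers to integer positions
idx <- as.integer(keep)

kept    <- tab[idx, ]    # alignments retained by the filter
removed <- tab[-idx, ]   # alignments discarded by the filter
```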

Author(s)

Pascal GP Martin

References

The code for this function is based on a suggestion by Michael Lawrence: https://support.bioconductor.org/p/72656/

See Also

readBlast

Examples

library(NanoBAC)

## Example dataset. The vector has a repetitive region that systematically gives multiple alignments
Path2OVL <- system.file("extdata", "BAC02_BlastVector.res", package = "NanoBAC")
alignmt <- readBlast(Path2OVL)
nrow(alignmt)

## Read lengths
Path2ReadLength <- system.file("extdata", "BAC02_ReadLength.tsv", package = "NanoBAC")
ReadLengthTable <- read.table(Path2ReadLength,
                              sep = "\t", header = FALSE,
                              stringsAsFactors = FALSE,
                              col.names = c("ReadName", "ReadLength"))

## Filter the data to keep only alignments that do not overlap others on more than 50% of their length
filtaln <- alignmt[SelectSingularBlastALN(alignmt, ReadLengthTable), ]
nrow(filtaln)  # many alignments removed

pgpmartin/NanoBAC documentation built on Dec. 11, 2020, 9:51 a.m.