remove_matches: Helper function to remove reads matched to filter libraries

View source: R/filter_host.R

Within the filter_host() function, we align our sequencing sample to all filter libraries of interest. The remove_matches() function then removes any target reads that also aligned to one or more of those filter libraries.
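The filtering idea described above can be illustrated with a minimal base-R sketch. It is not MetaScope's implementation: the `alignments` data frame and its read names are hypothetical toy data standing in for the contents of reads_bam, and only the shape of `read_names` (a list with one character vector per filter library) follows this page.

```r
# Toy alignments standing in for the contents of reads_bam (hypothetical data).
alignments <- data.frame(
  qname = c("read1", "read2", "read3", "read4", "read5"),
  rname = c("taxA", "taxA", "taxB", "taxB", "taxC"),
  stringsAsFactors = FALSE
)

# One character vector of read names per filter library.
read_names <- list(c("read2", "read4"), c("read4", "read5"))

# Pool the names across all filter libraries and drop duplicates.
filter_qnames <- unique(unlist(read_names))

# Keep only alignments whose query name matched no filter library.
filtered <- alignments[!(alignments$qname %in% filter_qnames), , drop = FALSE]
print(filtered$qname)
```

A read is dropped if it appears in any element of the list, so a read that matched several filter libraries (here "read4") is handled the same way as one that matched only once.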


remove_matches(reads_bam, read_names, output, YS, threads, aligner, make_bam)



reads_bam: The name of a merged, sorted .bam file that has previously been aligned to a reference library. Likely, the output from running an instance of align_target().


read_names: A list of target query names from reads_bam that have also aligned to a filter reference library. Each list element should be a vector of read names.


output: The name of the .bam or .rds file to which the filtered alignments will be written.


YS: yieldSize, an integer. The number of alignments to be read in from the bam file at once for the creation of an intermediate fastq file. Default is 1000000.
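To show what reading a fixed number of alignments at a time means in practice, here is a small base-R sketch of chunked filtering. It is an illustration under stated assumptions rather than the package's code: the read names, the tiny `YS` value, and the `kept` accumulator are all hypothetical, and a plain character vector stands in for the bam file.

```r
# Hypothetical small yield size so the chunking is visible on toy data.
YS <- 3L
qnames <- paste0("read", 1:8)            # stand-in for alignments in the bam file
filter_qnames <- c("read2", "read7")     # reads that matched a filter library

# Assign each alignment to a consecutive chunk of at most YS elements.
chunk_id <- ceiling(seq_along(qnames) / YS)
chunks <- split(qnames, chunk_id)

# Process one chunk at a time, keeping only reads that matched no filter library.
kept <- character(0)
for (chunk in chunks) {
  kept <- c(kept, chunk[!(chunk %in% filter_qnames)])
}
print(kept)
```

Working chunk by chunk bounds how many alignments are held in memory at once, which is the usual reason for exposing a yield size on large bam files.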


threads: The number of threads to be used in filtering the bam file.


aligner: The aligner that was used to create the bam file.


make_bam: Logical, whether to also output a bam file with host reads filtered out. An rds file will be created instead if FALSE. Default is FALSE.


This function is not intended for direct use.


Depending on the value of make_bam, either the name of a filtered, sorted .bam file written to the user's current working directory, or an RDS file containing a data frame with only the information required to run metascope_id().


# readPath <- system.file("extdata", "subread_target.bam",
#                         package = "MetaScope")

## Assume that the first 10 query names aligned to the first filter library
## and another 10 aligned to the second filter library
# qnames <- Rsamtools::scanBam(readPath)[[1]]$qname
# read_names <- list(qnames[1:10], qnames[31:40])
# out <- "subread_target.filtered.bam"

# remove_matches(readPath, read_names, out, YS = 1000, threads = 1,
#                aligner = "subread", make_bam = FALSE)

