filter_host_bowtie: Align reads against one or more filter libraries and...

View source: R/filter_host.R

filter_host_bowtie    R Documentation

Align reads against one or more filter libraries and subsequently remove mapped reads


After a sample is aligned to a target library with align_target_bowtie(), we may use filter_host_bowtie() to remove unwelcome host contamination using filter reference libraries. This function takes as input the name of the .bam file produced via align_target_bowtie(), and produces a sorted .bam file with any reads that match the filter libraries removed. This resulting .bam file may be used downstream for further analysis.


filter_host_bowtie(
  reads_bam,
  lib_dir = NULL,
  libs,
  make_bam = FALSE,
  output = paste(tools::file_path_sans_ext(reads_bam), "filtered", sep = "."),
  bowtie2_options = NULL,
  YS_1 = 1e+06,
  YS_2 = 1e+05,
  threads = 8,
  overwrite = FALSE
)



reads_bam: The name of a merged, sorted .bam file that has previously been aligned to a reference library. Likely the output of align_target_bowtie().

lib_dir: Path to the directory that contains the filter Bowtie2 index files.

libs: The basename of the filter libraries (without the .bt2 or .bt2l extension).

make_bam: Logical, whether to also output a bam file with host reads filtered out. If FALSE, an rds file is created instead. Default is FALSE.

output: The desired name of the output .bam or .rds file. The extension is set automatically according to make_bam. Default is the name of reads_bam without its extension + .filtered + extension.

bowtie2_options: Optional: Additional parameters that filter_host_bowtie() passes on to Bowtie2. To see all the available parameters, use Rbowtie2::bowtie2_usage(). The defaults are the parameters that PathoScope 2.0 uses. NOTE: Pass all parameters as one string; if any optional parameters are given, the user is responsible for entering every parameter to be used by Bowtie2. NOTE: The only parameter that should NOT be specified here is the number of threads.

YS_1: yieldSize, an integer. The number of alignments to be read in from the bam file at once for the creation of an intermediate fastq file. Default is 1000000.

YS_2: yieldSize, an integer. The number of alignments to be read in from the bam file at once for the creation of a filtered bam file. Smaller chunks are generally needed for this step, so it is better to keep 'YS_2' smaller than 'YS_1' to conserve memory. Default is 100000.

threads: The number of threads available to the function. Default is 8.

overwrite: Whether existing files should be overwritten. Default is FALSE.
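The one-string requirement for bowtie2_options can be sketched as follows. The flags below are ordinary Bowtie2 options chosen only for illustration; they are an assumption here and are not claimed to be the package defaults.

```r
# Illustrative Bowtie2 flags only; not claimed to be the package defaults
opts <- c("--very-sensitive-local", "-k 100", "--score-min L,20,1.0")

# The argument expects a single string, so collapse the pieces with spaces
bowtie2_options <- paste(opts, collapse = " ")

# The thread count belongs in the `threads` argument, so check that
# -p / --threads has not slipped into the option string
stopifnot(!grepl("(^|[[:space:]])(-p|--threads)([[:space:]]|$)", bowtie2_options))

print(bowtie2_options)

# The string would then be supplied alongside the other arguments, e.g.:
# filter_host_bowtie(reads_bam = bamPath, lib_dir = lib_temp, libs = "filter",
#                    bowtie2_options = bowtie2_options, threads = 1)
```

Building the string from a character vector keeps each option on its own line for review while still handing the function one string.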


Alternatively, with make_bam = FALSE, the function writes a data frame to an RDS file. This output is smaller, is created more efficiently (through parallelization), and is still compatible with metascope_id().


The name of a filtered, sorted .bam file written to the user's current working directory or, if make_bam = FALSE, the name of an RDS file containing a data frame with only the information required to run metascope_id().
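To see what file name to expect, the default output expression from the usage can be evaluated on its own. This is a minimal sketch: the input path is hypothetical, and the way the extension is appended (".bam" or ".rds" pasted onto the stem) is an assumption based on the argument descriptions rather than on the package source.

```r
# Hypothetical input path; any merged, sorted .bam file name behaves the same way
reads_bam <- file.path("results", "sample1.bam")

# Default stem as written in the usage: drop the extension, then add ".filtered"
output <- paste(tools::file_path_sans_ext(reads_bam), "filtered", sep = ".")

# Assumption: the extension is chosen according to make_bam
make_bam <- FALSE
output_file <- paste0(output, if (make_bam) ".bam" else ".rds")

print(output_file)
```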


#### Filter reads from bam file that align to any of the filter libraries

## Assuming a bam file has already been created with align_target_bowtie()

## Create a temporary directory to store the filter library
ref_temp <- tempfile()
dir.create(ref_temp)

## Create a temporary directory to store the filter library index files
lib_temp <- tempfile()
dir.create(lib_temp)

## Create a temporary directory to store the filtered bam file
align_temp <- tempfile()
dir.create(align_temp)

## Create object with path to previously created bam file
bamPath <- system.file("extdata", "bowtie_target.bam", package = "MetaScope")

## Create object with path to the filter library
refPath <- system.file("extdata","filter.fasta", package = "MetaScope")

## Move the filter library to the temporary reference directory
file.copy(from = refPath, to = file.path(ref_temp, "filter.fasta"))

## Create the bowtie index files in the temporary index directory
mk_bowtie_index(ref_dir = ref_temp, lib_dir = lib_temp, lib_name = "filter")

## Filter reads from the bam file that align to the filter library
filter_host_bowtie(reads_bam = bamPath, lib_dir = lib_temp,
                   libs = "filter", threads = 1)
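The call above returns a file name, so a natural next step is to inspect the result. The helper below is a hypothetical convenience, not part of MetaScope, and it assumes the returned name is a usable path that ends in the actual file extension.

```r
# Hypothetical helper (not part of MetaScope): load the result if it is an RDS
# file, otherwise hand back the .bam path unchanged for downstream tools
inspect_filtered <- function(filtered_path) {
  stopifnot(file.exists(filtered_path))
  if (grepl("\\.rds$", filtered_path, ignore.case = TRUE)) {
    readRDS(filtered_path)
  } else {
    filtered_path
  }
}

# Usage, assuming the return value of the call above is stored first:
# filtered_path <- filter_host_bowtie(reads_bam = bamPath, lib_dir = lib_temp,
#                                     libs = "filter", threads = 1)
# str(inspect_filtered(filtered_path))
```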

compbiomed/MetaScope documentation built on Aug. 9, 2022, 10:41 a.m.