length_filter: Read length filtering.

View source: R/length_filter.R

length_filter    R Documentation

Read length filtering.

Description

This function provides multiple options for filtering reads according to their length. The read lengths to keep are either specified by the user or automatically selected on the basis of the trinucleotide periodicity of the reads mapping on the CDS.

Usage

length_filter(
  data,
  sample = NULL,
  length_filter_mode = "periodicity",
  periodicity_threshold = 50,
  length_range = NULL,
  output_class = "datatable",
  txt = FALSE,
  txt_file = NULL
)

Arguments

data

Either a list of data tables or a GRangesList object from bamtolist, bedtolist or duplicates_filter.

sample

Character string or character string vector specifying the name of the sample(s) to process. Default is NULL, i.e. all samples are processed.

length_filter_mode

Either "periodicity" or "custom". It specifies how read length selection should be performed. "periodicity": only read lengths satisfying a periodicity threshold (see periodicity_threshold) are kept. It ensures the removal of all reads with low or no periodicity; "custom": only read lengths specified by the user are kept (see length_range). Default is "periodicity".

periodicity_threshold

Integer in [10, 100]. Only read lengths satisfying this threshold (i.e. read lengths for which a percentage of read extremities higher than the threshold falls in one of the three reading frames along the CDS) are kept. This parameter is considered only if length_filter_mode is set to "periodicity". Default is 50.

length_range

Integer or integer vector specifying one read length or a range of read lengths to keep, respectively. This parameter is considered only if length_filter_mode is set to "custom".

output_class

Either "datatable" or "granges". It specifies the format of the output, i.e. a list of data tables or a GRangesList object. Default is "datatable".

txt

Logical value specifying whether to write statistics on the filtering step to a txt file. Similar information is displayed by default in the console. Default is FALSE.

txt_file

Character string specifying the path, name and extension (e.g. "PATH/NAME.extension") of the plain text file where statistics on the filtering step should be written. If the specified folder doesn't exist, it is automatically created. If NULL (the default), the information is written to "length_filtering.txt", saved in the working directory. This parameter is considered only if txt is TRUE.
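
To illustrate the idea behind the "periodicity" mode (see length_filter_mode and periodicity_threshold), the base R sketch below computes, for each read length, the percentage of reads whose extremity falls in the most populated reading frame, and keeps only the lengths above a threshold. It is not the riboWaltz implementation. The toy data, the column names (length, end5_from_cds_start), the use of the 5' extremity only and the strict comparison with the threshold are all assumptions made for the example.

```r
# Toy data (hypothetical): one row per read mapping on a CDS.
# end5_from_cds_start is the distance of the read 5' extremity
# from the first nucleotide of the CDS.
reads <- data.frame(
  length = c(rep(28, 10), rep(31, 9)),
  end5_from_cds_start = c(0, 3, 6, 9, 12, 15, 18, 21, 1, 2,  # 28 nt reads
                          0, 1, 2, 3, 4, 5, 6, 7, 8)         # 31 nt reads
)

# Reading frame (0, 1 or 2) of each read extremity.
reads$frame <- reads$end5_from_cds_start %% 3

# Percentage of reads falling in the most populated frame.
dominant_frame_pct <- function(frame) {
  counts <- table(factor(frame, levels = 0:2))
  100 * max(counts) / sum(counts)
}

# One percentage per read length.
pct_by_length <- tapply(reads$frame, reads$length, dominant_frame_pct)

# Keep only the read lengths whose percentage exceeds the threshold
# (strict comparison is an assumption of this sketch).
periodicity_threshold <- 50
kept_lengths <- as.integer(
  names(pct_by_length)[pct_by_length > periodicity_threshold]
)
filtered <- reads[reads$length %in% kept_lengths, ]
```

In this toy data set, 8 of the 10 reads of 28 nucleotides share the same frame, while the reads of 31 nucleotides are evenly spread over the three frames, so only the 28-nucleotide reads pass a threshold of 50.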

Value

A list of data tables or a GRangesList object.

Examples

data(reads_list)

## Keep reads of length between 27 and 30 nucleotides (inclusive):
filtered_list <- length_filter(reads_list, length_filter_mode = "custom",
                               length_range = 27:30)

## Keep reads of lengths satisfying a periodicity threshold (70%):
filtered_list <- length_filter(reads_list, length_filter_mode = "periodicity",
                               periodicity_threshold = 70)
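
The remaining arguments can be combined in the same way. The sketch below uses only the arguments documented above and assumes that riboWaltz (and GenomicRanges, for the "granges" output) is installed and that the elements of reads_list are named after the samples. The variable names first_sample and stats_file and the use of a temporary directory are choices made for this example only.

```r
library(riboWaltz)
data(reads_list)

## Process only the first sample and return a GRangesList object:
first_sample <- names(reads_list)[1]
filtered_gr <- length_filter(reads_list, sample = first_sample,
                             length_filter_mode = "custom",
                             length_range = 27:30,
                             output_class = "granges")

## Write the statistics on the filtering step to a plain text file:
stats_file <- file.path(tempdir(), "length_filtering.txt")
filtered_list <- length_filter(reads_list, length_filter_mode = "periodicity",
                               periodicity_threshold = 70,
                               txt = TRUE, txt_file = stats_file)
```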

LabTranslationalArchitectomics/riboWaltz documentation built on Jan. 17, 2024, 12:18 p.m.