duplicates_filter: Duplicates filtering.
In LabTranslationalArchitectomics/riboWaltz: Optimization of ribosome P-site positioning in ribosome profiling data

duplicates_filter

R Documentation

Duplicates filtering.

Description

This function provides multiple options for removing duplicated reads: when two or more reads are marked as duplicates, all but one of them are discarded.

Usage

duplicates_filter(
  data,
  sample = NULL,
  extremity = "both",
  keep = "shortest",
  output_class = "datatable",
  txt = FALSE,
  txt_file = NULL
)

Arguments

`data`	Either a list of data tables or a GRangesList object from `bamtolist`, `bedtolist` or `length_filter`.
`sample`	Character string or character string vector specifying the name of the sample(s) to process. Default is NULL, i.e. all samples are processed.
`extremity`	Either "both", "5end" or "3end". It specifies the criterion used to define which reads are considered duplicates. Reads are marked as duplicates if they map on the same transcript and share: both the 5' extremity and the 3' extremity ("both"), only the 5' extremity ("5end"), or only the 3' extremity ("3end"). For "5end" and "3end", reads of different lengths can be marked as duplicates; see `keep` to choose which one is kept. Default is "both".
`keep`	Either "shortest" or "longest". It specifies whether to keep the shortest or the longest read when duplicates display different lengths. This parameter is considered only if `extremity` is set to "5end" or "3end". Default is "shortest".
`output_class`	Either "datatable" or "granges". It specifies the format of the output i.e. a list of data tables or a GRangesList object. Default is "datatable".
`txt`	Logical value specifying whether to write statistics on the filtering step to a txt file. Similar information is displayed by default in the console. Default is FALSE.
`txt_file`	Character string specifying the path, name and extension (e.g. "PATH/NAME.extension") of the plain text file where statistics on the filtering step should be written. If the specified folder doesn't exist, it is automatically created. If NULL (the default), the information is written to "duplicates_filtering.txt", saved in the working directory. This parameter is considered only if `txt` is TRUE.
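
The duplicate criteria described for `extremity` and `keep` can be illustrated with a minimal data.table sketch. The helper `sketch_dedup` is hypothetical and is not riboWaltz's implementation; it only assumes a data table with columns `transcript`, `end5`, `end3` and `length`, like the one built in the Examples.

```r
library(data.table)

# Hypothetical helper (not part of riboWaltz): keeps one read per duplicate group.
sketch_dedup <- function(dt, extremity = "both", keep = "shortest") {
  # Columns that must match for two reads to count as duplicates
  key_cols <- switch(extremity,
                     "both" = c("transcript", "end5", "end3"),
                     "5end" = c("transcript", "end5"),
                     "3end" = c("transcript", "end3"),
                     stop("extremity must be \"both\", \"5end\" or \"3end\""))
  # Sort a copy by read length so the read to keep comes first in each group
  sorted <- copy(dt)
  setorderv(sorted, "length", order = if (keep == "shortest") 1L else -1L)
  # unique() keeps the first row of each group defined by key_cols
  unique(sorted, by = key_cols)
}

# Same coordinates as the dataset used in the Examples
reads <- data.table(transcript = "ENSMUST00000000001.4",
                    end5 = c(92, 92, 92, 94, 94, 95),
                    end3 = c(119, 119, 122, 122, 123, 123))
reads[, length := end3 - end5 + 1]

dedup_both <- sketch_dedup(reads, extremity = "both")
dedup_5end_short <- sketch_dedup(reads, extremity = "5end", keep = "shortest")
dedup_3end_long <- sketch_dedup(reads, extremity = "3end", keep = "longest")
```

With "both", `keep` has no effect in this sketch because reads sharing both extremities necessarily have the same length.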

Value

A list of data tables or a GRangesList object, depending on `output_class`, containing the reads kept after filtering.
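
When the output is a list of data tables, one way to check the effect of the filter is to compare row counts between the input and the returned list. The sketch below is illustrative only: `filter_summary` is a hypothetical helper, the column layout is an assumption and not the format of the statistics that riboWaltz reports, and both lists are assumed to share the same sample names. The filtered list is mimicked with `unique()` so the block runs without riboWaltz.

```r
library(data.table)

# Hypothetical helper (not part of riboWaltz): per-sample read counts
# before and after filtering, for lists of data tables.
filter_summary <- function(before, after) {
  rbindlist(lapply(names(before), function(samp) {
    n_in <- nrow(before[[samp]])
    n_out <- nrow(after[[samp]])
    data.table(sample = samp,
               reads_in = n_in,
               reads_out = n_out,
               pct_removed = 100 * (n_in - n_out) / n_in)
  }))
}

# Toy input: one sample with six reads, two of which are identical
before <- list(Samp_example = data.table(
  transcript = "ENSMUST00000000001.4",
  end5 = c(92, 92, 92, 94, 94, 95),
  end3 = c(119, 119, 122, 122, 123, 123)))

# Stand-in for the filtered list; unique() here mimics extremity = "both"
after <- lapply(before, unique, by = c("transcript", "end5", "end3"))

summary_dt <- filter_summary(before, after)
```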

Examples

## Generate an ad hoc dataset:
library(data.table)
dt <- data.table(transcript = rep("ENSMUST00000000001.4", 6),
                 end5 = c(92, 92, 92, 94, 94, 95),
                 end3 = c(119, 119, 122, 122, 123, 123)
                 )[, length := end3 - end5 + 1
                   ][, cds_start := 14
                    ][, cds_stop := 1206]
example_reads_list <- list()
example_reads_list[["Samp_example"]] <- dt

## Reads are duplicates if they share both the 5' extremity and the
## 3' extremity:
filtered_list <- duplicates_filter(example_reads_list,
                                   extremity = "both")

## Reads are duplicates if they share only the 5' extremity. Among duplicated
## reads we keep the shortest one:
filtered_list <- duplicates_filter(example_reads_list,
                                   extremity = "5end",
                                   keep = "shortest")

LabTranslationalArchitectomics/riboWaltz documentation built on Feb. 25, 2025, 10:17 p.m.
