OneCodeToFindThemAll: OneCodeToFindThemAll parser of RepeatMasker annotations

View source: R/annotations.R

OneCodeToFindThemAll    R Documentation

OneCodeToFindThemAll parser of RepeatMasker annotations

Description

OneCodeToFindThemAll parser of RepeatMasker annotations

Usage

OneCodeToFindThemAll(
  gr,
  dictionary = NULL,
  fuzzy = FALSE,
  strict = FALSE,
  insert = -1,
  BPPARAM = SerialParam(progressbar = TRUE)
)

Arguments

gr

A GRanges object with RepeatMasker annotations from AnnotationHub

dictionary

(Default NULL) When NULL, a dictionary is built from the names of the repeats. Otherwise, a user-created data.frame of LTR - internal region equivalences, in which the first column holds the name of the internal region and the second column holds the corresponding LTR(s). When an internal region has more than one LTR, these should be separated by ":".
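As an illustration of the layout described for dictionary, the following minimal sketch builds such a data.frame in base R. The repeat names and the column names (internal, ltr) are hypothetical; only the column order and the ":" separator come from the description above.

```r
# Hypothetical repeat names and column names, for illustration only.
# First column: internal region; second column: its LTR(s), ":"-separated.
dictionary <- data.frame(
  internal = c("ELEMENT1_I", "ELEMENT2_I"),
  ltr      = c("ELEMENT1_LTR", "ELEMENT2_LTRa:ELEMENT2_LTRb"),
  stringsAsFactors = FALSE
)

# Split the second column on ":" to list the LTRs of each internal region
ltrs <- strsplit(dictionary$ltr, ":", fixed = TRUE)
names(ltrs) <- dictionary$internal
```

A data.frame of this shape could then be supplied through the dictionary argument in place of the default NULL.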

fuzzy

(Default FALSE) A logical; if TRUE, the search for equivalences between internal parts and LTRs to reconstruct LTR class transposable elements is less stringent, allowing more matches between corresponding subparts. This option can increase the proportion of false positives (incorrectly reconstructed LTR class TEs).

strict

(Default FALSE) A logical; if TRUE, the 80-80 rule is applied, i.e. only copies with more than 80% identity to the reference and more than 80 bp long are reported.
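The effect of the 80-80 rule can be sketched on a toy table. This is not the package's internal code: it assumes the rule means more than 80% identity to the reference and a length of more than 80 bp, and the table and its column names are hypothetical.

```r
# Hypothetical table of TE copies; names and columns are illustrative only.
copies <- data.frame(
  name     = c("copyA", "copyB", "copyC"),
  identity = c(95.2, 78.0, 88.1),  # percent identity to the reference
  length   = c(350L, 400L, 60L),   # copy length in bp
  stringsAsFactors = FALSE
)

# Keep only copies that pass both thresholds (assumed 80-80 rule)
keep <- copies$identity > 80 & copies$length > 80
strict_copies <- copies[keep, ]
```

Here only copyA is retained: copyB fails the identity threshold and copyC fails the length threshold.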

insert

(Default -1) An integer. When insert < 0, two fragments are assembled if the distance separating their furthest extremities is less than twice the reference length of the element. When insert > 0, two fragments are assembled if the distance between their closest extremities is less than or equal to insert. When insert = 0, two fragments are assembled only if they are directly adjacent to each other.
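The three cases of insert can be summarised in a small decision function. This is a sketch under stated assumptions, not the parser's actual implementation: should_assemble and ref_length are hypothetical names, coordinates are taken as 1-based closed intervals, overlapping fragments are treated as being in contact, and checks such as matching sequence, strand and repeat name are left out.

```r
# Hypothetical helper, not part of the package.
# Decides whether two fragments (start1-end1 and start2-end2) would be
# assembled for a given value of insert.
should_assemble <- function(start1, end1, start2, end2, insert, ref_length) {
  if (insert < 0) {
    # Distance between the furthest extremities of the two fragments
    span <- max(end1, end2) - min(start1, start2)
    return(span < 2 * ref_length)
  }
  # Number of bases between the closest extremities
  # (0 when adjacent, negative when overlapping: an assumption of this sketch)
  gap <- max(start1, start2) - min(end1, end2) - 1
  gap <= insert
}

adjacent  <- should_assemble(1, 100, 101, 200, insert = 0, ref_length = 500)
separated <- should_assemble(1, 100, 151, 200, insert = 0, ref_length = 500)
within50  <- should_assemble(1, 100, 151, 200, insert = 50, ref_length = 500)
by_length <- should_assemble(1, 100, 151, 200, insert = -1, ref_length = 500)
```

With these inputs, adjacent, within50 and by_length are TRUE, while separated is FALSE because the fragments are 50 bp apart and insert = 0 requires contact.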

BPPARAM

See ?bplapply in the BiocParallel package. Can be used to run calculations in parallel.

Details

Implementation of One code to find them all (Bailly-Bechet et al. 2014). Parses RepeatMasker annotations from UCSC by assembling fragments from the same transposable element (TE) that are close enough to each other, as determined by the insert parameter. For TEs from the LTR class, the parser tries to reconstruct full-length TEs when possible, or partial TEs otherwise, following the LTR - internal region - LTR structure. Equivalences between internal regions and flanking LTRs can be set by the user with the dictionary parameter or can be inferred by the parser. In the latter case, the fuzzy parameter determines the level of stringency when searching for LTR - internal region equivalences.

Value

A GRangesList object.

References

Bailly-Bechet et al. "One code to find them all": a perl tool to conveniently parse RepeatMasker output files. Mobile DNA. 2014;5(1):1-15. DOI: https://doi.org/10.1186/1759-8753-5-13

Examples

## Not run: 
library(atena)

## Parse the dm6 RepeatMasker annotations with OneCodeToFindThemAll
rmskoc <- annotaTEs(genome="dm6", parsefun=OneCodeToFindThemAll,
                    fuzzy=FALSE, strict=FALSE)

## End(Not run)


functionalgenomics/atena documentation built on Nov. 4, 2024, 7:33 p.m.