packFinder    R Documentation

packFinder: a package for the de novo Annotation of Pack-TYPE Transposable Elements

Description

Algorithm and tools for in silico Pack-TYPE transposon discovery. Filters a given genome for properties characteristic of DNA transposons and provides tools for investigating the returned matches.

Main Algorithm

The goal of packFinder is to provide a simple tool for predicting potential Pack-TYPE elements. packFinder uses the following prior knowledge, provided by the user, to detect transposons:

  • Terminal Inverted Repeat (TIR) Base Sequence

  • Length of Target Site Duplication (TSD)

  • Length of the Transposon

These features provide enough information to detect autonomous and Pack-TYPE elements. For a transposon to be predicted by packFinder, its two TSD sequences must be identical to each other, its forward TIR sequence must match the base sequence provided, and its reverse TIR sequence must match the reverse complement of that base sequence.
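
The criteria above can be illustrated with a minimal base R sketch. This is not the packSearch implementation: the helpers reverseComplement and isCandidateElement are hypothetical, the toy sequence is made up, and the sketch assumes exact matching on plain character strings with the TSDs placed directly outside the TIRs.

```r
# Reverse complement of a DNA string (A/C/G/T only), using base R.
reverseComplement <- function(x) {
    bases <- rev(strsplit(x, "", fixed = TRUE)[[1]])
    paste(chartr("ACGT", "TGCA", bases), collapse = "")
}

# Hypothetical helper (not part of packFinder): test one candidate element.
# start/end are 1-based positions of the element, TIRs included;
# elementLength is the allowed c(min, max) element width.
isCandidateElement <- function(genome, start, end, tirSeq,
                               tsdLength, elementLength) {
    tirLength <- nchar(tirSeq)
    width <- end - start + 1

    # The element must fit the length range and leave room for both TSDs
    if (width < elementLength[1] || width > elementLength[2]) return(FALSE)
    if (start - tsdLength < 1 || end + tsdLength > nchar(genome)) return(FALSE)

    forwardTir <- substr(genome, start, start + tirLength - 1)
    reverseTir <- substr(genome, end - tirLength + 1, end)
    leftTsd <- substr(genome, start - tsdLength, start - 1)
    rightTsd <- substr(genome, end + 1, end + tsdLength)

    forwardTir == tirSeq &&
        reverseTir == reverseComplement(tirSeq) &&
        leftTsd == rightTsd
}

# Toy sequence: flank, TSD, forward TIR, captured DNA, reverse TIR, TSD, flank
tirSeq <- "CACTACAA"
genome <- paste0(
    "GG", "ATT", tirSeq, "GGGTTTCCCA",
    reverseComplement(tirSeq), "ATT", "CC"
)

isCandidateElement(genome, start = 6, end = 31, tirSeq = tirSeq,
                   tsdLength = 3, elementLength = c(20, 40))
```

The call returns TRUE for the toy sequence; changing either flanking TSD, or either TIR, makes it return FALSE.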

Transposons are therefore predicted by searching a given genome for these characteristics, and further analysis steps can reveal the nature of the predicted elements. While the packFinder tool is sensitive for the detection of transposons, it does not discriminate between autonomous and Pack-TYPE elements. Autonomous elements contain a transposase gene between the terminal inverted repeats and tend to be larger than their Pack-TYPE counterparts; Pack-TYPE elements instead capture sections of host genomes. Following cluster analysis, BLAST can be used to discern which predicted elements are autonomous (transposase-containing) and which are true Pack-TYPE elements.

Workflow

An example of a standard workflow can be found using browseVignettes(package = "packFinder"). The primary functions include:

  • packSearch - the packSearch algorithm uses simple pattern matching to detect DNA transposons.

  • packClust - VSEARCH is used for clustering elements based on sequence similarity.
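
A call to the two primary functions might look like the sketch below. The argument names (tirSeq, Genome, elementLength, tsdLength), the example dataset arabidopsisThalianaRefseq and the parameter values are quoted from memory of the package examples and should be treated as assumptions; check ?packSearch and ?packClust before relying on them. The code is guarded so that it does nothing when packFinder is not installed.

```r
# Guarded so the script is a no-op when packFinder is not installed.
if (requireNamespace("packFinder", quietly = TRUE)) {
    # Example Arabidopsis thaliana sequence assumed to ship with the package
    data("arabidopsisThalianaRefseq", package = "packFinder")

    # Search for elements flanked by the chosen TIR and a 3 bp TSD
    # (the TIR sequence and length range are illustrative values)
    packMatches <- packFinder::packSearch(
        tirSeq = Biostrings::DNAString("CACTACAA"),
        Genome = arabidopsisThalianaRefseq,
        elementLength = c(300, 3500),
        tsdLength = 3
    )
    print(head(packMatches))

    # Clustering needs a local VSEARCH installation, so it is left commented out:
    # packClusts <- packFinder::packClust(packMatches, arabidopsisThalianaRefseq)
}
```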

Having obtained the sequences of transposable elements in a given genome, it is recommended to carry out a BLAST search for each transposon cluster. This can identify which elements are likely autonomous, and which may be Pack-TYPE.

The packFinder functions report the position of elements in a given genome using a dataframe in the format of packMatches. This dataframe is in the format produced by coercing a GRanges object (from the GenomicRanges package) to a dataframe: data.frame(GRanges).
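
As a rough illustration of that layout, the sketch below builds a dataframe with the core columns that coercing a GRanges object produces (seqnames, start, end, width, strand). The coordinates and the sequence name "Chr3" are made up, and real packMatches output may carry additional columns. The GenomicRanges comparison runs only if that package is installed.

```r
# Hypothetical matches; the coordinates below are made up for illustration.
matches <- data.frame(
    seqnames = factor(c("Chr3", "Chr3")),
    start = c(1001L, 52010L),
    end = c(1800L, 53209L),
    strand = factor(c("+", "-"), levels = c("+", "-", "*"))
)

# Width is inclusive of both end points, as in GRanges
matches$width <- matches$end - matches$start + 1L
matches <- matches[, c("seqnames", "start", "end", "width", "strand")]
print(matches)

# Optional: build the same ranges as a GRanges object and coerce it back.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
    suppressPackageStartupMessages(library(GenomicRanges))
    gr <- GRanges(
        seqnames = as.character(matches$seqnames),
        ranges = IRanges(start = matches$start, end = matches$end),
        strand = as.character(matches$strand)
    )
    print(as.data.frame(gr))
}
```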

Author(s)

Jack Gisby

See Also

packSearch


jackgisby/packFinder documentation built on July 19, 2022, 2:25 a.m.