packAlign: Global Alignment with VSEARCH

View source: R/packAlign.R

packAlign    R Documentation

Global Alignment with VSEARCH

Description

Performs a global pairwise alignment of Pack-TYPE elements by sequence similarity. It may be useful to run packClust first to identify groups of similar transposable elements, and then generate an alignment for each group.

Usage

packAlign(
  packMatches,
  Genome,
  identity = 0,
  threads = 1,
  identityDefinition = 2,
  maxWildcards = 0.05,
  saveFolder,
  vSearchPath = "vsearch"
)

Arguments

packMatches

A dataframe of potential Pack-TYPE transposable elements, in the format given by packSearch. This dataframe is in the format produced by coercing a GRanges object to a dataframe: data.frame(GRanges). The matches will be saved as a FASTA file for VSEARCH.
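As a rough illustration of the layout described above, the sketch below builds a small dataframe by hand with the columns that coercing a GRanges object yields (seqnames, start, end, width, strand). The object name exampleMatches and all coordinates are invented for illustration; real packSearch output may carry additional columns.

```r
# Hypothetical example data; the coordinates are made up for illustration.
exampleMatches <- data.frame(
    seqnames = factor(c("Chr3", "Chr3")),
    start = c(1000L, 5000L),
    end = c(1999L, 6499L),
    strand = factor(c("+", "-"), levels = c("+", "-", "*"))
)

# width follows from the inclusive start/end coordinates
exampleMatches$width <- exampleMatches$end - exampleMatches$start + 1L

# Reorder the columns to match the order used by a coerced GRanges object
exampleMatches <- exampleMatches[, c("seqnames", "start", "end", "width", "strand")]
```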

Genome

A DNAStringSet object containing sequences referred to in packMatches (the object originally used to predict the transposons with packSearch).

identity

The minimum sequence identity between two transposable elements in packMatches required for them to be grouped into a cluster (defaults to 0).

threads

The number of threads to be used by VSEARCH.

identityDefinition

The pairwise identity definition used by VSEARCH. Defaults to 2, the standard VSEARCH definition.

maxWildcards

The maximal allowable proportion of wildcards in the sequence of each match (defaults to 0.05).
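To make the threshold concrete, here is a minimal sketch of how a wildcard proportion could be computed and compared against 0.05. The helper wildcardProportion is hypothetical and not part of packFinder. Two further assumptions are made: any character other than A, C, G or T counts as a wildcard, and the threshold is inclusive. The package's own filtering (see filterWildcards) may differ on both points.

```r
# Hypothetical helper, not part of packFinder: the proportion of characters
# in a sequence that are not A, C, G or T (case-insensitive).
wildcardProportion <- function(sequence) {
    bases <- strsplit(toupper(sequence), "")[[1]]
    if (length(bases) == 0) {
        return(0)
    }
    mean(!bases %in% c("A", "C", "G", "T"))
}

# Invented example sequences
sequences <- c(clean = "ACGTACGTAC", noisy = "ACGTNNACGT")
proportions <- vapply(sequences, wildcardProportion, numeric(1))

# Assumed inclusive threshold, mirroring the default maxWildcards = 0.05
keep <- proportions <= 0.05
```

In this sketch the "noisy" sequence has two wildcards in ten bases, so it would be excluded at the default threshold.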

saveFolder

The folder in which to save output files (uc, blast6out, FASTA).

vSearchPath

When the package is run on Windows systems, the location of the VSEARCH executable file must be given; this should be left as default on Linux/MacOS systems.
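Before calling packAlign, it may help to check whether the default command name can be resolved at all. The sketch below uses base R's Sys.which for this. The variable names are illustrative, and the check is a convenience rather than something packAlign is known to do itself.

```r
# Sys.which() returns an empty string when the command is not on the PATH.
vsearchLocation <- unname(Sys.which("vsearch"))
vsearchFound <- nzchar(vsearchLocation)

if (vsearchFound) {
    message("VSEARCH found at: ", vsearchLocation)
} else {
    message("VSEARCH not found on the PATH; supply vSearchPath explicitly.")
}
```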

Value

Saves alignment information, including uc, blast6out and pairwise alignment FASTA files, to the specified location. Returns the uc summary file generated by the alignment.

Note

In order to align sequences using VSEARCH, the executable file must first be installed.

Author(s)

Jack Gisby

References

VSEARCH may be downloaded from https://github.com/torognes/vsearch, along with a manual documenting the program's parameters. See https://www.ncbi.nlm.nih.gov/pubmed/27781170 for further information.

See Also

tirClust, packClust, readBlast, readUc, filterWildcards, packSearch

Examples

data(arabidopsisThalianaRefseq)
data(packMatches)

# packAlign run on a Linux/MacOS system
## Not run: 
    packAlign(packMatches, arabidopsisThalianaRefseq)

## End(Not run)

# packAlign run on a Windows system
## Not run: 
    packAlign(packMatches, arabidopsisThalianaRefseq,
            vSearchPath = "path/to/vsearch/vsearch.exe")

## End(Not run)


jackgisby/packFinder documentation built on July 19, 2022, 2:25 a.m.