packBlast | R Documentation |
Run BLAST against user-specified databases of non-transposon and transposon-related proteins. Can be used to classify transposons based on their internal sequences.
packBlast(
  packMatches, Genome, blastPath, protDb, autoDb,
  minE = 0.001, blastTask = "blastn-short", maxHits = 100, threads = 1,
  saveFolder = NULL, tirCutoff = 100,
  autoCutoff = 1e-05, autoLength = 150, autoIdentity = 70, autoScope = NULL,
  protCutoff = 1e-05, protLength = 250, protIdentity = 70, protScope = 0.3
)
packMatches |
A dataframe of potential Pack-TYPE transposable elements,
in the format given by |
Genome |
A DNAStringSet object containing sequences referred to
in |
blastPath |
Path to the BLAST+ executable, or name of the BLAST+ application for Linux/MacOS users. |
protDb |
For assigning Pack-TYPE elements.
Path to the blast database containing nucleotide or protein
sequences to be matched against internal transposon
sequences. Can be generated
using BLAST+, or with
|
autoDb |
For assigning autonomous elements.
Path to the blast database containing nucleotide or protein
sequences to be matched against internal transposon
sequences. Can be generated
using BLAST+, or with
|
minE |
Blast results with e values greater than the specified cutoff will be ignored. This will be passed to BLASTN and applied to both transposon and non-transposon matches. |
blastTask |
Type of BLAST+ task, defaults to "blastn-short". |
maxHits |
Maximum hits returned by BLAST+ per query. |
threads |
Allowable number of threads to be utilised by BLAST+. |
saveFolder |
Directory to save BLAST+ results in; defaults to the working directory. |
tirCutoff |
How many bases to ignore at the terminal ends of the transposons to prevent hits to TIR sequences. |
autoCutoff |
Blast results for transposon-related elements will be filtered to ignore those with e values above the specified cutoff. |
autoLength |
Blast results for transposon-related elements containing hits with alignment lengths lower than this value will be ignored. |
autoIdentity |
Blast results for transposon-related elements containing hits with sequence identities lower than this value will be ignored. |
autoScope |
If specified, transposon-related blast results with a scope below the specified value will be ignored. Note that the dataframe of transposon matches must also be supplied to calculate scope. Scope is the proportion of the transposon's internal sequence occupied by the BLAST hit. |
protCutoff |
Blast results for genic/other matches will be filtered to ignore those with e values above the specified cutoff. |
protLength |
Blast results for genic/other matches containing hits with alignment lengths lower than this value will be ignored. |
protIdentity |
Blast results for genic/other matches containing hits with sequence identities lower than this value will be ignored. |
protScope |
If specified, genic/other blast matches with a scope below the specified value will be ignored. Note that the dataframe of transposon matches must also be supplied to calculate scope. Scope is the proportion of the transposon's internal sequence occupied by the BLAST hit. |
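The cutoff, length, identity and scope arguments described above can be read as successive filters on a table of BLAST hits. The following is a minimal sketch of that idea in base R, not the package's implementation: the function `filterHits`, the column names of `hits`, and the `internalWidth` vector are all hypothetical and chosen only for illustration.

```r
# Hypothetical hit table; the column names are assumptions, not packBlast output.
hits <- data.frame(
  query_id = c("te1", "te1", "te2", "te3"),
  evalue   = c(1e-20, 1e-03, 1e-30, 1e-10),
  length   = c(400, 300, 120, 500),
  identity = c(85, 90, 95, 60),
  stringsAsFactors = FALSE
)

# Hypothetical widths of each element's internal sequence
# (the element with its terminal regions already excluded).
internalWidth <- c(te1 = 1000, te2 = 800, te3 = 2000)

# Illustrative filter mirroring the prot* arguments and their defaults.
filterHits <- function(hits, internalWidth, cutoff = 1e-05, minLength = 250,
                       minIdentity = 70, minScope = 0.3) {
  # Scope: proportion of the internal sequence occupied by the hit.
  hits$scope <- unname(hits$length / internalWidth[hits$query_id])

  keep <- hits$evalue <= cutoff &
    hits$length >= minLength &
    hits$identity >= minIdentity

  # Scope filtering is optional; NULL skips it.
  if (!is.null(minScope)) {
    keep <- keep & hits$scope >= minScope
  }
  hits[keep, , drop = FALSE]
}

protHits <- filterHits(hits, internalWidth)
```

With these made-up values only the first `te1` hit passes every filter; the others fail on e value, alignment length and identity respectively. Passing `minScope = NULL` skips the scope filter, in the same way that `autoScope = NULL` is described as optional above.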
Returns the original packMatches dataframe, with the addition of a "classification" column containing one of the following values:
auto - elements that match known transposases or transposon-related proteins are classified as autonomous elements
pack - elements that match other proteins or genic sequences may be classified as Pack-TYPE elements
other - elements that generate no significant hits
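The three classification values can be illustrated with a short sketch. This is one possible reading of the rules above, not the package's code: the helper `classifyElements` is hypothetical, and giving "auto" precedence over "pack" when an element has both kinds of hit is an assumption that this page does not state.

```r
# Hypothetical helper: assign a class to each element from the identifiers
# of elements that retained transposon-related (auto) or genic/other (prot)
# hits after filtering.
classifyElements <- function(elementIds, autoIds, protIds) {
  # Start from "other": no significant hits.
  classification <- rep("other", length(elementIds))
  # Elements matching other proteins or genic sequences.
  classification[elementIds %in% protIds] <- "pack"
  # Assumption: a transposon-related hit overrides a genic hit.
  classification[elementIds %in% autoIds] <- "auto"
  classification
}

classes <- classifyElements(
  elementIds = c("te1", "te2", "te3"),
  autoIds    = "te2",
  protIds    = c("te1", "te2")
)
```

Here `te1` is labelled "pack", `te2` is labelled "auto" despite also having a genic hit, and `te3` is labelled "other".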
Jack Gisby
For further information, see the NCBI BLAST+ application documentation and help pages (https://www.ncbi.nlm.nih.gov/pubmed/20003500?dopt=Citation).
blastAnalysis, packSearch, readBlast, blastAnnotate
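## Not run: 
# data() loads each dataset into the workspace under its own name;
# its return value is only the dataset name, so it is not assigned.
data(packMatches)
data(arabidopsisThalianaRefseq)
Genome <- arabidopsisThalianaRefseq

packBlast(packMatches, Genome,
          protDb = "C:/data/TAIR10_CDS",
          autoDb = "C:/data/TAIR10_transposons",
          blastPath = "C:/blast/bin/blastn.exe")
## End(Not run)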