The BLAST file must originate from blastn with the following output format option:
-outfmt "6 qseqid sseqid sacc stitle sscinames staxids sskingdoms sblastnames pident slen length mismatch gapopen qstart qend sstart send evalue bitscore"
It is very important that the columns are in this precise order and no column is missing.
For the formatting options see:
What the function does:
Group all GenBank accessions
Obtain taxonomy from GenBank (note the GenBank taxonomy is now in the PR2 database after downloading from ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/new_taxdump/)
Merge back into the BLAST file
Compute a summary with best hit,
The summary file contains several sets of columns
The top hit (columns with prefix hit_top_)
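The steps listed above can be sketched in base R on toy data. This is only an illustration, not the package's implementation: the `taxonomy` table and its `species` column are hypothetical stand-ins for the local taxonomy lookup, and "best hit" is assumed here to mean the hit with the highest bit score for each query.

```r
# Toy BLAST table restricted to the columns needed for the illustration
blast <- data.frame(
  query_id  = c("q1", "q1", "q2"),
  hit_acc   = c("A1", "B2", "A1"),
  bit_score = c(200, 350, 120),
  stringsAsFactors = FALSE
)

# Hypothetical taxonomy lookup keyed by accession (an assumption)
taxonomy <- data.frame(
  hit_acc = c("A1", "B2", "C3"),
  species = c("Species one", "Species two", "Species three"),
  stringsAsFactors = FALSE
)

# 1. Group the accessions: each one only needs to be looked up once
accessions <- unique(blast$hit_acc)

# 2. Obtain the taxonomy for these accessions only
taxo_needed <- taxonomy[taxonomy$hit_acc %in% accessions, ]

# 3. Merge the taxonomy back into the BLAST table
blast_taxo <- merge(blast, taxo_needed, by = "hit_acc", all.x = TRUE)

# 4. Keep the best hit per query (assumed: highest bit score)
ordered <- blast_taxo[order(blast_taxo$query_id, -blast_taxo$bit_score), ]
top_hit <- ordered[!duplicated(ordered$query_id), ]
```

Looking up each accession once and merging afterwards avoids repeating the taxonomy query for every BLAST row that shares an accession.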
blast_summary(file_name)
file_name: The name of the BLAST file, with full path
TRUE if the function has been successful.
The summary table is saved under the name of the input file, with the extension replaced by _summary.tsv.
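The naming rule can be sketched as follows. The helper name `summary_file_name` is hypothetical and is not part of the package; it only illustrates one way to replace the extension using `tools::file_path_sans_ext()` from base R.

```r
# Hypothetical helper: derive the summary file name from the BLAST file name
summary_file_name <- function(file_name) {
  # Drop the extension, then append the suffix described above
  paste0(tools::file_path_sans_ext(file_name), "_summary.tsv")
}

out_name <- summary_file_name("C:/BLAST_output.txt")
print(out_name)
```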
The columns of the BLAST file are named as follows. For the summary, a prefix is added.
query_id, hit_id, hit_acc, hit_title, hit_sci_names,
hit_tax_ids, hit_super_kingdoms, hit_blast_names,
pct_identity, hit_length, alignment_length, mismatches,
gap_opens, query_start, query_end, hit_start, hit_end,
evalue, bit_score
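As a minimal sketch of how these names line up with the 19 fields of the required output format, the snippet below writes a single made-up BLAST line to a temporary file and reads it back with base R. The values in the example line are invented for illustration, and the reading code is an assumption about one reasonable approach, not the function's actual code.

```r
# Column names in the same order as the -outfmt fields
blast_cols <- c(
  "query_id", "hit_id", "hit_acc", "hit_title", "hit_sci_names",
  "hit_tax_ids", "hit_super_kingdoms", "hit_blast_names",
  "pct_identity", "hit_length", "alignment_length", "mismatches",
  "gap_opens", "query_start", "query_end", "hit_start", "hit_end",
  "evalue", "bit_score"
)

# One made-up, tab-separated BLAST line (values are illustrative only)
example_line <- paste(
  c("q1", "hit1", "acc1", "Example title", "Example species",
    "12345", "Eukaryota", "eukaryotes",
    "99.5", "1800", "400", "2",
    "0", "1", "400", "101", "500",
    "1e-100", "730"),
  collapse = "\t"
)
tmp <- tempfile(fileext = ".txt")
writeLines(example_line, tmp)

# Outfmt 6 has no header line, so the names are supplied explicitly
blast <- read.delim(tmp, header = FALSE, col.names = blast_cols,
                    quote = "", stringsAsFactors = FALSE)

# A file with a missing column would not pass this check
stopifnot(ncol(blast) == length(blast_cols))
```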
The following functions must be used with the library qualifier dplyr:: because they also exist in the plyr library
ungroup
desc
rename
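A short sketch of what the explicit qualifier looks like in practice, assuming the dplyr package is installed. The toy data and the new column name `hit_top_bit_score` are illustrative only; the point is that `dplyr::desc`, `dplyr::ungroup` and `dplyr::rename` resolve to the dplyr versions even if plyr is attached later.

```r
# Toy hits table (illustrative values)
hits <- data.frame(
  query_id  = c("q1", "q1", "q2"),
  bit_score = c(200, 350, 120),
  stringsAsFactors = FALSE
)

# Every call is qualified, so a plyr function of the same name cannot mask it
grouped <- dplyr::group_by(hits, query_id)
sorted  <- dplyr::arrange(grouped, dplyr::desc(bit_score), .by_group = TRUE)
top     <- dplyr::slice(sorted, 1)          # first row of each group
top     <- dplyr::ungroup(top)
top     <- dplyr::rename(top, hit_top_bit_score = bit_score)
```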
Uses the local version of the PR2 database for faster access (much faster !!)
blast_reformat("C:/BLAST_output.txt")