View source: R/auto_seq_download.R
auto_seq_download | R Documentation |
Takes a user-supplied list of genera, then searches for and downloads molecular sequence data from BOLD and GenBank.
auto_seq_download(
  BOLD_database = TRUE,
  NCBI_database = TRUE,
  search_str = NULL,
  input_file = NULL,
  output_file = NULL,
  seq_min = 100,
  seq_max = 2500
)
BOLD_database |
TRUE includes BOLD in the search, FALSE excludes it; default TRUE |
NCBI_database |
TRUE includes NCBI (GenBank) in the search, FALSE excludes it; default TRUE |
search_str |
NULL uses the default search string; any other value is used as the GenBank search string; default NULL. The default string is: (genus[ORGN]) NOT (shotgun[ALL] OR genome[ALL] OR assembled[ALL] OR microsatellite[ALL]) |
input_file |
NULL prompts the user to select the input file through point-and-click dialogs; any other value is used as the location of the input file; default NULL |
output_file |
NULL prompts the user to select the output file location through point-and-click dialogs; any other value is used as the location; default NULL |
seq_min |
Minimum sequence length; shorter sequences are flagged; default 100 |
seq_max |
Maximum sequence length; longer sequences are flagged; default 2500 |
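The length thresholds can be illustrated with a small sketch. This is not MACER's internal code: flag_by_length() is a hypothetical helper, and treating both bounds as inclusive (a sequence of exactly seq_min or seq_max bases is not flagged) is an assumption, since the documentation does not say how the boundaries are handled.

```r
# Hypothetical helper, not part of MACER: marks sequences whose length
# falls outside the range seq_min..seq_max (bounds assumed inclusive).
flag_by_length <- function(seqs, seq_min = 100, seq_max = 2500) {
  len <- unname(nchar(seqs))
  data.frame(
    length = len,
    flagged = len < seq_min | len > seq_max,
    stringsAsFactors = FALSE
  )
}

# Three made-up sequences: too short, within range, too long.
seqs <- c(strrep("A", 50), strrep("ACGT", 100), strrep("A", 3000))
res <- flag_by_length(seqs)
print(res)
```

Under these assumptions, only the middle sequence (400 bases) would pass unflagged with the default thresholds.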
User Input: A text file listing genera in a single column, one genus per line, with a new line at the end of the list.
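A minimal sketch of preparing such an input file in base R. The genus names and the file name genera.txt are illustrative only; writeLines() terminates every entry, including the last, with a newline, which matches the layout described above.

```r
# Illustrative genus list; replace with the genera of interest.
genera <- c("Salmo", "Oncorhynchus", "Salvelinus")

# Write one genus per line; writeLines() also ends the last line with a newline.
input_path <- file.path(tempdir(), "genera.txt")
writeLines(genera, input_path)

# Read the file back to confirm the layout.
readLines(input_path)
```

The resulting path could then be supplied as input_file instead of using the point-and-click prompt, for example auto_seq_download(input_file = input_path), which requires MACER to be installed and network access.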
Outputs: One main folder containing three other folders.
Main folder - Seq_auto_dl_TTTTTT_MMM_DD
Three subfolders:
1. BOLD - Contains a file for every genus downloaded with the raw data from the BOLD system.
2. NCBI - Contains a file for every genus downloaded with the raw data from GenBank.
3. Total_tables - Contains files for the running of the function which include...
A_Summary.txt - This file contains information about the downloads.
A_Total_Table.tsv - A file with a single table containing the accumulated data for all genera searched.
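Once a run has finished, the accumulated table can be loaded for inspection. The sketch below uses a hypothetical helper, read_total_table(), which is not part of MACER. It assumes the file is tab-separated with a header row, as the .tsv extension suggests; the column names are not documented here, so none are assumed.

```r
# Hypothetical helper, not part of MACER: reads A_Total_Table.tsv from the
# Total_tables subfolder of a Seq_auto_dl_* output folder.
read_total_table <- function(main_dir) {
  tsv <- file.path(main_dir, "Total_tables", "A_Total_Table.tsv")
  if (!file.exists(tsv)) {
    stop("No A_Total_Table.tsv found under: ", main_dir)
  }
  # Assumption: tab-separated text with a header row.
  read.delim(tsv, stringsAsFactors = FALSE)
}

# Usage (the path is a placeholder for an actual output folder):
# total <- read_total_table("path/to/Seq_auto_dl_TTTTTT_MMM_DD")
# str(total)
```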
When using a custom search string for NCBI, only a single genus can be searched at a time.
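One way to build such a single-genus string is to start from the default pattern and substitute the genus name. The helper build_search_str() below is hypothetical and not part of MACER; how the package itself fills in the genus for the default string is not described here, so this is only a sketch of composing a value for search_str.

```r
# Hypothetical helper, not part of MACER: builds a GenBank search string
# for exactly one genus, following the documented default pattern.
build_search_str <- function(genus) {
  stopifnot(is.character(genus), length(genus) == 1L, nzchar(genus))
  paste0(
    "(", genus, "[ORGN]) NOT (shotgun[ALL] OR genome[ALL] OR ",
    "assembled[ALL] OR microsatellite[ALL])"
  )
}

custom_str <- build_search_str("Salmo")
print(custom_str)

# Possible use (requires MACER and network access, so not run here):
# auto_seq_download(BOLD_database = FALSE, NCBI_database = TRUE,
#                   search_str = custom_str)
```

The genus "Salmo" is only an example; the excluded terms can be edited to suit the dataset being assembled.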
Robert G. Young
<https://github.com/rgyoung6/MACER>
Young, R. G., Gill, R., Gillis, D., Hanner, R. H. (Submitted June 2021). Molecular Acquisition, Cleaning, and Evaluation in R (MACER) - A tool to assemble molecular marker datasets from BOLD and GenBank. Biodiversity Data Journal.
create_fastas(), align_to_ref(), barcode_clean()
## Not run:
auto_seq_download()
auto_seq_download(BOLD_database = TRUE, NCBI_database = FALSE)
auto_seq_download(BOLD_database = FALSE, NCBI_database = TRUE)
## End(Not run)