This function evaluates the taxonomic structure of a set of input IDs and returns the count of each taxon ID with non-zero occurrences in the taxonomy graph, as well as the count of known species under each taxon.
taxonomy_path
path to the folder containing the NCBI taxonomy files (i.e., the extracted contents of _taxdump.zip_, which can be downloaded from <ftp://ftp.ncbi.nih.gov/pub/taxonomy/> or retrieved using [retrieve_NCBI_taxonomy()]).
ids_file
path to a tab-separated file containing two columns, with the input taxon IDs in the first column and the corresponding sequence IDs (matching the ID strings in the multifasta input file, without the ">" line starter) in the second column.
ids_df
two-column data frame with the input taxon IDs in the first column and the corresponding sequence IDs (matching the ID strings in the multifasta input file, without the ">" line starter) in the second column. Ignored if 'ids_file' is not 'NULL'.
spp_file
path to a tab-separated file containing two columns, with the input taxon IDs in the first column and the corresponding number of known species in the second column. If both 'spp_file' and 'spp_df' are 'NULL', the counts are computed internally from the taxonomy files under 'taxonomy_path'.
spp_df
two-column data frame with the input taxon IDs in the first column and the corresponding number of known species in the second column. This can be generated using [get_taxID_spp_counts()]. Ignored if 'spp_file' is not 'NULL'.
start_from_species
logical, passed down to [get_taxID_spp_counts()] (only if both 'spp_file' and 'spp_df' are 'NULL').
verbose
logical: controls whether the function echoes progress messages to the console.
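As an illustration of the two-column layout expected by 'ids_df' / 'ids_file' and 'spp_df' / 'spp_file', the following base R sketch builds both data frames and writes one of them to a tab-separated file. The taxon IDs, sequence IDs, species counts and column names are made up for the example, and writing the file without a header row is an assumption, since the argument descriptions do not state whether a header is expected.

```r
# Hypothetical inputs, for illustration only (values and column names are assumptions)
ids_df <- data.frame(
  taxID = c(9606L, 10090L, 7227L),                         # input taxon IDs (column 1)
  seqID = c("seq_human_01", "seq_mouse_01", "seq_fly_01"), # sequence IDs, without ">" (column 2)
  stringsAsFactors = FALSE
)

spp_df <- data.frame(
  taxID = c(9606L, 10090L, 7227L), # input taxon IDs (column 1)
  spp   = c(1L, 1L, 1L),           # placeholder species counts (column 2)
  stringsAsFactors = FALSE
)

# Write the IDs as a tab-separated file that could be passed as 'ids_file'.
# Assumption: no header row.
ids_file <- tempfile(fileext = ".tsv")
write.table(ids_df, file = ids_file, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the file back to confirm the two-column layout survived the round trip
ids_check <- read.delim(ids_file, header = FALSE,
                        col.names = c("taxID", "seqID"),
                        stringsAsFactors = FALSE)
```

Either the data frame or the file path can then be supplied; as noted above, 'ids_df' is ignored whenever 'ids_file' is not 'NULL'.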
list object of class _taxonsampling_, containing:
'$ids_df': data.frame with taxon IDs in column 1 and corresponding sequence IDs in column 2, as loaded from 'ids_file' or passed directly as input. Filtered to maintain only IDs that exist in '$nodes' and to remove duplicated IDs.
'$nodes': data.frame containing the pre-processed information about the NCBI taxonomy structure, extracted from file _nodes.dmp_ of the taxonomy files. Filtered to keep only nodes with IDs present in '$ids_df' and with a non-zero total count.
'$countIDs': numeric vector with the number of taxonomy nodes (of all levels) under each taxon ID.
'$spp_df': two-column data frame with the input taxon IDs in the first column, and the corresponding number of known species in the second column. Filtered to have only the IDs present in '$countIDs'.
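To make the filtering and counting described for the returned elements more concrete, here is a minimal base R sketch on a toy taxonomy. It is not the package's implementation: the toy 'nodes' table, its column names, and the choice to judge duplicates on the taxon ID column are assumptions made for illustration. The count shown is the number of input IDs found at or below each node, obtained by walking each input ID up to the root, which is one plausible reading of the counts described above.

```r
# Toy stand-in for the parsed nodes.dmp: each node points to its parent.
# In the NCBI taxonomy the root (ID 1) is its own parent.
nodes <- data.frame(id     = c(1L, 10L, 11L, 12L, 100L, 101L),
                    parent = c(1L,  1L,  1L,  1L,  10L,  10L))

# Hypothetical input: one duplicated taxon ID and one ID absent from 'nodes'
ids_df <- data.frame(taxID = c(100L, 100L, 101L, 11L, 999L),
                     seqID = c("a", "b", "c", "d", "e"),
                     stringsAsFactors = FALSE)

# Keep only IDs that exist in 'nodes' and drop duplicated taxon IDs
keep   <- ids_df$taxID %in% nodes$id & !duplicated(ids_df$taxID)
ids_df <- ids_df[keep, ]

# Walk each remaining ID up to the root, incrementing every node on the path
countIDs <- setNames(numeric(nrow(nodes)), nodes$id)
for (id in ids_df$taxID) {
  current <- id
  repeat {
    key <- as.character(current)
    countIDs[key] <- countIDs[key] + 1
    parent <- nodes$parent[nodes$id == current]
    if (parent == current) break  # reached the root
    current <- parent
  }
}

# Keep only nodes with a non-zero count, then filter 'nodes' to match
countIDs <- countIDs[countIDs > 0]
nodes    <- nodes[as.character(nodes$id) %in% names(countIDs), ]
```

In this toy case node 12 has no input IDs beneath it, so it is dropped from both 'countIDs' and 'nodes', mirroring the "non-zero total count" filter described for '$nodes'.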