getAccession2taxid: Download accession2taxid files from NCBI

getAccession2taxid    R Documentation

Download accession2taxid files from NCBI

Description

Download nucl_xxx.accession2taxid.gz files from NCBI servers. These can then be used to create a SQLite database with read.accession2taxid. Note that if the files already exist in the target directory, this function will not redownload them. Delete the files if a fresh download is desired.
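
The skip-if-present behavior described above can be pictured with a minimal sketch. This is not the package's actual implementation: downloadIfMissing is a hypothetical helper written for illustration, and it assumes the output file takes its name from the last component of the URL.

```r
# Hypothetical helper (not part of taxonomizr): download a file only when
# it is not already present in outDir.
downloadIfMissing <- function(url, outDir = ".") {
  # Assumption: the local file name is the last component of the URL
  outFile <- file.path(outDir, basename(url))
  if (file.exists(outFile)) {
    # An existing file is left untouched, so a stale copy must be deleted by hand
    message(outFile, " already exists. Delete it to force a fresh download.")
  } else {
    # mode = "wb" keeps the gzipped file intact on all platforms
    download.file(url, outFile, mode = "wb")
  }
  outFile
}
```

Under these assumptions, calling the helper twice with the same URL touches the network only the first time, and unlink() on the returned path would be one way to force a refresh.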

Usage

getAccession2taxid(
  outDir = ".",
  baseUrl = sprintf("%s://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/", protocol),
  types = c("nucl_gb", "nucl_wgs"),
  protocol = "ftp"
)

Arguments

outDir

the directory to put the accession2taxid.gz files in

baseUrl

the url of the directory where accession2taxid.gz files are located

types

the types of accession2taxid.gz files desired, where type is the prefix of xxx.accession2taxid.gz. The default is to download all nucl_ accessions. For protein accessions, try types=c('prot').

protocol

the protocol to be used for downloading. Probably either 'http' or 'ftp'. Overridden if baseUrl is provided directly.
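
As a rough illustration of how baseUrl, types and protocol interact, the sketch below rebuilds the default baseUrl from the Usage section and appends the file names. buildAccessionUrls is a hypothetical helper, not a taxonomizr function, and the use of paste0 to join the pieces is an assumption about how the URLs might be assembled.

```r
# Hypothetical helper (not part of taxonomizr): assemble the download URLs
# implied by the baseUrl, types and protocol arguments.
buildAccessionUrls <- function(
  types = c("nucl_gb", "nucl_wgs"),
  protocol = "ftp",
  # Same default expression as in the Usage section; it is evaluated lazily,
  # so it picks up whatever protocol was supplied
  baseUrl = sprintf("%s://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/", protocol)
) {
  # Assumed naming scheme: <type>.accession2taxid.gz
  paste0(baseUrl, types, ".accession2taxid.gz")
}

# protocol only matters while baseUrl keeps its default
protUrl <- buildAccessionUrls(types = "prot", protocol = "https")

# an explicit baseUrl (a made-up mirror here) ignores protocol entirely
mirrorUrls <- buildAccessionUrls(baseUrl = "http://mirror.example.org/a2t/", protocol = "ftp")
```

Because the default for baseUrl is only evaluated when no value is passed, supplying baseUrl directly means protocol is never consulted, which matches the "Overridden" note above.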

Value

a vector of file path strings of the locations of the output files

References

https://ftp.ncbi.nih.gov/pub/taxonomy/, https://www.ncbi.nlm.nih.gov/genbank/acc_prefix/

See Also

read.accession2taxid

Examples

## Not run: 
  # Ask for confirmation first: the files are large
  if(readline(
    "This will download a lot of data and take a while to process.
     Make sure you have space and bandwidth. Type y to continue: "
  )!='y')
    stop('This is a stop to make sure no one downloads a bunch of data unintentionally')

  getAccession2taxid()

## End(Not run)

taxonomizr documentation built on Feb. 16, 2023, 6:25 p.m.