read.accession2taxid: Read NCBI accession2taxid files

read.accession2taxid    R Documentation

Read NCBI accession2taxid files

Description

Take NCBI accession2taxid files, keep only the accession and taxa ID columns, and save them as a SQLite database

Usage

read.accession2taxid(
  taxaFiles,
  sqlFile,
  vocal = TRUE,
  extraSqlCommand = "",
  indexTaxa = FALSE,
  overwrite = FALSE
)

Arguments

taxaFiles

a string or vector of strings giving the path(s) to files to be read in

sqlFile

a string giving the path where the output SQLite file should be saved

vocal

if TRUE, output status messages

extraSqlCommand

for advanced use. A string giving a command to be run on the SQLite database before loading data. A couple of potential uses:

  • "PRAGMA temp_store_directory = '/MY/TMP/DIR'" to store SQLite temporary files in directory /MY/TMP/DIR. Useful if the temporary directory used by SQLite (which is not necessarily in the same location as R's) is small on your system

  • "pragma temp_store = 2;" to keep all SQLite temp files in memory. Don't do this unless you have a lot (>100 Gb) of RAM
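As a rough illustration of the first use above, the sketch below passes a temp_store_directory pragma through extraSqlCommand. The input is a tiny stand-in for a real accession2taxid file, and scratchDir is a hypothetical directory created only for this example; substitute a location with enough free space on your own system.

```r
library(taxonomizr)

# Tiny stand-in for a real NCBI accession2taxid file (same column layout)
taxa <- c(
  "accession\taccession.version\ttaxid\tgi",
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570"
)
inFile <- tempfile()
sqlFile <- tempfile()
writeLines(taxa, inFile)

# Hypothetical scratch directory for SQLite temporary files
scratchDir <- tempfile("sqliteTmp")
dir.create(scratchDir)

# Build the pragma string and hand it to read.accession2taxid
pragma <- sprintf("PRAGMA temp_store_directory = '%s'", scratchDir)
result <- read.accession2taxid(
  inFile, sqlFile,
  vocal = FALSE,
  extraSqlCommand = pragma
)
```

With an input this small the pragma makes no practical difference; it only matters when SQLite needs substantial temporary space for the full NCBI files.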

indexTaxa

if TRUE, add an index for taxa ID. This is only necessary if you want to look up accessions by taxa ID, e.g. with getAccessions
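A minimal sketch of the lookup that indexTaxa is meant to support, assuming the same toy input used in the Examples section. The database is built with the taxa index, and getAccessions is then asked for the accessions assigned to taxa ID 3702.

```r
library(taxonomizr)

# Tiny stand-in for a real NCBI accession2taxid file
taxa <- c(
  "accession\taccession.version\ttaxid\tgi",
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570"
)
inFile <- tempfile()
sqlFile <- tempfile()
writeLines(taxa, inFile)

# Build the database with an index on taxa ID
read.accession2taxid(inFile, sqlFile, vocal = FALSE, indexTaxa = TRUE)

# Look up accessions by taxa ID (the reverse of the usual lookup direction)
acc <- getAccessions(3702, sqlFile)
print(acc)
```

On a toy database the index has no visible effect; it is with the full NCBI files that a lookup by taxa ID is likely to be slow without one.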

overwrite

if TRUE, delete the accessionTaxa table in the database, if present, and regenerate it
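One situation where overwrite may be useful is refreshing an existing database after the input file has changed. The sketch below, again using toy data in place of a real download, builds a database from two records, then rebuilds it from three with overwrite = TRUE and counts the rows in accessionTaxa.

```r
library(taxonomizr)

header <- "accession\taccession.version\ttaxid\tgi"
inFile <- tempfile()
sqlFile <- tempfile()

# Initial build from two records
writeLines(c(
  header,
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570"
), inFile)
read.accession2taxid(inFile, sqlFile, vocal = FALSE)

# The input gains a third record; regenerate the table from scratch
writeLines(c(
  header,
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570",
  "Z17429\tZ17429.1\t3702\t16571"
), inFile)
read.accession2taxid(inFile, sqlFile, vocal = FALSE, overwrite = TRUE)

# Count the rows now stored in the accessionTaxa table
db <- RSQLite::dbConnect(RSQLite::SQLite(), dbname = sqlFile)
n <- RSQLite::dbGetQuery(db, "SELECT COUNT(*) AS n FROM accessionTaxa")$n
RSQLite::dbDisconnect(db)
print(n)
```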

Value

TRUE if successful

References

https://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/

See Also

read.nodes.sql, read.names.sql

Examples

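library(taxonomizr)

# A small tab-separated sample in the NCBI accession2taxid layout
taxa <- c(
  "accession\taccession.version\ttaxid\tgi",
  "Z17427\tZ17427.1\t3702\t16569",
  "Z17428\tZ17428.1\t3702\t16570",
  "Z17429\tZ17429.1\t3702\t16571",
  "Z17430\tZ17430.1\t3702\t16572"
)

# Write the sample to a temporary input file
inFile <- tempfile()
sqlFile <- tempfile()
writeLines(taxa, inFile)

# Load the sample into a SQLite database
read.accession2taxid(inFile, sqlFile, vocal = FALSE)

# Inspect the resulting accessionTaxa table, then close the connection
db <- RSQLite::dbConnect(RSQLite::SQLite(), dbname = sqlFile)
RSQLite::dbGetQuery(db, 'SELECT * FROM accessionTaxa')
RSQLite::dbDisconnect(db)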

taxonomizr documentation built on May 29, 2024, 8:49 a.m.