download_refseq: Download RefSeq genome libraries

View source: R/download_refseq.R

download_refseq    R Documentation

Download RefSeq genome libraries

Description

This function automatically downloads RefSeq genome libraries in .fasta format for the specified taxon. It first downloads the summary report at ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/**kingdom**/assembly_summary.txt, then uses this file to download the genome(s) and combine them into a single compressed or uncompressed .fasta file.
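The two-step flow described above (fetch the summary report, then fetch each genome) can be sketched as a pair of URL builders. This is a minimal sketch, not the package's implementation: `summary_url()` and `genome_url()` are hypothetical helper names, the `ftp_path` value is a made-up placeholder, and the `<ftp_path>/<directory name>_genomic.fna.gz` layout is assumed from NCBI's usual assembly directory convention.

```r
# Hypothetical helpers; the names are illustrative and not part of MetaScope.

# Build the URL of the assembly summary report for one RefSeq kingdom directory.
summary_url <- function(kingdom) {
  paste0("ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/",
         tolower(kingdom), "/assembly_summary.txt")
}

# Build the URL of the genomic FASTA file from an `ftp_path` entry of the report.
# Assumes the "<ftp_path>/<assembly directory name>_genomic.fna.gz" layout.
genome_url <- function(ftp_path) {
  paste0(ftp_path, "/", basename(ftp_path), "_genomic.fna.gz")
}

summary_url("bacteria")

# Placeholder path, not a real assembly.
genome_url("ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/000/000/GCF_000000000.1_EXAMPLE")
```

Neither helper touches the network, so they can be used to inspect which files a download would request before running it.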

Usage

download_refseq(
  taxon,
  reference = TRUE,
  representative = FALSE,
  compress = TRUE,
  patho_out = FALSE
)

Arguments

taxon

Select one taxon to download. The taxon name should be a recognized NCBI scientific name, with no spelling or capitalization inconsistencies. All available taxonomies are listed in the taxonomy_table object included in the package.

reference

Download only RefSeq reference genomes? Defaults to TRUE. Automatically set to TRUE if representative = TRUE.

representative

Download only RefSeq representative genomes? Defaults to FALSE. If TRUE, reference is automatically set to TRUE.

compress

Compress the output .fasta file? Defaults to TRUE.

patho_out

Create duplicate output files compatible with PathoScope? Defaults to FALSE.
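How reference and representative interact can be illustrated on a toy table. This is a sketch under stated assumptions: `select_assemblies()` is a hypothetical helper, the rows are invented, and the filtering rule (representative = TRUE keeps both reference and representative genomes, both flags FALSE keeps everything) is inferred from the argument descriptions rather than taken from the package source. The `refseq_category` column name and its values follow the NCBI assembly summary format.

```r
# Toy stand-in for two columns of assembly_summary.txt (values invented).
assemblies <- data.frame(
  organism_name   = c("Taxon A", "Taxon B", "Taxon C"),
  refseq_category = c("reference genome", "representative genome", "na"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: which rows survive for a given flag combination.
select_assemblies <- function(tbl, reference = TRUE, representative = FALSE) {
  # As documented, representative = TRUE forces reference = TRUE.
  if (representative) reference <- TRUE
  keep <- c(if (reference) "reference genome",
            if (representative) "representative genome")
  # Assumption: with both flags FALSE, no category filter is applied.
  if (is.null(keep)) return(tbl)
  tbl[tbl$refseq_category %in% keep, , drop = FALSE]
}

select_assemblies(assemblies)$organism_name
select_assemblies(assemblies, representative = TRUE)$organism_name
select_assemblies(assemblies, reference = FALSE)$organism_name
```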

Value

Returns a .fasta or .fasta.gz file of the desired RefSeq genomes. This file is named after the kingdom selected and saved to the current directory (e.g. 'bacteria.fasta.gz'). If patho_out = TRUE, the function also writes a .fasta file formatted for PathoScope (e.g. 'bacteria.pathoscope.fasta.gz').
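The naming scheme above can be turned into a small check for the files a call is expected to leave in the working directory. This is a sketch only: `expected_outputs()` is a hypothetical helper, and it assumes the PathoScope copy follows the same compress setting as the main file, which the documentation does not state explicitly.

```r
# Hypothetical helper: expected output file names in the working directory.
expected_outputs <- function(kingdom, compress = TRUE, patho_out = FALSE) {
  ext <- if (compress) ".fasta.gz" else ".fasta"
  files <- paste0(kingdom, ext)
  # Assumption: the PathoScope copy uses the same extension as the main file.
  if (patho_out) files <- c(files, paste0(kingdom, ".pathoscope", ext))
  files
}

expected_outputs("bacteria", patho_out = TRUE)

# After a download, confirm that the expected file is present.
file.exists(expected_outputs("bacteria"))
```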

Examples

#### Download RefSeq genomes

## Download all RefSeq reference bacterial superkingdom genomes
download_refseq('bacteria', reference = TRUE, representative = FALSE)

## Download all RefSeq representative mononegavirales genomes
download_refseq('mononegavirales', representative = TRUE)

## Download all RefSeq morbillivirus genomes
download_refseq('morbillivirus', reference = FALSE)

## Download all RefSeq bacilli reference genomes, uncompressed
download_refseq('Bacilli', reference = TRUE,
                representative = FALSE, compress = FALSE)

## Download RefSeq Escherichia coli IAI1 strain
download_refseq('Escherichia coli IAI1', reference = FALSE, compress = FALSE)


compbiomed/MetaScope documentation built on Aug. 9, 2022, 10:41 a.m.