download_ncbi_genome_file: Download genome files from NCBI based on accession number

download_ncbi_genome_file    R Documentation

Download genome files from NCBI based on accession number

Description

This function downloads specific genomic files from NCBI's FTP server based on the provided accession number. It supports downloading different types of files, or the entire directory containing the files.

Usage

download_ncbi_genome_file(
  accession,
  out_dir = ".",
  type = "gff",
  file_suffix = NULL,
  timeout = 300
)

Arguments

accession

A character string representing the NCBI accession number (e.g., "GCF_001036115.1_ASM103611v1" or "GCF_001036115.1"). The accession can start with "GCF" or "GCA".

out_dir

A character string representing the directory where the downloaded files will be saved. Defaults to the current working directory (".").

type

A character string representing the type of file to download. Supported types are "all", "gff", and "fna". If "all" is specified, the function does not download the directory itself; instead it provides a command line example (for instance using wget) that the user can run to fetch the entire directory. Defaults to "gff".

file_suffix

A character string representing the specific file suffix to download. If specified, this will override the type parameter. Defaults to NULL.

timeout

A numeric value representing the maximum time in seconds to wait for the download. Defaults to 300.
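
The precedence between type and file_suffix described above can be sketched as follows. This is a minimal illustration, not the pcutils source: resolve_suffix is a hypothetical helper, and the "_genomic.gff.gz" mapping for type = "gff" is an assumption based on the usual NCBI file naming (only "_genomic.fna.gz" appears in the examples on this page). The "all" case is deliberately left out because it does not resolve to a single file.

```r
# Hypothetical helper (not part of pcutils): pick the file suffix to download.
resolve_suffix <- function(type = "gff", file_suffix = NULL) {
  # An explicit suffix takes precedence over `type`.
  if (!is.null(file_suffix)) {
    return(file_suffix)
  }
  # Assumed mapping from `type` to the NCBI file-name suffix.
  suffixes <- c(gff = "_genomic.gff.gz", fna = "_genomic.fna.gz")
  if (!type %in% names(suffixes)) {
    stop("Unsupported type: ", type)
  }
  suffixes[[type]]
}

resolve_suffix("gff")
resolve_suffix("gff", file_suffix = "_genomic.fna.gz")
```

In the second call the suffix wins even though type is still "gff", which mirrors the documented override.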

Details

If the provided accession contains only the accession and version (e.g., "GCF_001036115.1") and not the assembly name suffix (as in "GCF_001036115.1_ASM103611v1"), the function will query the NCBI FTP server to determine the full accession name.
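
To make the lookup concrete, here is a rough sketch of how an accession maps onto the NCBI genomes directory layout, where the nine digits are split into three path components. ncbi_genome_dir is a hypothetical helper written for illustration; how pcutils builds or queries the path internally is not shown on this page. For a short accession the sketch returns the parent directory, whose listing would contain the full "accession_assemblyname" folder.

```r
# Hypothetical helper (not part of pcutils): build the NCBI directory URL.
ncbi_genome_dir <- function(accession) {
  # Capture the "GCF"/"GCA" prefix and the nine-digit identifier.
  m <- regmatches(
    accession,
    regexec("^(GC[AF])_([0-9]{9})\\.[0-9]+", accession)
  )[[1]]
  if (length(m) == 0) {
    stop("Unrecognised accession: ", accession)
  }
  # Split the nine digits into three 3-digit path components.
  parts <- substring(m[3], c(1, 4, 7), c(3, 6, 9))
  base <- paste(
    c("https://ftp.ncbi.nlm.nih.gov/genomes/all", m[2], parts),
    collapse = "/"
  )
  # A full accession (with assembly name) identifies the final directory.
  has_name <- grepl("^GC[AF]_[0-9]{9}\\.[0-9]+_.+$", accession)
  if (has_name) paste(base, accession, sep = "/") else base
}

ncbi_genome_dir("GCF_001036115.1")
ncbi_genome_dir("GCF_001036115.1_ASM103611v1")
```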

When type is set to "all", the function cannot download the entire directory directly but provides a command line example for the user to download the directory using tools like wget.
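
The exact command the function suggests is not reproduced on this page. As an illustration only, the sketch below assembles one plausible recursive wget call for an assembly directory. wget_dir_command is a hypothetical helper, and the choice of flags (including --cut-dirs=6, which assumes the genomes/all/GCF/nnn/nnn/nnn/ layout) is an assumption rather than the output of pcutils.

```r
# Hypothetical helper (not part of pcutils): build a wget command string.
wget_dir_command <- function(dir_url, out_dir = ".") {
  # Ensure exactly one trailing slash so wget treats the URL as a directory.
  dir_url <- sub("/*$", "/", dir_url)
  # -r: recurse; -np: do not ascend to the parent; -nH: no host directory;
  # --cut-dirs=6: drop the six leading path components; -P: output directory.
  paste(
    "wget -r -np -nH --cut-dirs=6 -P",
    shQuote(out_dir),
    shQuote(dir_url)
  )
}

cmd <- wget_dir_command(
  "https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/001/036/115/GCF_001036115.1_ASM103611v1",
  out_dir = "downloads"
)
cat(cmd, "\n")
```

The command is only printed here; it could be pasted into a terminal or passed to system() once wget is confirmed to be installed.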

Value

No return value. The function is called for its side effect of saving the downloaded file to out_dir.

Examples

## Not run: 
download_ncbi_genome_file("GCF_001036115.1", out_dir = "downloads", type = "gff")
download_ncbi_genome_file("GCF_001036115.1", out_dir = "downloads", file_suffix = "_genomic.fna.gz")

## End(Not run)


pcutils documentation built on June 26, 2024, 1:06 a.m.