NCBI-utils | R Documentation |
Low-level utility functions to access NCBI resources. Not intended to be used directly by the end user.
find_NCBI_assembly_ftp_dir(assembly_accession, assembly_name=NA)
fetch_assembly_report(assembly_accession, assembly_name=NA,
AssemblyUnits=NULL)
assembly_accession |
A single string containing either a GenBank assembly accession
(e.g. Alternatively, for |
assembly_name |
A single string or |
AssemblyUnits |
By default, all the assembly units are included in the data frame
returned by |
For find_NCBI_assembly_ftp_dir(): A length-2 character vector:
The 1st element in the vector is the URL to the FTP dir, without the trailing slash.
The 2nd element in the vector is the prefix used in the names of most of the files in the FTP dir.
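As a minimal sketch of how the two elements fit together, the hypothetical helper below (make_assembly_file_url() is not part of the package) composes the URL of a file in the FTP dir from a length-2 vector shaped like the one described above. The ftp_dir value is a made-up placeholder, not a real NCBI location.

```r
## Hypothetical helper, not part of the package: compose the URL of a file
## from a length-2 vector c(<URL to FTP dir>, <file name prefix>).
make_assembly_file_url <- function(ftp_dir, suffix) {
    stopifnot(is.character(ftp_dir), length(ftp_dir) == 2L,
              is.character(suffix), length(suffix) == 1L)
    ## The URL comes without a trailing slash, so add the separator here.
    paste0(ftp_dir[[1]], "/", ftp_dir[[2]], suffix)
}

## Placeholder input, for illustration only:
ftp_dir <- c("ftp://example.org/some/dir/GCA_000000000.0_Dummy",
             "GCA_000000000.0_Dummy")
report_url <- make_assembly_file_url(ftp_dir, "_assembly_report.txt")
report_url
```

Keeping the separator inside the helper avoids doubled or missing slashes when several file names are derived from the same vector.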
For fetch_assembly_report(): A data frame with 1 row per sequence in the assembly and 10 columns:
SequenceName
SequenceRole
AssignedMolecule
AssignedMoleculeLocationOrType
GenBankAccn
Relationship
RefSeqAccn
AssemblyUnit
SequenceLength
UCSCStyleName
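The sketch below shows one way a data frame with these 10 columns might be inspected and filtered. It uses a mock data frame rather than a real assembly report: every value is made up for illustration, and the column types (character columns, integer SequenceLength) are an assumption, not something stated on this page.

```r
## Mock data frame with the 10 documented columns. All values are made up
## for illustration and the column types are assumed.
mock_report <- data.frame(
    SequenceName                   = c("1", "MT", "scaffold_1"),
    SequenceRole                   = c("assembled-molecule",
                                       "assembled-molecule",
                                       "unplaced-scaffold"),
    AssignedMolecule               = c("1", "MT", NA),
    AssignedMoleculeLocationOrType = c("Chromosome", "Mitochondrion", NA),
    GenBankAccn                    = c("XX000001.1", "XX000002.1",
                                       "XX000003.1"),
    Relationship                   = c("=", "=", "="),
    RefSeqAccn                     = rep(NA_character_, 3),
    AssemblyUnit                   = c("Primary Assembly", "non-nuclear",
                                       "Primary Assembly"),
    SequenceLength                 = c(1000L, 200L, 50L),
    UCSCStyleName                  = c("chr1", "chrM", NA),
    stringsAsFactors = FALSE
)

## Check that the expected 10 columns are present.
stopifnot(ncol(mock_report) == 10L)

## Keep only the assembled molecules, then sum their lengths.
is_assembled <- mock_report$SequenceRole == "assembled-molecule"
assembled <- mock_report[is_assembled, ]
total_length <- sum(assembled$SequenceLength)
```

The same subsetting pattern should apply to the data frame returned by fetch_assembly_report(), for example to select rows by AssemblyUnit.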
fetch_assembly_report is the workhorse behind the higher-level and more user-friendly getChromInfoFromNCBI.
H. Pagès
getChromInfoFromNCBI for a higher-level and more user-friendly version of fetch_assembly_report.
ftp_dir <- find_NCBI_assembly_ftp_dir("GCA_000001405.15")
ftp_dir
url <- ftp_dir[[1]] # URL to the FTP dir
prefix <- ftp_dir[[2]] # prefix used in names of most files
list_ftp_dir(url)
assembly_report_url <- paste0(url, "/", prefix, "_assembly_report.txt")
## To fetch the assembly report for assembly GCA_000001405.15, you can
## call fetch_assembly_report() on the assembly accession or directly
## on the URL to the assembly report:
assembly_report <- fetch_assembly_report("GCA_000001405.15")
dim(assembly_report)
head(assembly_report)
## Sanity check:
assembly_report2 <- fetch_assembly_report(assembly_report_url)
stopifnot(identical(assembly_report, assembly_report2))