read_tracks | R Documentation |
Convenience functions to read sequences, features or links from various
bioinformatics file formats, such as FASTA, GFF3, Genbank, BLAST tabular
output, etc. See def_formats() for the full list. File formats and the
corresponding read-functions are automatically determined based on file
extensions. All these functions can read multiple files in the same format at
once, and combine them into a single table - useful, for example, to read a
folder of gff-files with each file containing genes of a different genome.
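The multi-file behaviour described above can be pictured with a small base-R sketch. This is not the gggenomes implementation: `read_one()` is a hypothetical helper, the two files hold made-up data, and deriving the id from the file name without its extension is an assumption made for illustration.

```r
# Sketch only: base-R illustration of "read many files, bind them into one
# table, tag each row with a file id". Not the gggenomes implementation.

# Write two small GFF-like files (made-up data) to a temporary directory.
dir <- tempfile("gff_demo_")
dir.create(dir)
writeLines("contig1\tdemo\tgene\t10\t200", file.path(dir, "genomeA.gff"))
writeLines(
  c("contig2\tdemo\tgene\t5\t90", "contig2\tdemo\tgene\t120\t300"),
  file.path(dir, "genomeB.gff")
)

# Hypothetical helper: read one file and record where each row came from.
read_one <- function(path, .id = "file_id") {
  tbl <- read.delim(
    path,
    header = FALSE, stringsAsFactors = FALSE,
    col.names = c("seq_id", "source", "type", "start", "end")
  )
  # Assumption: the file name without its extension serves as the id.
  tbl[[.id]] <- tools::file_path_sans_ext(basename(path))
  tbl
}

# Collect all gff files and stack them into a single table.
files <- list.files(dir, pattern = "\\.gff$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, read_one))
```

Each row of `combined` keeps track of its origin in the `file_id` column, which is what makes it possible to tell genomes apart after the files have been merged.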
read_feats(files, .id = "file_id", format = NULL, parser = NULL, ...)
read_subfeats(files, .id = "file_id", format = NULL, parser = NULL, ...)
read_links(files, .id = "file_id", format = NULL, parser = NULL, ...)
read_sublinks(files, .id = "file_id", format = NULL, parser = NULL, ...)
read_seqs(
  files,
  .id = "file_id",
  format = NULL,
  parser = NULL,
  parse_desc = TRUE,
  ...
)
files |
files to read. Should all be of the same format. In many cases,
compressed files ( |
.id |
the column with the name of the file a record was read from. Defaults to "file_id". Set to "bin_id" if every file represents a different bin. |
format |
specify a format known to gggenomes, such as |
parser |
specify the name of an R function to override automatic
determination based on format, e.g. |
... |
additional arguments passed on to the format-specific read function called down the line. |
parse_desc |
turn |
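How an explicit `format` might take precedence over extension-based detection can be sketched in base R. Everything here is an assumption for illustration: `guess_format()` is a hypothetical helper, the lookup table is a small made-up mapping rather than the list from def_formats(), and the compression suffixes are examples only.

```r
# Sketch only: extension-to-format lookup with an explicit override.
# The mapping below is made up and is not the list from def_formats().
ext_to_format <- c(
  gff = "gff3", gff3 = "gff3",
  fna = "fasta", fa = "fasta",
  gbk = "gbk"
)

# Hypothetical helper, not part of gggenomes.
guess_format <- function(file, format = NULL) {
  # An explicitly supplied format always wins.
  if (!is.null(format)) {
    return(format)
  }
  # Drop an (assumed) compression suffix before looking at the extension.
  base <- sub("\\.(gz|bz2|xz)$", "", basename(file))
  ext <- tolower(tools::file_ext(base))
  fmt <- unname(ext_to_format[ext])
  if (is.na(fmt)) {
    stop("cannot determine format for: ", file)
  }
  fmt
}

guess_format("genes.gff.gz")
guess_format("table.txt", format = "fasta")
```

The second call shows why an override is useful: a file with an uninformative extension can still be read once the format is stated explicitly.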
A gggenomes-compatible sequence, feature or link tibble
read_feats(), read_subfeats(): tibble with features
read_links(), read_sublinks(): tibble with links
read_seqs(): tibble with sequence information
read_feats(): read files as features mapping onto sequences.
read_subfeats(): read files as subfeatures mapping onto other features.
read_links(): read files as links connecting sequences.
read_sublinks(): read files as sublinks connecting features.
read_seqs(): read sequence ID, description and length.
# read genes/features from a gff file
read_feats(ex("eden-utr.gff"))
# read all gff files from a directory
read_feats(list.files(ex("emales/"), "\\.gff$", full.names = TRUE))
# read remote files
gbk_phages <- c(
  PSSP7 = paste0(
    "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/",
    "000/858/745/GCF_000858745.1_ViralProj15134/",
    "GCF_000858745.1_ViralProj15134_genomic.gff.gz"
  ),
  PSSP3 = paste0(
    "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/",
    "000/904/555/GCF_000904555.1_ViralProj195517/",
    "GCF_000904555.1_ViralProj195517_genomic.gff.gz"
  )
)
read_feats(gbk_phages)
# read sequences from a fasta file.
read_seqs(ex("emales/emales.fna"), parse_desc = FALSE)
# read sequence info from a fasta file with `parse_desc=TRUE` (default). `key=value`
# pairs are removed from `seq_desc` and parsed into columns with `key` as name
read_seqs(ex("emales/emales.fna"))
# read sequence info from samtools/seqkit style index
read_seqs(ex("emales/emales.fna.seqkit.fai"))
# read sequence info from multiple gff files
read_seqs(c(ex("emales/emales.gff"), ex("emales/emales-tirs.gff")))
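
The effect of `parse_desc = TRUE` described in the examples above can be approximated in base R. This is a rough sketch, not the parser gggenomes uses: `split_desc()` is a hypothetical helper and the description string is made up.

```r
# Sketch only: split `key=value` pairs out of a description string and keep
# the remaining words as the description. The input is made up.
desc <- "putative virophage length=26158 cluster=EMALE01"

# Hypothetical helper, not part of gggenomes.
split_desc <- function(desc) {
  tokens <- strsplit(desc, "[[:space:]]+")[[1]]
  # A token counts as a pair if it has exactly one "=" with text on both sides.
  is_pair <- grepl("^[^=]+=[^=]+$", tokens)
  pairs <- strsplit(tokens[is_pair], "=", fixed = TRUE)
  values <- vapply(pairs, `[`, character(1), 2)
  names(values) <- vapply(pairs, `[`, character(1), 1)
  # Remaining words stay in seq_desc; each key becomes its own named element.
  c(
    list(seq_desc = paste(tokens[!is_pair], collapse = " ")),
    as.list(values)
  )
}

parsed <- split_desc(desc)
```

In this sketch all parsed values stay character strings; whether and how the real function converts types is not covered here.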