assemble_gene_set | R Documentation |
Sometimes genes may be repeated within a genome, such as chloroplast genes in the inverted repeat. If 'drop_dups' is set to TRUE, these will be excluded from the results (with a warning). This is useful for assembling chloroplast gene matrices of single-copy genes.
assemble_gene_set(accessions, genes, parallel = FALSE,
drop_dups = TRUE)
accessions |
Vector of genbank accession numbers of (partial) genomes including the genes of interest |
genes |
Vector of gene names to assemble |
parallel |
Logical; should |
drop_dups |
Logical; should genes with duplicate copies be excluded from the results? |
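The effect of 'drop_dups' can be illustrated with a small base-R sketch. This is not the function's actual implementation: the table `hits` and its values are made up for illustration. The sketch also assumes that a gene is dropped for all accessions if any one accession has more than one copy, which may differ from the real rule.

```r
# Hypothetical annotation table: one row per gene feature found in a genome.
# These values are invented for illustration; they are not real GenBank output.
hits <- data.frame(
  accession = c("acc1", "acc1", "acc1", "acc1", "acc2", "acc2"),
  gene      = c("accD", "atpA", "psbA", "psbA", "accD", "atpA"),
  stringsAsFactors = FALSE
)

# Count the copies of each gene within each accession.
copies <- table(hits$accession, hits$gene)

# Assumption: a gene is treated as duplicated if any accession has > 1 copy.
dup_genes <- colnames(copies)[apply(copies > 1, 2, any)]

# Warn about the genes that are about to be excluded.
if (length(dup_genes) > 0) {
  warning("Dropping duplicated genes: ", paste(dup_genes, collapse = ", "))
}

# Keep only the rows belonging to single-copy genes.
single_copy <- hits[!hits$gene %in% dup_genes, ]
```

Here `psbA` appears twice in one accession, so it is the only gene removed.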
When running in parallel ('parallel' option is set to TRUE), it may be necessary to set the parallel backend first using plan, or the code will still run sequentially.
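A minimal sketch of setting a backend before a parallel run, assuming that 'plan' refers to `future::plan()` from the 'future' package (this page does not name the package). The worker count of 2 is arbitrary, and the call to assemble_gene_set() is shown only as a comment because it needs network access.

```r
# Assumption: 'plan' in the note above is future::plan() from the 'future' package.
if (requireNamespace("future", quietly = TRUE)) {
  # Switch to a parallel backend; plan() returns the previous plan invisibly.
  old_plan <- future::plan(future::multisession, workers = 2)

  # With a backend registered, a call such as
  #   assemble_gene_set(accessions, genes, parallel = TRUE)
  # would have workers available instead of running sequentially.
  n_workers <- future::nbrOfWorkers()

  # Restore the previous plan when done.
  future::plan(old_plan)
}
```

Restoring the previous plan afterwards keeps the change from affecting later code in the same session.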
List. Each item in the list is a gene, which contains a list of sequences of class DNAbin.
## Not run:
# KP136830 is the GenBank accession no. for the Cystopteris protrusa plastome
# https://www.ncbi.nlm.nih.gov/nuccore/KP136830
# KY427346 is the GenBank accession no. for the Diplazium striatum plastome
# https://www.ncbi.nlm.nih.gov/nuccore/KY427346
# Assemble a list of DNA sequences for three genes from these two species.
# Note that psbA is duplicated since it is in the Inverted Repeat.
assemble_gene_set(
c("KP136830", "KY427346"),
c("accD", "atpA", "psbA", "not_a_proper_gene_name")
)
assemble_gene_set(
c("KP136830", "KY427346"),
c("accD", "atpA", "psbA", "not_a_proper_gene_name"),
drop_dups = FALSE
)
## End(Not run)