minePlastome: Read and download targeted loci from plastome sequences in...

View source: R/minePlastome.R

minePlastome    R Documentation

Read and download targeted loci from plastome sequences in GenBank

Description

Built on the genbankr package, this function connects to the GenBank database, reads plastome sequences from the supplied accession numbers, extracts and formats the specified targeted loci, and writes them to a FASTA-format file.

Usage

minePlastome(genbank = NULL,
             taxon = NULL,
             voucher = NULL,
             CDS = TRUE,
             genes = NULL,
             verbose = TRUE,
             dir = "RESULTS_minePlastome")

Arguments

genbank

A vector of GenBank accession numbers for the plastome sequences to be mined for the targeted loci.

taxon

A vector of taxon names linked to the plastome sequences. If not supplied, the function defaults to the taxon name originally provided for the plastome in GenBank.

voucher

A vector of voucher information linked to the plastome sequences. If supplied, it is appended immediately after the taxon name of each downloaded targeted sequence.

CDS

Logical. If TRUE, the targeted loci are treated as protein-coding genes; if FALSE, the function treats the entered gene names as other regions, e.g. introns or intergenic spacers.

genes

A vector of one or more gene names as annotated in GenBank.

verbose

Logical. If FALSE, the messages reporting each step of the GenBank search are not printed in full in the console.

dir

Path to the directory where the mined DNA sequences will be saved as a FASTA-format file. By default, a directory named RESULTS_minePlastome is created and the sequences are saved in a subfolder named after the current date.
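
As a rough illustration of how the taxon, voucher, and dir arguments relate to the output, the sketch below builds a sequence label and a dated output path in base R. The helper names make_label and dated_dir, the underscore separator, the ISO date format, and the placeholder taxon and voucher values are all assumptions made for illustration. This documentation does not state the exact label or subfolder format that minePlastome uses.

```r
# Hypothetical helpers for illustration only; they are not part of catGenes.
make_label <- function(taxon, voucher = NULL) {
  # Assumed format: spaces become underscores, voucher follows the taxon name.
  label <- gsub(" ", "_", taxon)
  if (!is.null(voucher)) {
    label <- paste(label, gsub(" ", "_", voucher), sep = "_")
  }
  label
}

dated_dir <- function(dir = "RESULTS_minePlastome", date = Sys.Date()) {
  # Assumed layout: <dir>/<date>, with the date written as YYYY-MM-DD.
  file.path(dir, format(date, "%Y-%m-%d"))
}

# Placeholder inputs, not real taxa or vouchers.
labels <- make_label(c("Genus speciesA", "Genus speciesB"),
                     c("Collector 101", "Collector 102"))
out_dir <- dated_dir(date = as.Date("2024-03-14"))
```

Keeping the three input vectors (genbank, taxon, voucher) the same length and in the same order, as in the Examples section, makes each label line up with its accession number.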

Value

A FASTA-format file of DNA sequences, saved to disk.
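
One quick way to check a saved file is to count its FASTA records, since each record begins with a header line starting with ">". The following is a minimal base R sketch. The helper count_fasta_records is hypothetical and not part of catGenes, and the demonstration uses a temporary file with placeholder sequences instead of real minePlastome output.

```r
# Hypothetical helper for illustration only; it is not part of catGenes.
count_fasta_records <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # Each FASTA record begins with a header line starting with ">".
  sum(startsWith(lines, ">"))
}

# Self-contained demonstration using a temporary file and placeholder data.
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">Genus_speciesA", "ATGGCC", ">Genus_speciesB", "ATGGCA"), tmp)
n_records <- count_fasta_records(tmp)
```

Comparing this count with the number of accession numbers passed to genbank is a simple way to spot plastomes for which a locus was not retrieved.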

Author(s)

Domingos Cardoso

Examples

## Not run: 
library(catGenes)
library(dplyr)

data(GenBank_accessions)

GenBank_plastomes <- GenBank_accessions %>%
  filter(!is.na(Plastome)) %>%
  select(c("Species", "Voucher", "Plastome"))

minePlastome(genbank = GenBank_plastomes$Plastome,
             taxon = GenBank_plastomes$Species,
             voucher = GenBank_plastomes$Voucher,
             CDS = TRUE,
             genes = c("matK", "rbcL"),
             verbose = TRUE,
             dir = "RESULTS_minePlastome")

## End(Not run)


domingoscardoso/catGenes documentation built on March 14, 2024, 9:21 p.m.