prepare_bold_res: Prepare raw BOLD specimen + sequence dataset


Description

Filters and mutates a BOLD dataset to produce a curated data frame in which each row is an individual specimen and each column is a piece of specimen information. It also adds a new column, sequence, containing the fasta sequence of each specimen as a string.

Usage

prepare_bold_res(bold_res, marker_code = "", species_names = TRUE,
coordinates = TRUE, ambiguities = TRUE, min_length = 0, max_length = 800)

Arguments

bold_res

A list of two elements, as returned by the bold_seqspec command from the bold package:

  • data: specimen information (spatial coordinates, taxonomy...)

  • fasta: DNA barcode sequences.
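To make the expected input shape concrete, here is a minimal sketch of a toy object with the same two elements. The column names (processid, species_name, lat, lon, markercode) and all values are assumptions chosen for illustration; a real bold_seqspec result carries many more fields.

```r
# Toy stand-in for a bold_seqspec result (illustrative values only).
toy_bold_res <- list(
  # "data": one row per specimen; column names are assumed for this sketch
  data = data.frame(
    processid    = c("P1", "P2", "P3"),
    species_name = c("Centropyge bicolor", "", "Pomacanthus imperator"),
    lat          = c(-17.5, NA, 12.1),
    lon          = c(-149.8, NA, 43.2),
    markercode   = c("COI-5P", "COI-5P", "16S"),
    stringsAsFactors = FALSE
  ),
  # "fasta": one sequence per specimen, named by process ID
  fasta = list(P1 = "ACGTACGT", P2 = "ACGTNNGT", P3 = "ACGT")
)

# Inspect the top-level structure
str(toy_bold_res, max.level = 1)
```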

marker_code

(character) If not empty, only specimens whose data$markercode field matches marker_code are kept. By default, no markercode filtering is applied.

species_names

a logical value indicating whether specimens with no species name information should be removed.

coordinates

a logical value indicating whether specimens with no latitude or longitude information should be removed.

ambiguities

a logical value indicating whether specimens whose DNA sequence contains IUPAC ambiguity codes should be removed.
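As an illustration of what an ambiguity filter can look like, the sketch below flags sequences containing any IUPAC ambiguity code (R, Y, S, W, K, M, B, D, H, V, N). This is an assumption about the kind of check involved, not the package's actual implementation; the sequences and the has_ambiguity name are hypothetical.

```r
# Hypothetical sequences, named by process ID
seqs <- c(P1 = "ACGTACGT", P2 = "ACGTNNGT", P3 = "acgtrcgt")

# TRUE where a sequence contains at least one IUPAC ambiguity code
# (case-insensitive thanks to toupper)
has_ambiguity <- grepl("[RYSWKMBDHVN]", toupper(seqs))

# Keep only unambiguous sequences
clean_seqs <- seqs[!has_ambiguity]
print(clean_seqs)
```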

min_length

numeric. Minimum length of a DNA sequence. Default value: 0 bp.

max_length

numeric. Maximum length of a DNA sequence. Default value: 800 bp.
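A length filter of this kind can be sketched in base R as follows. Whether the function treats min_length and max_length as inclusive bounds is not stated here, so the inclusive comparison below is an assumption, and the sequences are made up.

```r
# Hypothetical sequences, named by process ID
seqs <- c(P1 = "ACGTACGT", P2 = "ACG", P3 = "ACGTACGTACGTACGT")

min_length <- 4
max_length <- 10

# Sequence lengths in base pairs
seq_len_bp <- nchar(seqs)

# Assumed inclusive bounds: min_length <= length <= max_length
keep <- seq_len_bp >= min_length & seq_len_bp <= max_length
kept_seqs <- seqs[keep]
print(kept_seqs)
```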

Value

a data.frame with the same fields as resBold$data$markercode, plus a new column sequence containing the fasta sequences as strings.
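One way to picture the added sequence column is a lookup of each specimen's sequence by its identifier. The sketch below assumes the sequences are stored in a list named by a processid column; both that column name and the matching rule are assumptions for illustration, not the package's confirmed behaviour.

```r
# Hypothetical specimen table and named list of sequences
df <- data.frame(processid = c("P1", "P2", "P3"), stringsAsFactors = FALSE)
fasta <- list(P1 = "ACGT", P3 = "GGCC")

# Look up each specimen's sequence; specimens without one get NA
df$sequence <- vapply(
  df$processid,
  function(id) {
    s <- fasta[[id]]
    if (is.null(s)) NA_character_ else s
  },
  character(1),
  USE.NAMES = FALSE
)
print(df)
```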

Author(s)

Pierre-Edouard GUERIN, Stephanie MANEL

Examples

## load BOLD specimen and sequence data with matching taxa "Pomacanthidae"
data(requestPomacanthidaeBOLD)
## filter and mutate
preparedResBold <- prepare_bold_res(resBold,
                                    marker_code = "COI-5P",
                                    species_names = TRUE,
                                    coordinates = TRUE,
                                    ambiguities = TRUE,
                                    min_length = 420,
                                    max_length = 720)

Grelot/geogendivr documentation built on Sept. 3, 2020, 6:25 p.m.