prepare_bold_res: Prepare raw BOLD specimen and DNA sequences dataset


Description

Filters and transforms a raw BOLD dataset into a curated data.frame in which each row is an individual specimen and each column holds one piece of specimen information. A new column, sequence, is added with each specimen's FASTA sequence stored as a character string.

Usage

prepare_bold_res(
  bold_res,
  marker_code = "",
  species_names = TRUE,
  coordinates = TRUE,
  ambiguities = TRUE,
  min_length = 0,
  max_length = 800
)

Arguments

bold_res

A list of two lists, as returned by the bold_seqspec function from the bold package:
* Specimen information (spatial coordinates, taxonomy...)
* DNA barcode sequences

marker_code

character. If not empty, only specimens whose data$markercode field matches marker_code are kept. By default (empty string), no marker code filtering is applied.

species_names

logical. Whether specimens with no species name information should be removed. Default value: TRUE.

coordinates

logical. Whether specimens with no latitude or longitude information should be removed. Default value: TRUE.

ambiguities

logical. Whether specimens whose DNA sequence contains IUPAC ambiguity codes should be removed. Default value: TRUE.
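To illustrate what an ambiguity check can look like, here is a minimal base R sketch. It is not taken from the package source: has_ambiguity is a hypothetical helper, and the set of codes treated as ambiguous (R, Y, S, W, K, M, B, D, H, V, N) is an assumption based on the standard IUPAC nucleotide alphabet.

```r
# Hypothetical helper, not part of rgeogendiv: flags sequences containing
# at least one IUPAC ambiguity code (assumed set: R Y S W K M B D H V N).
has_ambiguity <- function(sequences) {
  # toupper() makes the check case-insensitive
  grepl("[RYSWKMBDHVN]", toupper(sequences))
}

sequences <- c("ACGTACGT", "ACGTNCGT", "acgtrcgt")
ambiguous <- has_ambiguity(sequences)

# Keep only the sequences made exclusively of A, C, G and T
clean_sequences <- sequences[!ambiguous]
```

With ambiguities = TRUE, specimens flagged this way would be the ones dropped from the result.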

min_length

numeric. Minimum length, in base pairs, of the DNA sequences to keep. Default value: 0 bp.

max_length

numeric. Maximum length, in base pairs, of the DNA sequences to keep. Default value: 800 bp.
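The filters described by the arguments above can be approximated on a toy data.frame as follows. This is a sketch under stated assumptions, not the package implementation: filter_specimens is a hypothetical function, the column names (species_name, lat, lon, markercode, sequence) are chosen for illustration, and the length bounds are assumed to be inclusive.

```r
# Toy stand-in for specimen data; column names and values are illustrative only.
specimens <- data.frame(
  species_name = c("Diplodus sargus", "", "Mullus barbatus", "Sparus aurata"),
  lat          = c(43.1, 42.5, NA, 36.8),
  lon          = c(5.9, 3.1, 4.2, -2.4),
  markercode   = c("COI-5P", "COI-5P", "COI-5P", "16S"),
  sequence     = c("ACGTACGTAC", "ACGT", "ACGTACGT", "ACGTACGTACGT"),
  stringsAsFactors = FALSE
)

# Hypothetical re-creation of the documented filters (ambiguity check omitted).
filter_specimens <- function(df, marker_code = "", species_names = TRUE,
                             coordinates = TRUE, min_length = 0,
                             max_length = 800) {
  keep <- rep(TRUE, nrow(df))
  # Marker filter is skipped when marker_code is the empty string
  if (nzchar(marker_code)) keep <- keep & df$markercode %in% marker_code
  # Drop rows with a missing or empty species name
  if (species_names) keep <- keep & !is.na(df$species_name) & nzchar(df$species_name)
  # Drop rows lacking either coordinate
  if (coordinates) keep <- keep & !is.na(df$lat) & !is.na(df$lon)
  # Assumed inclusive bounds on sequence length, in base pairs
  len <- nchar(df$sequence)
  keep <- keep & len >= min_length & len <= max_length
  df[keep, , drop = FALSE]
}

curated  <- filter_specimens(specimens, marker_code = "COI-5P", min_length = 5)
defaults <- filter_specimens(specimens)
```

Building one logical mask and subsetting once keeps each filter independent, so any of them can be switched off without affecting the others.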

Value

A data.frame with the same fields as resBold$data$markercode and an additional column containing the DNA sequences as character strings.


Grelot/rgeogendiv documentation built on Dec. 22, 2020, 5:51 a.m.