checkDNAbcd: Evaluation of a reference library of aligned DNA barcodes


Description

This function provides an overview of the content of a reference library of aligned DNA barcodes (Sonet et al. 2013). It calculates all pairwise distances and delivers an output that can be used by the function adhocTHR.

Usage

checkDNAbcd(seq, DistModel = "K80")

Arguments

seq

an object of class "DNAbin".

DistModel

"K80" (for Kimura two-parameter) or "raw" (for p-distances) or any other nucleotide substitution model available in the function "dist.dna" (Paradis et al. 2004).
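To illustrate the difference between the two named options, the following minimal sketch computes a p-distance and a Kimura two-parameter distance by hand for one pair of aligned sequences. It is only an illustration in base R, not the code used by "dist.dna" or checkDNAbcd: the two toy sequences and all variable names are hypothetical, and gaps and ambiguous bases are assumed to be absent.

```r
# Two aligned sequences of equal length (hypothetical toy data).
s1 <- strsplit("ACGTACGTAC", "")[[1]]
s2 <- strsplit("ACGTACGTGA", "")[[1]]

purines <- c("A", "G")
differs <- s1 != s2
# A transition stays within one base class (purine <-> purine or
# pyrimidine <-> pyrimidine); a transversion switches class.
same_class <- (s1 %in% purines) == (s2 %in% purines)

P <- mean(differs & same_class)    # proportion of transitions
Q <- mean(differs & !same_class)   # proportion of transversions

# "raw": proportion of sites that differ (p-distance).
p_dist <- P + Q

# "K80": Kimura two-parameter correction for multiple substitutions.
k80_dist <- -0.5 * log((1 - 2 * P - Q) * sqrt(1 - 2 * Q))

print(c(raw = p_dist, K80 = k80_dist))
```

For sequences that differ at any site, the K80 distance is larger than the p-distance because it corrects for substitutions that are no longer visible in the alignment.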

Details

Sequence labels of "seq" must begin with the two parts of the species name, followed by any additional information, with all character strings separated by underscores: ">species_name_any_additional_information". For example: ">Bactrocera_amplexa_Kenya_voucher1052_JEMU".
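The label convention can be checked before calling the function. The sketch below splits labels on underscores and recovers both parts of the species name. It is a hedged illustration in base R, not the parsing code of checkDNAbcd: the second and third labels and the column names of "parsed" are hypothetical, and the leading ">" of the FASTA header is assumed to have been dropped when the sequences were read into R.

```r
# The first label is the one given above; the others are hypothetical.
labels <- c("Bactrocera_amplexa_Kenya_voucher1052_JEMU",
            "Bactrocera_amplexa_Kenya_voucher1053_JEMU",
            "Ceratitis_capitata_Spain_voucher0007_XYZ")

# Split every label at the underscores.
parts <- strsplit(labels, "_", fixed = TRUE)

# The first two strings are taken as the two parts of the species name.
genus   <- vapply(parts, `[`, character(1), 1)
epithet <- vapply(parts, `[`, character(1), 2)

parsed <- data.frame(genus = genus,
                     epithet = epithet,
                     species = paste(genus, epithet, sep = "_"),
                     label = labels,
                     stringsAsFactors = FALSE)

# Number of sequences per species.
print(table(parsed$species))
```

A label that does not contain at least two underscore-separated strings would yield NA in the "epithet" column, which is a quick way to spot labels that do not follow the expected structure.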

Value

checkDNAbcd returns a list of 6 components:

mylabels

a data.frame providing both parts of each species name and the complete label of each sequence (as extracted from the first argument).

listsp

a data.frame listing the number of sequences (Nseq) and haplotypes (Nhap) for each species of the reference library.

DNAlength

a numeric vector giving the length of each DNA sequence.

dist

a matrix of all distances obtained by pairwise comparison.

spdist

a list of all pairwise interspecific distances (inter) and all pairwise intraspecific distances (intra).

seq

an object of class "DNAbin" with all sequences in the reference library (= first argument).
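The relation between the components dist and spdist can be pictured with a small example: each pairwise distance is intraspecific when both sequences belong to the same species and interspecific otherwise. The sketch below is an illustration in base R under stated assumptions, not the implementation used by checkDNAbcd; the species names, the distance values and all variable names are hypothetical.

```r
# Hypothetical species assignment of four sequences.
species <- c("sp_A", "sp_A", "sp_B", "sp_B")

# Hypothetical symmetric matrix of pairwise distances.
d <- matrix(c(0.00, 0.01, 0.10, 0.12,
              0.01, 0.00, 0.11, 0.09,
              0.10, 0.11, 0.00, 0.02,
              0.12, 0.09, 0.02, 0.00),
            nrow = 4, byrow = TRUE,
            dimnames = list(species, species))

# TRUE where the two compared sequences belong to the same species.
same_species <- outer(species, species, "==")

# Use the upper triangle so that each pair is counted once and
# self-comparisons on the diagonal are excluded.
pairs <- upper.tri(d)

intra <- d[pairs & same_species]    # intraspecific distances
inter <- d[pairs & !same_species]   # interspecific distances

print(list(intra = intra, inter = inter))
```

In this toy example the largest intraspecific distance is smaller than the smallest interspecific distance. In a real reference library the two distributions may overlap.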

Author(s)

Gontran Sonet

References

Paradis E, Claude J, Strimmer K (2004) APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20: 289-290.

Sonet G, Jordaens K, Nagy ZT, Breman FC, De Meyer M, Backeljau T, Virgilio M (2013) Adhoc: an R package to calculate ad hoc distance thresholds for DNA barcoding identification. ZooKeys 365: 329-336. http://zookeys.pensoft.net/articles.php?id=3057.

See Also

adhocTHR

Examples

# Load the package and the example reference library
library(adhoc)
data(tephdata)
out1 <- checkDNAbcd(tephdata)

# Plot the distribution of sequence lengths
hist(out1$DNAlength, main = "Seq. lengths", xlab = "Seq. length (bp)")

# Plot the distribution of pairwise interspecific distances
hist(out1$spdist$inter, main = "Intersp. dist.", xlab = "Distance", col = "#0000ff99")

# Plot the distribution of pairwise intraspecific distances
hist(out1$spdist$intra, main = "Intrasp. dist.", xlab = "Distance", col = "#0000ff22")

# Plot the distributions of pairwise intra- and interspecific distances together
hist(out1$spdist$inter, main = "Intra- & intersp. dist.", xlab = "Distance", col = "#0000ff99")
hist(out1$spdist$intra, add = TRUE, col = "#0000ff22")

# Same as the previous example, zoomed in on the intraspecific values
# (the first histogram sets the axis range)
hist(out1$spdist$intra, main = "Zoom intra- & intersp. dist.", xlab = "Distance", col = "#0000ff99")
hist(out1$spdist$inter, add = TRUE, col = "#0000ff22")

adhoc documentation built on May 2, 2019, 2:36 a.m.