mat_gen_dist: Compute a pairwise matrix of genetic distances between...

View source: R/mat_gen_dist.R

mat_gen_dist    R Documentation

Compute a pairwise matrix of genetic distances between populations

Description

The function computes a pairwise matrix of genetic distances between populations, using one of several distance formulas.

Usage

mat_gen_dist(x, dist = "basic", null_val = FALSE)

Arguments

x

An object of class genind that contains the multilocus genotypes (format 'locus') of the individuals as well as their populations.

dist

A character string indicating the method used to compute the multilocus genetic distance between populations:

  • If dist = 'basic' (default), the multilocus genetic distance is computed using a Euclidean genetic distance formula (Excoffier et al., 1992).

  • If dist = 'weight', the multilocus genetic distance is computed as in Fortuna et al. (2009). It is a Euclidean genetic distance that gives more weight to rare alleles.

  • If dist = 'PG', the multilocus genetic distance is computed as in the popgraph::popgraph function, following several steps of PCA and SVD (Dyer and Nason, 2004).

  • If dist = 'DPS', the genetic distance is equal to 1 minus the proportion of shared alleles (Bowcock, 1994).

  • If dist = 'FST', the genetic distance is the pairwise FST (Weir and Cockerham, 1984).

  • If dist = 'FST_lin', the genetic distance is the linearised pairwise FST (Weir and Cockerham, 1984), with FST_lin = FST/(1-FST).

  • If dist = 'PCA', the genetic distance is computed following a PCA of the matrix of allelic frequencies by population. It is a Euclidean genetic distance between populations in the multidimensional space defined by all the independent principal components.

  • If dist = 'GST', the genetic distance is G'ST (Hedrick, 2005). Available in graph4lg <= 1.6.0 only, because it relied on the diveRsity package.

  • If dist = 'D', the genetic distance is Jost's D (Jost, 2008). Available in graph4lg <= 1.6.0 only, because it relied on the diveRsity package.
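As a small worked illustration of the linearisation used when dist = 'FST_lin', the following base R sketch applies FST/(1-FST) to a made-up pairwise FST matrix. The values are hypothetical, and this is not the package's internal code.

```r
# Made-up pairwise FST values between three populations (illustration only)
fst <- matrix(c(0,    0.05, 0.20,
                0.05, 0,    0.10,
                0.20, 0.10, 0),
              nrow = 3, byrow = TRUE)

# Linearised FST: FST / (1 - FST), applied element-wise
fst_lin <- fst / (1 - fst)
```

The transformation leaves zeros unchanged and stretches larger FST values more than smaller ones.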

null_val

(optional) Logical. Should negative and zero FST, FST_lin, GST or D values be replaced by half the minimum positive value? This option makes it possible to compute Gabriel graphs from these "distances". Default is null_val = FALSE. This option only applies when dist is 'FST', 'FST_lin', 'GST' or 'D'.
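The replacement rule described for null_val can be sketched in base R as follows. This only illustrates the stated rule and is not the package's actual code: the helper name replace_non_positive and the toy matrix are hypothetical, and leaving the diagonal at 0 is an assumption.

```r
# Hypothetical helper illustrating the rule stated for null_val:
# negative and zero off-diagonal values become half the minimum positive value.
replace_non_positive <- function(m) {
  off_diag <- row(m) != col(m)          # skip self-distances (assumption)
  min_pos <- min(m[off_diag & m > 0])   # smallest strictly positive value
  m[off_diag & m <= 0] <- min_pos / 2
  m
}

# Toy pairwise FST matrix (made-up values)
fst <- matrix(c(0,     -0.01, 0.04,
                -0.01, 0,     0.10,
                0.04,  0.10,  0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

fst_fixed <- replace_non_positive(fst)
```

After the replacement, every pair of distinct populations has a strictly positive value, which is what a Gabriel graph construction needs.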

Details

Negative values are converted into 0. Euclidean genetic distance d_{ij} between population i and j is computed as follows:

d_{ij}^{2} = ∑_{k=1}^{n} (x_{ki} - x_{kj})^{2}

where x_{ki} is the allelic frequency of allele k in population i and n is the total number of alleles. Note that when dist = 'weight', the formula becomes

d_{ij}^{2} = ∑_{k=1}^{n} (1/(K*p_{k}))(x_{ki} - x_{kj})^{2}

where K is the number of alleles at the locus of the allele k and p_{k} is the frequency of the allele k in all populations. Note that when dist = 'PCA', n is the number of conserved independent principal components and x_{ki} is the value taken by the principal component k in population i.
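The two Euclidean formulas above can be sketched in base R on a toy allele-frequency table. This is a minimal illustration, not the package's implementation: the table freq, the vector K and the way p_k is obtained (a simple mean over populations) are assumptions made for the example.

```r
# Toy allele-frequency table: rows = populations, columns = alleles.
# Two loci with 2 and 3 alleles; all values are made up for illustration.
freq <- rbind(
  pop1 = c(0.5, 0.5, 0.2, 0.3, 0.5),
  pop2 = c(0.7, 0.3, 0.1, 0.6, 0.3),
  pop3 = c(0.9, 0.1, 0.4, 0.4, 0.2)
)
colnames(freq) <- c("L1.a", "L1.b", "L2.a", "L2.b", "L2.c")

# dist = 'basic': d_ij = sqrt(sum_k (x_ki - x_kj)^2)
d_basic <- as.matrix(dist(freq, method = "euclidean"))

# dist = 'weight': each allele k is weighted by 1 / (K * p_k)
K   <- c(2, 2, 3, 3, 3)   # number of alleles at the locus of each allele
p_k <- colMeans(freq)     # assumption: overall frequency = mean over populations
w   <- 1 / (K * p_k)

# Scaling each column by sqrt(w) turns the weighted sum into a plain
# Euclidean distance on the rescaled table.
d_weight <- as.matrix(dist(sweep(freq, 2, sqrt(w), `*`)))
```

Because the weight is inversely proportional to p_k, differences at rare alleles contribute more to d_weight than the same differences at common alleles.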

Value

A pairwise matrix of genetic distances between populations (an object of class matrix)

Author(s)

P. Savary

References

\insertRef{bowcock1994high}{graph4lg}

\insertRef{excoffier1992analysis}{graph4lg}

\insertRef{dyer2004population}{graph4lg}

\insertRef{fortuna2009networks}{graph4lg}

\insertRef{weir1984estimating}{graph4lg}

\insertRef{hedrick2005standardized}{graph4lg}

\insertRef{jost2008gst}{graph4lg}

Examples

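# Load the example genind object shipped with graph4lg
data(data_ex_genind)
x <- data_ex_genind
# Pairwise Euclidean genetic distances between populations (default method)
D <- mat_gen_dist(x = x, dist = "basic")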

graph4lg documentation built on Feb. 16, 2023, 5:43 p.m.