gl.report.pa: Reports private alleles (and fixed alleles) per pair of...

gl.report.pa    R Documentation

Reports private alleles (and fixed alleles) per pair of populations

Description

This function reports private alleles in one population compared with a second population, for all populations taken pairwise. It also reports a count of fixed allelic differences and the mean absolute allele frequency differences (AFD) between pairs of populations.

Usage

gl.report.pa(
  x,
  x2 = NULL,
  method = "pairwise",
  loc_names = FALSE,
  plot.out = TRUE,
  font_plot = 14,
  map.interactive = FALSE,
  palette_discrete = discrete_palette,
  save2tmp = FALSE,
  verbose = NULL
)

Arguments

x

Name of the genlight object containing the SNP data [required].

x2

A second genlight object to compare against x. Both objects must have the same number of SNPs [default NULL].

method

Method to calculate private alleles: 'pairwise' comparison or compare each population against the rest 'one2rest' [default 'pairwise'].

loc_names

Whether the names of loci with private alleles and fixed differences should be reported. If TRUE, locus names are returned in a list [default FALSE].

plot.out

Specify if Sankey plot is to be produced [default TRUE].

font_plot

Numeric font size in pixels for the node text labels [default 14].

map.interactive

Specify whether an interactive map showing private alleles between populations is to be produced [default FALSE].

palette_discrete

A discrete palette for the color of populations or a list with as many colors as there are populations in the dataset [default discrete_palette].

save2tmp

If TRUE, saves any ggplots and listings to the session temporary directory (tempdir) [default FALSE].

verbose

Verbosity: 0, silent, fatal errors only; 1, flag function begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2 or as specified using gl.set.verbosity].

Details

Note that the number of private alleles between two populations is not a symmetric dissimilarity measure: the count for pop1 compared with pop2 can differ from the count for pop2 compared with pop1.

If no x2 is provided, the function uses the pop(gl) hierarchy to determine pairs of populations, otherwise it runs a single comparison between x and x2.

Hint: in case you want to run comparisons between individuals (assuming individual names are unique), you can simply redefine your population names with your individual names, as below:

pop(gl) <- indNames(gl)

Definition of fixed and private alleles

The table below shows the possible cases of allele frequencies between two populations (0 = homozygote for Allele 1, x = both Alleles are present, 1 = homozygote for Allele 2).

  • p: cases where there is a private allele in pop1 compared to pop2 (but not vice versa)

  • f: cases where there is a fixed allele in pop1 (and pop2, as those cases are symmetric)

                pop1
             0     x     1
        0    -     p     p,f
pop2    x    -     -     -
        1    p,f   p     -
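
The table above can be reproduced with a minimal base R sketch. It assumes each population is summarised at a locus by the frequency of Allele 2 (0, 1, or anything in between); the helper names allele_state and classify_locus are hypothetical and are not part of dartR.

```r
# State of a locus within a population, from the frequency of Allele 2:
# "0" = only Allele 1 present, "1" = only Allele 2 present, "x" = both present.
allele_state <- function(freq) {
  ifelse(freq == 0, "0", ifelse(freq == 1, "1", "x"))
}

# Classify a locus for pop1 relative to pop2 (hypothetical helper).
classify_locus <- function(freq1, freq2) {
  s1 <- allele_state(freq1)
  s2 <- allele_state(freq2)
  # Fixed difference: the populations are fixed for opposite alleles.
  fixed <- (s1 == "0" & s2 == "1") | (s1 == "1" & s2 == "0")
  # Private allele in pop1: pop2 is fixed for one allele and pop1 carries the other.
  private <- (s2 == "0" & s1 != "0") | (s2 == "1" & s1 != "1")
  ifelse(fixed, "p,f", ifelse(private, "p", "-"))
}

# Rebuild the 3 x 3 table: rows are pop2 states, columns are pop1 states.
states <- c(0, 0.5, 1)
grid <- outer(states, states, function(f2, f1) classify_locus(f1, f2))
dimnames(grid) <- list(pop2 = c("0", "x", "1"), pop1 = c("0", "x", "1"))
print(grid)
```

Every fixed difference is also counted as a private allele here, which is why those cells read p,f rather than f alone.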

The absolute allele frequency difference (AFD) in this function is a simple differentiation metric displaying intuitive properties which provides a valuable alternative to FST. For details about its properties and how it is calculated see Berner (2019).
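
As an illustration, for a biallelic SNP the AFD between two populations reduces to the absolute difference in the frequency of one allele, and the reported value is the mean across loci. The sketch below uses small made-up genotype matrices coded 0/1/2 (copies of Allele 2); ignoring missing genotypes with na.rm = TRUE is an assumption and may not match how gl.report.pa treats missing data.

```r
# Hypothetical genotype matrices: rows = individuals, columns = loci,
# coded as the number of copies of Allele 2 (0, 1 or 2; NA = missing).
pop1 <- matrix(c(0, 0, 1,
                 0, 2, 1,
                 0, 2, NA), nrow = 3, byrow = TRUE)
pop2 <- matrix(c(2, 0, 1,
                 2, 0, 1), nrow = 2, byrow = TRUE)

# Frequency of Allele 2 per locus (assumption: missing genotypes are skipped).
allele2_freq <- function(g) colMeans(g, na.rm = TRUE) / 2

p1 <- allele2_freq(pop1)
p2 <- allele2_freq(pop2)

# Per-locus AFD for biallelic loci, then the mean across loci.
afd_per_locus <- abs(p1 - p2)
mean_afd <- mean(afd_per_locus)
print(afd_per_locus)
print(mean_afd)
```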

The function also reports an estimate of the lower bound of the number of undetected private alleles using the Good-Turing frequency formula. This formula was originally developed for cryptography; in an ecological context it estimates the true frequencies of rare species in a single assemblage from an incomplete sample of individuals. The approach is described in Chao et al. (2017), and this function uses their Equation 2c. The estimates are reported in the output table as Chao1 and Chao2.
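
To give a feel for this kind of estimate, the sketch below implements the widely used Chao1-type lower bound on the number of undetected classes, based on the counts of classes seen exactly once (f1) and exactly twice (f2) in a sample of n observations. Treat it as an assumption-laden illustration: it has not been checked against the dartR source, the function name is hypothetical, and how gl.report.pa maps private alleles onto f1, f2 and n is not specified here.

```r
# Chao1-type lower bound on the number of undetected classes (hypothetical helper).
# f1 = classes observed exactly once, f2 = classes observed exactly twice,
# n  = total number of observations in the sample.
chao_undetected_lower_bound <- function(f1, f2, n) {
  stopifnot(n > 0, f1 >= 0, f2 >= 0)
  correction <- (n - 1) / n
  if (f2 > 0) {
    correction * f1^2 / (2 * f2)
  } else {
    # Bias-corrected form used when no class was observed exactly twice.
    correction * f1 * (f1 - 1) / 2
  }
}

est_with_doubletons <- chao_undetected_lower_bound(f1 = 10, f2 = 5, n = 20)
est_without_doubletons <- chao_undetected_lower_bound(f1 = 4, f2 = 0, n = 10)
print(est_with_doubletons)
print(est_without_doubletons)
```

The estimate grows with the number of singletons relative to doubletons: many classes seen only once suggest that many more were not seen at all.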

This function uses a Sankey diagram to visualize patterns of private alleles between populations. The diagram displays flows (private alleles) between nodes (populations). Links are drawn as arcs whose width is proportional to the size of the flow (the number of private alleles).

If save2tmp = TRUE, the resultant plot(s) and tabulation(s) are saved to the session's temporary directory.

Value

A data.frame with one row per pair of populations. Each row shows the number of individuals in each population, the number of loci with fixed differences (the same for both populations), the number of private alleles in pop1 compared to pop2 and vice versa, and the mean absolute allele frequency difference (AFD) across loci. If loc_names = TRUE, the names of loci with private alleles and fixed differences are reported in a list in addition to the data.frame.

Author(s)

Custodian: Bernd Gruber – Post to https://groups.google.com/d/forum/dartr

References

  • Berner, D. (2019). Allele frequency difference AFD – an intuitive alternative to FST for quantifying genetic population differentiation. Genes, 10(4), 308.

  • Chao, A., et al. (2017). Deciphering the enigma of undetected species, phylogenetic, and functional diversity based on Good-Turing theory. Ecology, 98(11), 2914-2929.

See Also

gl.list.reports, gl.print.reports

Other report functions: gl.report.bases(), gl.report.callrate(), gl.report.diversity(), gl.report.hamming(), gl.report.heterozygosity(), gl.report.hwe(), gl.report.ld.map(), gl.report.locmetric(), gl.report.maf(), gl.report.monomorphs(), gl.report.overshoot(), gl.report.parent.offspring(), gl.report.rdepth(), gl.report.reproducibility(), gl.report.secondaries(), gl.report.sexlinked(), gl.report.taglength()

Examples

out <- gl.report.pa(platypus.gl)
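
A further sketch, using only arguments listed under Usage, compares each population against all others and asks for locus names. It assumes dartR is installed and is skipped otherwise; the structure of the returned object is inspected with str() rather than assumed.

```r
# Run only if dartR is available, so the sketch does not fail without it.
has_dartR <- requireNamespace("dartR", quietly = TRUE)

if (has_dartR) {
  library(dartR)
  # Compare each population against the rest, return locus names,
  # and suppress the Sankey plot and progress messages.
  out_one2rest <- gl.report.pa(
    platypus.gl,
    method = "one2rest",
    loc_names = TRUE,
    plot.out = FALSE,
    verbose = 0
  )
  # Inspect the top-level structure of the result.
  str(out_one2rest, max.level = 1)
}
```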

dartR documentation built on June 8, 2023, 6:48 a.m.