Auxiliary functions for genlight objects

Share:

Description

These functions provide facilities for usual computations using genlight objects. When ploidy varies across individuals, the outputs of these functions depend on whether the information units are individuals, or alleles within individuals (see details).

These functions are:

- glSum: computes the sum of the number of second allele in each SNP.

- glNA: computes the number of missing values in each SNP.

- glMean: computes the mean number of second allele in each SNP.

- glVar: computes the variance of the number of second allele in each SNP.

- glDotProd: computes dot products between (possibly centred/scaled) vectors of individuals - uses compiled C code - used by glPca.

Usage

1
2
3
4
5
6
glSum(x, alleleAsUnit = TRUE, useC = FALSE)
glNA(x, alleleAsUnit = TRUE)
glMean(x, alleleAsUnit = TRUE)
glVar(x, alleleAsUnit = TRUE)
glDotProd(x, center = FALSE, scale = FALSE, alleleAsUnit = FALSE,
                parallel = require("parallel"), n.cores = NULL)

Arguments

x

a genlight object

alleleAsUnit

a logical indicating whether alleles are considered as units (i.e., a diploid genotype equals two samples, a triploid, three, etc.) or whether individuals are considered as units of information.

center

a logical indicating whether SNPs should be centred to mean zero.

scale

a logical indicating whether SNPs should be scaled to unit variance.

useC

a logical indicating whether compiled C code should be used (TRUE) or not (FALSE, default).

parallel

a logical indicating whether multiple cores -if available- should be used for the computations (TRUE, default), or not (FALSE); requires the package parallel to be installed (see details); this option cannot be used alongside useCoption.

n.cores

if parallel is TRUE, the number of cores to be used in the computations; if NULL, then the maximum number of cores available on the computer is used.

Details

=== On the unit of information ===

In the cases where individuals can have different ploidy, computation of sums, means, etc. of allelic data depends on what we consider as a unit of information.

To estimate e.g. allele frequencies, unit of information can be considered as the allele, so that a diploid genotype contains two samples, a triploid individual, three samples, etc. In such a case, all computations are done directly on the number of alleles. This corresponds to alleleAsUnit = TRUE.

However, when the focus is put on studying differences/similarities between individuals, the unit of information is the individual, and all genotypes possess the same information no matter what their ploidy is. In this case, computations are made after standardizing individual genotypes to relative allele frequencies. This corresponds to alleleAsUnit = FALSE.

Note that when all individuals have the same ploidy, this distinction does not hold any more.

Value

A numeric vector containing the requested information.

Author(s)

Thibaut Jombart t.jombart@imperial.ac.uk

See Also

- genlight: class of object for storing massive binary SNP data.

- dapc: Discriminant Analysis of Principal Components.

- glPca: PCA for genlight objects.

- glSim: a simple simulator for genlight objects.

- glPlot: plotting genlight objects.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
## Not run: 
x <- new("genlight", list(c(0,0,1,1,0), c(1,1,1,0,0,1), c(2,1,1,1,1,NA)))
x
as.matrix(x)
ploidy(x)

## compute statistics - allele as unit ##
glNA(x)
glSum(x)
glMean(x)

## compute statistics - individual as unit ##
glNA(x, FALSE)
glSum(x, FALSE)
glMean(x, FALSE)

## explanation: data are taken as relative frequencies
temp <- as.matrix(x)/ploidy(x)
apply(temp,2, function(e) sum(is.na(e))) # NAs
apply(temp,2,sum, na.rm=TRUE) # sum
apply(temp,2,mean, na.rm=TRUE) # mean

## End(Not run)

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.