freqs2pairwisePi: Calculate pairwise pi from allele frequency data

View source: R/format.data.R

freqs2pairwisePi    R Documentation

Calculate pairwise pi from allele frequency data

Description

freqs2pairwisePi calculates pairwise pi between all pairs of samples in an allele frequency data matrix.

Usage

freqs2pairwisePi(freqs, nLoci = NULL, quiet = FALSE)

Arguments

freqs

A matrix of allele frequencies with one column per locus and one row per sample. Missing data should be indicated with NA.
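As an illustration of the expected layout, the following sketch builds a small freqs matrix by hand. The data, sample names, and locus names are hypothetical; the only things taken from the description above are the orientation (samples in rows, loci in columns) and the use of NA for missing genotypes.

```r
# Hypothetical toy data: 3 diploid individuals (rows) scored at 4 loci (columns).
# For a diploid individual, the frequency of the counted allele is 0, 0.5, or 1.
freqs <- matrix(
  c(0.0, 0.5, 1.0, NA,
    1.0, 0.5, 0.0, 0.5,
    0.0, NA,  1.0, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("ind", 1:3), paste0("locus", 1:4))
)

n_samples <- nrow(freqs)          # one row per sample
n_loci    <- ncol(freqs)          # one column per locus
n_missing <- sum(is.na(freqs))    # missing genotypes are coded as NA
```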

nLoci

An integer giving the total number of loci in the dataset. If left as NULL, it defaults to the number of columns in the freqs argument.

quiet

A Boolean (TRUE/FALSE) indicating whether to suppress printing of the progress bar. Default is FALSE.

Details

This function takes a matrix of allele frequency data for a set of diploid individuals and calculates pairwise pi (the proportion of sites at which each pair of samples differs, out of the total number of loci in the dataset) between all samples. The matrix of pairwise pi that is returned can then be used to run a BEDASSLE analysis with run.bedassle.

Pairwise pi is calculated as the proportion of sites at which a pair of individuals differs out of the total number of loci in the dataset. If it is calculated using only loci that are polymorphic in the global dataset, it is called pi at polymorphic sites. If it is calculated using all genotyped base pairs, it is simply pi. Either statistic can be converted to allelic covariance and used for running BEDASSLE. If the freqs matrix consists only of polymorphic loci, but the user wishes to calculate pi (rather than pi at polymorphic sites), the user must specify the total number of loci in the dataset (polymorphic and invariant) using the nLoci argument.
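The distinction between the two statistics comes down to the denominator, because invariant loci add no differences to the numerator. The arithmetic can be sketched as follows; all counts are made-up numbers chosen for illustration, and the variable names are not part of the package.

```r
# Hypothetical counts for a single pair of individuals.
n_diff        <- 3      # sites at which the pair differs
n_polymorphic <- 10     # columns of a freqs matrix holding polymorphic loci only
n_total       <- 1000   # polymorphic plus invariant loci; the value to pass as nLoci

# Same numerator, different denominators.
pi_polymorphic <- n_diff / n_polymorphic   # pi at polymorphic sites
pi_all         <- n_diff / n_total         # pi over all genotyped sites
```

Leaving nLoci as NULL corresponds to the first denominator; supplying the full locus count corresponds to the second.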

Missing data is handled in a pairwise fashion in the calculation of pi for each pair of individuals. That is, for each pair of individuals, the function counts the sites at which they differ among the loci at which both were genotyped, then divides that count by (nLoci - Mij), where Mij is the number of loci at which either individual in the comparison is missing data.
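The pairwise treatment of missing data described above can be sketched in a few lines of R. This is an illustration, not the package's source: pairwise_pi_sketch is a hypothetical helper, the toy data are made up, and the per-locus difference measure (the chance that one allele drawn from each sample differs) is an assumption that may not match how freqs2pairwisePi counts differences. Only the masking of loci missing in either sample and the (nLoci - Mij) denominator follow the text.

```r
# Sketch only: a hypothetical helper, not the implementation in the bedassle package.
pairwise_pi_sketch <- function(freqs, nLoci = ncol(freqs)) {
  n <- nrow(freqs)
  out <- matrix(NA_real_, n, n,
                dimnames = list(rownames(freqs), rownames(freqs)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      p <- freqs[i, ]
      q <- freqs[j, ]
      missing <- is.na(p) | is.na(q)   # loci missing in either sample
      Mij <- sum(missing)              # pair-specific count of missing loci
      # Assumed difference measure: probability that one allele drawn
      # from each sample differs at a locus.
      d <- p * (1 - q) + q * (1 - p)
      out[i, j] <- sum(d[!missing]) / (nLoci - Mij)
    }
  }
  out
}

# Hypothetical toy data: 3 diploid individuals, 4 loci, NA for missing genotypes.
freqs <- matrix(
  c(0.0, 0.5, 1.0, NA,
    1.0, 0.5, 0.0, 0.5,
    0.0, NA,  1.0, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("ind", 1:3), paste0("locus", 1:4))
)

pi_poly <- pairwise_pi_sketch(freqs)                 # denominator based on ncol(freqs)
pi_all  <- pairwise_pi_sketch(freqs, nLoci = 1000)   # denominator based on all loci
```

Because Mij is computed separately for every pair, each entry of the matrix has its own denominator. With the package installed, the documented call is freqs2pairwisePi(freqs, nLoci = NULL, quiet = FALSE); its output need not agree numerically with this sketch if it counts differences in another way.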

Value

This function returns a matrix of pairwise pi that can then be used to run a BEDASSLE analysis with run.bedassle.


gbradburd/bedassle documentation built on May 20, 2022, 1 p.m.