readGenepop: A function to calculate allele frequencies from genepop...


Description

readGenepop calculates allele frequencies and related parameters from 3-digit and 2-digit genepop files. Its main purpose is data manipulation: it returns the data in forms that make downstream analysis easier.

Usage

readGenepop(infile = NULL, gp = 3, bootstrap = FALSE)

Arguments

infile

Specifies the name of the genepop (Rousset, 2008) file from which the statistics are to be calculated. The file can be in either the 3-digit or the 2-digit format and must contain exactly one whitespace separator (e.g. a space or a tab) between each pair of columns, including the individual names column. The number of columns must equal the number of loci + 1 (the individual names column). If the file is not in the working directory, the file path must be given. The name must be a character string (i.e. enclosed in "" or '').

gp

A numeric argument specifying the format of infile: 3 for the 3-digit format or 2 for the 2-digit format. Default is gp = 3.

bootstrap

A logical argument specifying whether the infile data should be bootstrapped. If bootstrap = TRUE, a genepop format object is returned. See bs_file in the Value section below.
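
To make the difference between the two gp formats concrete, the following is a minimal base R sketch of how a diploid genotype code can be split into its two alleles. The helper split_genotype is hypothetical and is not part of diveRsity; it only illustrates that each allele occupies gp digits, so a genotype code is 2 x gp digits long.

```r
# Hypothetical helper (not part of diveRsity): split diploid genotype codes
# into their two alleles, given the number of digits per allele (gp).
split_genotype <- function(genotype, gp = 3) {
  # Each genotype must be exactly two alleles of gp digits each.
  stopifnot(gp %in% c(2, 3), all(nchar(genotype) == 2 * gp))
  data.frame(
    allele1 = substr(genotype, 1, gp),
    allele2 = substr(genotype, gp + 1, 2 * gp),
    stringsAsFactors = FALSE
  )
}

# Made-up genotype codes in the 3-digit and 2-digit formats.
geno3 <- split_genotype(c("120124", "118118"), gp = 3)
geno2 <- split_genotype(c("0102", "0303"), gp = 2)
```

Passing the wrong gp for a given file would fail the length check here, which is the kind of mismatch the gp argument is meant to avoid.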

Details

Results from this function allow for the calculation of various population genetics statistics, such as those calculated by div.part and in.calc. Users may also find it useful for data exploration. For instance, using the plot {graphics} function, an ad hoc assessment of allele size distribution can be carried out with the code in the Examples section below. As that example suggests, the function is particularly useful for those wishing to develop their own analysis methods.

Value

npops

The number of population samples in infile.

nloci

The number of loci in infile.

pop_alleles

A list of matrices (n = 2 x npops) containing haploid allele designations. Every two matrices contain the two alleles per individual per population. For example pop_alleles[[1]][1,1] and pop_alleles[[2]][1,1] are the two alleles observed in individual ‘1’ in population ‘1’ at locus ‘1’, whereas pop_alleles[[3]][1,1] and pop_alleles[[4]][1,1] are the two alleles observed in individual ‘1’ in population ‘2’ at locus ‘1’.

pop_list

A list of matrices (n = npops) containing the diploid genotypes of individuals per locus.

loci_names

A character vector containing the names of loci from infile.

pop_pos

A numeric vector of the row index locations of the first individual per population in infile.

pop_sizes

A numeric vector of length npops containing the number of individuals per population sample in infile.

allele_names

A list of npops lists, each containing nloci character vectors of allele names per locus. Useful for identifying unique alleles.

all_alleles

A list of nloci character vectors of all alleles observed across all population samples in infile.

allele_freq

A list of nloci matrices containing allele frequencies per allele per population sample.

raw_data

An unaltered data frame of infile.

loci_harm_N

A numeric vector of length nloci, containing the harmonic mean number of individuals genotyped per locus.

n_harmonic

A numeric value representing the harmonic mean of the population sample sizes across the npops samples.

pop_names

A character vector containing a four-letter population sample name for each population in infile (the first four letters of the name of the first individual).

indtyp

A list of length nloci containing character vectors of length npops, indicating the number of individuals per population sample typed at each locus.

nalleles

A vector of the total number of alleles observed at each locus.

bs_file

A genepop format data frame of bootstrapped infile. This value is only returned if bootstrap = TRUE.
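
As an illustration of the kind of quantities listed above, the following base R sketch computes an allele frequency matrix and a harmonic mean from made-up data. It does not call diveRsity and is not the package's implementation; the object names mirror the returned elements only for readability, and the allele-by-population orientation of the matrix is an assumption.

```r
# Toy data (made up for illustration): alleles at one locus in two samples.
alleles <- list(
  pop1 = c("120", "124", "120", "120"),
  pop2 = c("124", "124", "128", "120")
)

# All alleles seen across samples, similar in spirit to all_alleles.
all_alleles <- sort(unique(unlist(alleles)))

# Allele-by-population frequency matrix, similar in spirit to allele_freq.
# Using a factor with fixed levels keeps alleles absent from a sample as 0.
allele_freq <- sapply(alleles, function(a) {
  as.vector(table(factor(a, levels = all_alleles))) / length(a)
})
rownames(allele_freq) <- all_alleles

# Harmonic mean of a set of sample sizes (the sizes here are made up).
harmonic_mean <- function(n) length(n) / sum(1 / n)
n_harm <- harmonic_mean(c(10, 20, 40))
```

Each column of the resulting matrix sums to 1, which is a quick sanity check worth applying to any frequency table built this way.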

Author(s)

Kevin Keenan <kkeenan02@qub.ac.uk>

References

Rousset, F. (2008). genepop'007: a complete re-implementation of the genepop software for Windows and Linux. Molecular Ecology Resources, 8(1), 103-106.

Examples

# Code to plot ordered allele fragment sizes to assess mutation model.
data(Test_data, package = "diveRsity") # load the example data set
x <- readGenepop(infile = Test_data, gp = 3, bootstrap = FALSE)
# combine both alleles per individual at locus 10 in population 1
locus10_pop1 <- c(x$pop_alleles[[1]][[1]][,10],
                  x$pop_alleles[[1]][[2]][,10])
sort_order <- order(locus10_pop1, decreasing = FALSE) # sort alleles by size
plot(locus10_pop1[sort_order], col="red", ylab = "Allele size")
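
As a numeric complement to the plot, the sketch below checks whether the distinct allele sizes at a locus are spaced in whole multiples of a repeat unit, which is one informal way to judge whether a stepwise pattern is plausible. The allele sizes and the repeat unit length are made up for illustration, and the check is an ad hoc heuristic rather than a method provided by diveRsity.

```r
# Made-up allele sizes (in base pairs) standing in for one locus in one sample.
allele_sizes <- c(120, 124, 124, 128, 120, 132, 124, 128)

# Distinct sizes in increasing order and the gaps between neighbouring sizes.
distinct_sizes <- sort(unique(allele_sizes))
size_steps <- diff(distinct_sizes)

# Assumed repeat unit length; replace with the value for the marker in question.
repeat_unit <- 4

# TRUE when every gap is a whole multiple of the assumed repeat unit.
regular_steps <- all(size_steps %% repeat_unit == 0)
```

With real data, the vector passed in would be the numeric allele sizes extracted as in the example above, converted with as.numeric if they are stored as character strings.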

diveRsity documentation built on May 1, 2019, 10:30 p.m.