read_genepop_format: Reading data from Genepop.
In lfmcmillan/geneplot: Genetic Assignment and Plotting

View source: R/format_conversion.R

Description

Import Genepop format file and convert to format compatible with GenePlot.

Usage

read_genepop_format(
  file_path,
  header = TRUE,
  diploid = TRUE,
  digits_per_allele = 2,
  pop_names = NULL
)
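
As a minimal sketch of a call, the example below writes a small Genepop-format file to a temporary path and reads it back if the package is available. The file contents, individual IDs, and the population names "North" and "South" are made up for illustration. It also assumes that the function is exported from an installed geneplot package, so the call is guarded.

```r
# Write a tiny Genepop-format file: a descriptive title line, one locus
# name per line, then a POP line before each population's individuals.
# All values here are made up for illustration.
genepop_lines <- c(
  "Example data set (made up for illustration)",
  "Loc1",
  "Loc2",
  "POP",
  "ind1 , 0101 0203",
  "ind2 , 0102 0303",
  "POP",
  "ind3 , 0202 0103",
  "ind4 , 0101 0303"
)
path <- tempfile(fileext = ".txt")
writeLines(genepop_lines, path)

# Only attempt the call if geneplot is installed (assumption: the
# function is exported from the package namespace).
if (requireNamespace("geneplot", quietly = TRUE)) {
  dat <- geneplot::read_genepop_format(
    path,
    header = TRUE,             # the file starts with a title line
    diploid = TRUE,            # two alleles per locus
    digits_per_allele = 2,     # "0101" is allele 01 and allele 01
    pop_names = c("North", "South")  # made-up names, in file order
  )
  str(dat)
}
```

Supplying pop_names here sidesteps the automatic naming rules described under Details.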

Arguments

`file_path`	String defining the path of the file to read, which can have any extension.
`header`	(default TRUE) Boolean, indicates whether there is an additional descriptive line at the top of the Genepop-format file (TRUE) or whether the first line is the start of the locus names (FALSE).
`diploid`	(default TRUE) Boolean, indicates whether data is diploid (TRUE) or haploid (FALSE).
`digits_per_allele`	(default 2) Indicates whether data uses 2 or 3 digits per allele.
`pop_names`	(default NULL) Character vector (optional). Define the names of the populations, in the order that they appear in the file.
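
To illustrate how diploid and digits_per_allele relate to the genotype codes in the file, here is a rough sketch of splitting one code into its allele codes. The helper split_genotype is hypothetical and is not part of geneplot; the package's own parsing may differ.

```r
# Hypothetical helper, not part of geneplot: split one Genepop genotype
# code into its allele codes, given the number of digits per allele.
split_genotype <- function(code, digits_per_allele = 2, diploid = TRUE) {
  n_alleles <- if (diploid) 2 else 1
  # The code must contain exactly the expected number of digits.
  stopifnot(nchar(code) == n_alleles * digits_per_allele)
  starts <- seq(1, by = digits_per_allele, length.out = n_alleles)
  substring(code, starts, starts + digits_per_allele - 1)
}

split_genotype("0102")                           # "01" "02"
split_genotype("001002", digits_per_allele = 3)  # "001" "002"
split_genotype("01", diploid = FALSE)            # "01"
```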

Details

The file should have the locus names at the top, either as a single-line comma-separated list or as one name per line. Locus names are assumed to stop at the line before the first instance of POP.
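
The two accepted locus-name layouts can be sketched as follows. The helper extract_locus_names is hypothetical and only mirrors the rule stated above; it is not the geneplot implementation. Matching POP case-insensitively is an assumption.

```r
# Two equivalent file tops, each ending at the first POP line.
one_per_line <- c("Example title line", "Loc1", "Loc2", "Loc3", "POP")
single_line  <- c("Example title line", "Loc1, Loc2, Loc3", "POP")

# Hypothetical helper: collect locus names from the lines above the
# first POP line, skipping the title line when header = TRUE.
extract_locus_names <- function(lines, header = TRUE) {
  # Assumption: POP is matched regardless of case.
  first_pop <- which(toupper(trimws(lines)) == "POP")[1]
  stopifnot(!is.na(first_pop))
  start <- if (header) 2 else 1
  locus_lines <- lines[start:(first_pop - 1)]
  # Splitting on commas handles both layouts: a line holding a single
  # name simply yields one element.
  trimws(unlist(strsplit(locus_lines, ",")))
}

extract_locus_names(one_per_line)  # "Loc1" "Loc2" "Loc3"
extract_locus_names(single_line)   # "Loc1" "Loc2" "Loc3"
```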

Population names: Genepop format does not include population names. If pop_names is not supplied, names are generated as follows. If the individuals in a population do not have unique IDs, the population is named after the ID of its first individual, and its individuals are given auto-generated unique IDs. If the individuals in a population do have unique IDs, the population is named Pop1, Pop2, etc., according to its position in the file. In a file containing both kinds of population, those with unique individual IDs are named Popx, where x is their position in the file, and those with non-unique individual IDs are named after the ID of their first individual. Use pop_names to define the population names when reading the file.
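
The naming rules above can be sketched in a few lines. The helper name_pops and the IDs used are hypothetical, chosen only to illustrate the described behaviour; the actual implementation in geneplot may be organised differently.

```r
# Hypothetical helper: ids_by_pop is a list with one character vector
# of individual IDs per population, in file order.
name_pops <- function(ids_by_pop, pop_names = NULL) {
  # User-supplied names take precedence.
  if (!is.null(pop_names)) {
    stopifnot(length(pop_names) == length(ids_by_pop))
    return(pop_names)
  }
  vapply(seq_along(ids_by_pop), function(i) {
    ids <- ids_by_pop[[i]]
    # Non-unique IDs: use the first individual's ID as the pop name.
    # Unique IDs: use "Pop" plus the position in the file.
    if (anyDuplicated(ids) > 0) ids[1] else paste0("Pop", i)
  }, character(1))
}

# A mixed file: pops 1 and 3 reuse one ID, pop 2 has unique IDs.
ids_by_pop <- list(
  c("siteA", "siteA", "siteA"),
  c("ind1", "ind2"),
  c("siteB", "siteB")
)
name_pops(ids_by_pop)                                  # "siteA" "Pop2" "siteB"
name_pops(ids_by_pop, pop_names = c("A", "B", "C"))    # "A" "B" "C"
```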

Value

A list containing the following components:

locnames: Character vector of the locus names.
pop_data: The data, in a data frame, with two columns labelled as 'id' and 'pop', and with two additional columns per locus, labelled in the format Loc1.a1, Loc1.a2, Loc2.a1, Loc2.a2, etc.
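
As a rough illustration of working with the returned list, the sketch below builds a mock object with the documented structure and pulls out the allele columns for one locus. The values are made up, and storing the alleles as character strings is an assumption; the real return value may use a different type.

```r
# Mock of the documented return structure (values are made up).
dat <- list(
  locnames = c("Loc1", "Loc2"),
  pop_data = data.frame(
    id  = c("ind1", "ind2", "ind3"),
    pop = c("Pop1", "Pop1", "Pop2"),
    Loc1.a1 = c("01", "01", "02"),
    Loc1.a2 = c("01", "02", "02"),
    Loc2.a1 = c("02", "03", "01"),
    Loc2.a2 = c("03", "03", "03"),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: the two allele column names for one locus.
allele_cols <- function(locus) paste0(locus, c(".a1", ".a2"))

# Genotypes at Loc2 for every individual, alongside id and pop.
dat$pop_data[, c("id", "pop", allele_cols("Loc2"))]

# Number of individuals per population.
table(dat$pop_data$pop)
```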