alleles2genotypes: Convert wider format Alleles to wide format Genotypes...

View source: R/alleles2genotypes.R

alleles2genotypes R Documentation

Convert wider format Alleles to wide format Genotypes (hierfstat format)

Description

Returns a data frame in genotype-per-column format from a data frame in allele-per-column format.

Usage

alleles2genotypes(
  df,
  alleles,
  name_sep = NULL,
  name_pattern = NULL,
  suffix = T,
  Adelim = ""
)

Arguments

df

Data frame that contains genotypic data and metadata

alleles

Numeric column positions or names of loci

name_sep

The locus-allele separator in column names (e.g., "_" in Loc1_A1, Loc1_A2, ..., LocN_A1, LocN_A2).

name_pattern

The locus-allele identifiers in column names, specified by two regex capture groups (e.g., "(.+)_(A.$)" performs the same as name_sep = "_"). This is useful when the locus-allele separator string is repeated in column names (e.g., Loc_1_A1). This argument takes precedence over name_sep.

suffix

Logical indicating whether the allele identifier is a suffix (e.g., Loc1_A1 rather than A1_Loc1). Default = TRUE.

Adelim

Desired delimiter string between alleles in the output. The default, "", is not ideal when allele character lengths vary, because the boundary between the two alleles in a genotype becomes ambiguous.
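To illustrate why name_pattern helps when the separator is repeated, the following base R sketch splits column names with the pattern from the name_pattern description. The column names are hypothetical, and sub() with perl = TRUE is used only for illustration; the function's internal parsing may differ.

```r
# Hypothetical column names in which "_" appears both inside the locus
# name and between the locus and the allele identifier.
cols <- c("Loc_1_A1", "Loc_1_A2", "Loc_2_A1", "Loc_2_A2")

# Splitting on every "_" yields three pieces per name, so a plain
# separator cannot tell the locus from the allele identifier.
n_pieces <- lengths(strsplit(cols, "_", fixed = TRUE))

# Pattern from the name_pattern description:
# capture group 1 = locus, capture group 2 = allele identifier.
pattern <- "(.+)_(A.$)"

locus  <- sub(pattern, "\\1", cols, perl = TRUE)
allele <- sub(pattern, "\\2", cols, perl = TRUE)

parsed <- data.frame(column = cols, locus = locus, allele = allele)
print(parsed)
```

Because the first group is greedy, it absorbs every "_" except the last one, which is the one that separates the locus from the allele identifier.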

Details

This function conveniently wraps tidyr and dplyr operations to convert data between two common wide formats. At the moment, it works for diploid data when all allele columns share a common locus-allele identifier. If the input varies (e.g., Loc1.a1, Loc2_A1), you will want to gsub your way to uniformity or write your own solution.
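As a minimal sketch of the gsub approach mentioned above, the snippet below brings mixed column names to one convention. The column names and the target convention (Loc_A1 style) are assumptions made for illustration; adjust the pattern to your own data.

```r
# Hypothetical, inconsistently named allele columns.
cols <- c("Loc1.a1", "Loc1.a2", "Loc2_A1", "Loc2_A2")

# Rewrite a trailing ".a<digits>" or "_A<digits>" (any case of "a")
# to the single form "_A<digits>".
uniform <- gsub("[._][Aa]([0-9]+)$", "_A\\1", cols)

print(uniform)
# The cleaned names could then be assigned back with
# colnames(df)[allele_positions] <- uniform
# where df and allele_positions are placeholders for your own objects.
```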

Value

Data frame in genotype per column format
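The shape of the output can be pictured with a small base R sketch. This is not the package's implementation (which, per Details, wraps tidyr and dplyr); the data frame, the to_genotypes() helper, and its arguments are hypothetical and only show what a genotype-per-column result looks like and how the delimiter affects it.

```r
# Hypothetical allele-per-column input: two diploid loci, two individuals.
alleles_df <- data.frame(
  id      = c("ind1", "ind2"),
  Loc1_A1 = c("1", "11"),
  Loc1_A2 = c("12", "2"),
  Loc2_A1 = c("3", "3"),
  Loc2_A2 = c("4", "3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: paste the two allele columns of each locus
# into a single genotype column, separated by delim.
to_genotypes <- function(df, loci, delim) {
  out <- df["id"]
  for (locus in loci) {
    out[[locus]] <- paste(df[[paste0(locus, "_A1")]],
                          df[[paste0(locus, "_A2")]],
                          sep = delim)
  }
  out
}

loci <- c("Loc1", "Loc2")
geno_plain <- to_genotypes(alleles_df, loci, delim = "")
geno_colon <- to_genotypes(alleles_df, loci, delim = ":")

print(geno_plain)  # Loc1 is "112" for both individuals
print(geno_colon)  # Loc1 is "1:12" and "11:2"
```

With an empty delimiter, the two different Loc1 genotypes collapse to the same string, which is the situation the Adelim description warns about.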

Author(s)

Zak Robinson, Contact: <zrobinson@critfc.org>

See Also

genotypes2alleles

Examples

data("wgp_example_2col")
genotypes <- alleles2genotypes(
  wgp_example_2col,
  alleles = grep("_\\d$", colnames(wgp_example_2col)),
  name_pattern = "(.+)_(.$)",
  suffix = TRUE,
  Adelim = ":"
)

zakrobinson/RLDNe documentation built on Oct. 24, 2024, 5:37 p.m.