View source: R/alleles2genotypes.R
alleles2genotypes | R Documentation |
Converts a data frame from allele per column format to genotype per column format.
alleles2genotypes(
df,
alleles,
name_sep = NULL,
name_pattern = NULL,
suffix = T,
Adelim = ""
)
df |
Data frame that contains genotypic data and metadata |
alleles |
Numeric column positions or names of loci |
name_sep |
The locus-allele separator in column names (e.g., "_" in Loc1_A1, Loc1_A2 ... LocN_A1, LocN_A2). |
name_pattern |
The locus-allele identifiers in column names, specified by 2 regex capture groups (e.g., "(.+)_(A.$)" performs the same as name_sep="_"). This is useful when the locus-allele separator string is repeated in column names (e.g., Loc_1_A1). This argument takes precedence over name_sep. |
suffix |
Logical indicating whether the allele identifier is a suffix (e.g., Loc1_A1 instead of A1_Loc1). Default=T. |
Adelim |
Desired delimiter string between alleles in output. The default, "", is not ideal when allele character lengths vary, because the boundary between the two alleles becomes ambiguous. |
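To illustrate how a two-group pattern such as the one described for name_pattern separates locus from allele identifier, here is a minimal base R sketch. The column names are hypothetical, and base `sub()` with `perl = TRUE` is used only for demonstration; the regex engine the function uses internally is not stated in this documentation.

```r
# Hypothetical column names in which "_" also appears inside the locus name
cols <- c("Loc_1_A1", "Loc_1_A2", "Loc_2_A1", "Loc_2_A2")

# Splitting on "_" alone yields three pieces per name, not two
n_pieces <- lengths(strsplit(cols, "_", fixed = TRUE))

# Two capture groups: (.+) is greedy, so it takes everything up to the last "_"
pattern <- "(.+)_(A.$)"

locus  <- sub(pattern, "\\1", cols, perl = TRUE)  # first capture group
allele <- sub(pattern, "\\2", cols, perl = TRUE)  # second capture group

print(data.frame(column = cols, locus = locus, allele = allele))
```

Because the first group is greedy, the locus keeps its internal underscore ("Loc_1") and only the final "_" acts as the locus-allele boundary.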
This function is meant to conveniently wrap tidyr and dplyr operations to convert data between two common wide formats. At the moment it works only for diploid data whose column names share a common locus-allele identifier. If the naming varies across columns (e.g., Loc1.a1 and Loc2_A1) you will want to gsub your way to uniformity or write your own solution.
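One way to "gsub your way to uniformity" is sketched below in base R. The column names and the target scheme (`<locus>_A<digit>`) are assumptions chosen for illustration; adapt the pattern to whatever variation your data actually contains.

```r
# Hypothetical mixed naming: "." vs "_" separators and "a" vs "A" allele tags
cols <- c("ID", "Loc1.a1", "Loc1.a2", "Loc2_A1", "Loc2_A2")

# Rewrite any "<locus>[._][aA]<digit>" name to "<locus>_A<digit>";
# names that do not match (such as "ID") are returned unchanged
fixed <- gsub("^(.+)[._][aA](\\d)$", "\\1_A\\2", cols)

print(fixed)
```

The cleaned names can then be assigned back with `colnames(df) <- fixed` before calling the function with name_sep = "_".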
Data frame in genotype per column format
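The toy example below is a base R illustration of what a genotype per column value looks like and why the Adelim default of "" can be ambiguous. It is not the function's implementation (which wraps tidyr and dplyr); the data values are hypothetical.

```r
# Toy allele per column data (hypothetical values)
allele_df <- data.frame(
  ID = c("fish1", "fish2"),
  Loc1_A1 = c("1", "11"),
  Loc1_A2 = c("11", "1"),
  stringsAsFactors = FALSE
)

# One genotype column per locus: the two alleles pasted together
geno_empty <- paste(allele_df$Loc1_A1, allele_df$Loc1_A2, sep = "")
geno_colon <- paste(allele_df$Loc1_A1, allele_df$Loc1_A2, sep = ":")

print(geno_empty)  # both rows collapse to the same string
print(geno_colon)  # the delimiter keeps the allele boundary visible
```

With an empty delimiter the two individuals produce identical strings even though their allele columns differ, whereas an explicit delimiter such as ":" keeps the boundary recoverable.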
Zak Robinson, Contact: <zrobinson@critfc.org>
genotypes2alleles
data("wgp_example_2col")
genotypes <- alleles2genotypes(
wgp_example_2col,
alleles = grep("_\\d$", colnames(wgp_example_2col)),
name_pattern = "(.+)_(.$)",
suffix = T,
Adelim = ":"
)