get.binary.snps: Reduce a genetic data matrix to only necessary columns.

View source: R/utils.R

get.binary.snps R Documentation

Reduce a genetic data matrix to only necessary columns.

Description

Reduces a genetic data matrix containing multiple columns per locus to one column for each binary locus and N columns for each N-allelic non-binary locus.

Usage

get.binary.snps(snps, force = FALSE)

Arguments

snps

A genetic data matrix.

force

A logical value (default FALSE). Set to TRUE to run the function on column names that follow an alternative naming convention, provided the allele is still denoted by the last two characters of each name (see Details).

Details

This function identifies the number of alleles at each locus by assuming that the allele in each column is denoted by the last two characters of that column's name. We recommend labelling the columns of snps with the following four suffixes: ".a", ".c", ".g", ".t" (e.g., "Locus_123243.a", "Locus_123243.g"). If you use an alternative naming convention in which the allele is still always denoted by the last two characters (e.g., "Locus_123243_1", "Locus_123243_2"), the function will still work if you set the argument force = TRUE.

Please take care not to accidentally remove any purposeful duplications that share repeated names, for example if you have deliberately duplicated unique columns (e.g., by expanding according to an index returned by ClonalFrameML).
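
The naming convention described above can be illustrated with a minimal base R sketch. This is not the treeWAS implementation: the toy matrix, the variable names, and the choice to keep the first column of each binary locus are all assumptions made for illustration. It only shows how the last two characters of each column name can be used to group columns by locus and count alleles.

```r
# Toy presence/absence matrix (hypothetical data): 3 individuals, 3 loci.
# Locus_1 and Locus_3 are binary; Locus_2 has three alleles.
snps <- matrix(
  c(1, 0,   1, 0, 0,   0, 1,
    0, 1,   0, 1, 0,   0, 1,
    1, 0,   0, 0, 1,   1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("ind_1", "ind_2", "ind_3"),
    c("Locus_1.a", "Locus_1.g",
      "Locus_2.a", "Locus_2.c", "Locus_2.t",
      "Locus_3.c", "Locus_3.t")
  )
)

nms <- colnames(snps)

# Flag repeated column names first, since deliberately duplicated
# columns could otherwise be mistaken for redundant ones.
if (anyDuplicated(nms) > 0) {
  warning("Some column names are repeated; check for purposeful duplications.")
}

# Split each column name into a locus name and a two-character allele suffix.
n_char <- nchar(nms)
suffix <- substr(nms, n_char - 1, n_char)
locus  <- substr(nms, 1, n_char - 2)

# Count the allele columns belonging to each column's locus.
n_alleles <- as.vector(table(locus)[locus])

# Assumption: for binary loci keep only the first column;
# for loci with more than two alleles keep every column.
keep <- n_alleles > 2 | !duplicated(locus)
reduced <- snps[, keep, drop = FALSE]

print(colnames(reduced))
```

In this sketch a binary locus contributes one column and the three-allele locus contributes three, matching the reduction described in the Description. Which of the two columns a binary locus retains in the real function is not stated here, so treat that detail as illustrative only.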

Author(s)

Caitlin Collins caitiecollins@gmail.com


caitiecollins/treeWAS documentation built on March 9, 2024, 3:15 p.m.