pairwise_geno_id | R Documentation |
This is used for identifying duplicate individuals/genotypes in large data sets. I've specified this in terms of the max number of missing loci because I think everyone should already have tossed out individuals with a lot of missing data. It also makes it easy to toss out pairs without even looking at all the loci, so all the comparisons run faster.
pairwise_geno_id(S, max_miss)
S |
"source", an integer matrix with NumInd-source rows and NumLoci columns, in which the entry in row r and column c is a base-0 representation of the genotype of the r-th individual at the c-th locus. These are the individuals you can think of as parents if there is directionality to the comparisons. Missing data is denoted by -1 (or any integer < 0). |
max_miss |
maximum allowable number of mismatching genotypes between the two individuals of a pair. |
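As an illustration of the input format described for S, here is a minimal sketch that builds a toy genotype matrix. The data values are made up for the example; only the layout (individuals in rows, loci in columns, base-0 integer codes, -1 for missing) comes from the argument description above.

```r
# Hypothetical toy data: 4 individuals (rows) by 5 loci (columns).
# Genotypes are coded as base-0 integers; -1 marks a missing genotype.
S <- matrix(
  as.integer(c(
    0, 1,  2, 0, 1,
    0, 1,  2, 0, 1,
    0, 1, -1, 0, 2,
    2, 2,  0, 1, 0
  )),
  nrow = 4, byrow = TRUE
)

# Quick sanity checks before passing S on.
stopifnot(is.integer(S))   # entries must be integers
n_missing <- sum(S < 0)    # any negative integer counts as missing
print(dim(S))
print(n_missing)

# With the package that provides pairwise_geno_id loaded, the call
# would then look like this (left commented out so the sketch runs alone):
# dups <- pairwise_geno_id(S, max_miss = 1)
```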
a data frame with columns:
the base-1 index in S of the first individual of the pair
the base-1 index in S of the second individual of the pair
the number of loci at which the pair have mismatching genotypes
the total number of loci at which neither individual is missing a genotype
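To make the returned columns concrete, the following is a slow, pure-R reference sketch of the comparison described on this page. It is not the package's implementation. The function name `pairwise_geno_id_sketch` and the column names `ind1`, `ind2`, `num_mismatch` and `num_loc` are hypothetical placeholders chosen for this example. It also assumes, following the argument description, that max_miss caps the number of mismatching genotypes and that mismatches are counted only at loci where both individuals have data.

```r
# Hypothetical reference sketch, not the package's own code.
# Column names below are placeholders chosen for this example.
pairwise_geno_id_sketch <- function(S, max_miss) {
  empty <- data.frame(
    ind1 = integer(0), ind2 = integer(0),
    num_mismatch = integer(0), num_loc = integer(0)
  )
  n <- nrow(S)
  if (n < 2) return(empty)

  out <- list()
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # Loci at which neither individual is missing (negative = missing).
      both <- S[i, ] >= 0 & S[j, ] >= 0
      num_loc <- sum(both)
      # Mismatches are counted only over the jointly observed loci.
      num_mismatch <- sum(S[i, both] != S[j, both])
      if (num_mismatch <= max_miss) {
        out[[length(out) + 1]] <- data.frame(
          ind1 = i, ind2 = j,                  # base-1 row indices in S
          num_mismatch = num_mismatch, num_loc = num_loc
        )
      }
    }
  }
  if (length(out) == 0) return(empty)
  do.call(rbind, out)
}

# Toy data (made up): rows 1 and 2 are identical; row 3 differs from
# them at one locus and is missing another; row 4 is unrelated.
S <- matrix(
  as.integer(c(
    0, 1,  2, 0, 1,
    0, 1,  2, 0, 1,
    0, 1, -1, 0, 2,
    2, 2,  0, 1, 0
  )),
  nrow = 4, byrow = TRUE
)
res <- pairwise_geno_id_sketch(S, max_miss = 1)
print(res)
```

On this toy matrix with max_miss = 1, the sketch keeps the pairs (1, 2), (1, 3) and (2, 3) and drops every pair involving individual 4. The double loop is quadratic in the number of individuals, so it is only meant for checking small cases by hand.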