pairwise_geno_id: Return every pair of individuals that mismatch at no more...
In eriqande/CKMRsim: Inference of Pairwise Relationships Using Likelihood Ratios

pairwise_geno_id

R Documentation

Return every pair of individuals that mismatch at no more than max_miss loci

Description

This is used for identifying duplicate individuals/genotypes in large data sets. I've specified this in terms of the max number of mismatching loci because I think everyone should already have tossed out individuals with a lot of missing data. It also makes it easy to toss out a pair without looking at all of its loci: the comparison can stop as soon as the pair exceeds max_miss mismatches, which makes the full set of comparisons faster.
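The early-exit idea described above can be sketched in plain R. This is only an illustration: `count_mismatch_capped` is a hypothetical helper, not a CKMRsim function, and it assumes that a locus missing in either individual is skipped rather than counted as a mismatch.

```r
# Hypothetical helper (not part of CKMRsim): compare two genotype vectors
# and give up as soon as the mismatch count exceeds max_miss.
count_mismatch_capped <- function(g1, g2, max_miss) {
  num_mismatch <- 0L
  num_loc <- 0L
  for (l in seq_along(g1)) {
    # Assumption: loci missing (< 0) in either individual are skipped
    if (g1[l] < 0L || g2[l] < 0L) next
    num_loc <- num_loc + 1L
    if (g1[l] != g2[l]) {
      num_mismatch <- num_mismatch + 1L
      # Early exit: this pair can no longer qualify, so stop looking
      if (num_mismatch > max_miss) return(NULL)
    }
  }
  list(num_mismatch = num_mismatch, num_loc = num_loc)
}

a <- c(0L, 1L, 2L, -1L, 1L)
b <- c(0L, 1L, 1L,  2L, 1L)
res_ok <- count_mismatch_capped(a, b, max_miss = 1L)  # pair is kept
res_no <- count_mismatch_capped(a, b, max_miss = 0L)  # pair is rejected (NULL)
```

With a small max_miss, most non-duplicate pairs are rejected after only a few loci, which is where the speed-up comes from.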

Usage

pairwise_geno_id(S, max_miss)

Arguments

`S`	"source", an integer matrix with NumInd-source rows and NumLoci columns, in which the entry in row r and column c is a base-0 representation of the genotype of the r-th individual at the c-th locus. These are the individuals you can think of as parents if there is directionality to the comparisons. Missing data is denoted by -1 (or any integer < 0).
`max_miss`	maximum allowable number of mismatching genotypes between the two individuals of a pair.
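As a rough illustration of the layout `S` is expected to have, the sketch below builds a small integer matrix from character genotype calls. The calls, the individual and locus names, and the coding AA = 0, AB = 1, BB = 2 are all assumptions made for this example; the package may index genotypes differently.

```r
# Hypothetical genotype calls: rows are individuals, columns are loci.
# NA marks a missing call.
calls <- matrix(
  c("AA", "AB", "BB",
    "AA", NA,   "BB",
    "AB", "AB", "AA"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ind1", "ind2", "ind3"), c("loc1", "loc2", "loc3"))
)

# Assumed coding for this illustration only: AA = 0, AB = 1, BB = 2
geno_levels <- c("AA", "AB", "BB")

# match() gives 1-based positions; subtract 1 to get base-0 codes
S <- matrix(match(calls, geno_levels) - 1L,
            nrow = nrow(calls), dimnames = dimnames(calls))
S[is.na(S)] <- -1L  # missing data is denoted by a negative integer
```

The result is an integer matrix with one row per individual and one column per locus, with -1 wherever a call was missing.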

Value

a data frame with columns:

ind1: the base-1 index in S of the first individual of the pair
ind2: the base-1 index in S of the second individual of the pair
num_mismatch: the number of loci at which the pair have mismatching genotypes
num_loc: the total number of loci missing in neither individual
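
To make the returned columns concrete, here is a plain-R reference sketch that produces a data frame of the same shape on toy data. `pairwise_geno_id_sketch` is a hypothetical stand-in, not the package's implementation: it has no early exit, it assumes mismatches are counted only at loci missing in neither individual, and it assumes each pair is reported once with ind1 < ind2.

```r
# Hypothetical reference version (not CKMRsim's implementation).
# Requires at least two rows in S.
pairwise_geno_id_sketch <- function(S, max_miss) {
  pairs <- utils::combn(nrow(S), 2)  # 2 x n_pairs matrix of row indices
  out <- lapply(seq_len(ncol(pairs)), function(k) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    both <- S[i, ] >= 0L & S[j, ] >= 0L  # loci missing in neither individual
    data.frame(ind1 = i, ind2 = j,
               num_mismatch = sum(S[i, both] != S[j, both]),
               num_loc = sum(both))
  })
  out <- do.call(rbind, out)
  out[out$num_mismatch <= max_miss, , drop = FALSE]
}

# Toy data: individuals 1 and 2 agree wherever both are genotyped
S <- rbind(c(0L, 1L, 2L,  1L),
           c(0L, 1L, 2L, -1L),
           c(2L, 2L, 0L,  1L))

dups <- pairwise_geno_id_sketch(S, max_miss = 0)

# With the package installed, the documented call would be:
# CKMRsim::pairwise_geno_id(S, max_miss = 0)
```

In this toy case only the pair (1, 2) survives, with num_loc equal to 3 because one locus is missing in individual 2.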