find_close_matching_genotypes: Return every pair of individuals that mismatch at no more...

View source: R/find_close_matching_genotypes.R

find_close_matching_genotypes    R Documentation

Return every pair of individuals that mismatch at no more than max_mismatch loci

Description

This is used for identifying duplicate individuals/genotypes in large data sets. The threshold is specified as a maximum number of mismatching loci because I think everyone should already have tossed out individuals with a lot of missing data. A count-based threshold also makes it easy to toss out a pair as soon as it exceeds the limit, without looking at all the loci, so the full set of comparisons runs faster.

Usage

find_close_matching_genotypes(LG, CK, max_mismatch)

Arguments

LG

a long genotypes data frame.

CK

a ckmr object created from the allele frequencies computed from LG.

max_mismatch

maximum allowable number of mismatching genotypes between the two members of a pair.

Value

a data frame with columns:

indiv_1

the id (from the rownames in S) of the first member of the pair

indiv_2

the id (from the rownames in S) of the second member of the pair

num_mismatch

the number of loci at which the pair have mismatching genotypes

num_loc

the total number of loci at which neither individual has a missing genotype
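
The idea in the Description, and the columns listed above, can be illustrated with a small base R sketch. This is not the package's implementation and does not call CKMRsim: the helper names `pair_mismatch` and `close_pairs_sketch` are hypothetical, and the input is assumed to be a wide character matrix (one row per individual, one column per locus, NA for a missing genotype) rather than the long data frame and ckmr object the real function takes. It also assumes mismatches are counted only at loci where neither individual is missing.

```r
# Hypothetical helper, NOT part of CKMRsim.
# Count mismatches between two genotype vectors, giving up as soon as the
# count exceeds max_mismatch (the "toss out pairs early" idea).
pair_mismatch <- function(g1, g2, max_mismatch) {
  n_mis <- 0L
  for (l in seq_along(g1)) {
    if (is.na(g1[l]) || is.na(g2[l])) next   # skip loci missing in either
    if (g1[l] != g2[l]) {
      n_mis <- n_mis + 1L
      if (n_mis > max_mismatch) return(NULL) # early exit: pair is not close
    }
  }
  n_mis
}

# Hypothetical helper, NOT part of CKMRsim.
# Brute-force loop over all pairs; `geno` is an assumed wide matrix layout.
close_pairs_sketch <- function(geno, max_mismatch) {
  empty <- data.frame(indiv_1 = character(), indiv_2 = character(),
                      num_mismatch = integer(), num_loc = integer(),
                      stringsAsFactors = FALSE)
  n <- nrow(geno)
  if (n < 2) return(empty)
  ids <- rownames(geno)
  out <- list()
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      n_mis <- pair_mismatch(geno[i, ], geno[j, ], max_mismatch)
      if (is.null(n_mis)) next
      out[[length(out) + 1]] <- data.frame(
        indiv_1 = ids[i], indiv_2 = ids[j],
        num_mismatch = n_mis,
        # loci at which neither individual is missing
        num_loc = sum(!is.na(geno[i, ]) & !is.na(geno[j, ])),
        stringsAsFactors = FALSE
      )
    }
  }
  if (length(out) == 0) return(empty)
  do.call(rbind, out)
}

# Toy data: B is a copy of A with one missing locus; C differs from both.
geno <- rbind(
  A = c("1/1", "1/2", "2/2", "1/2"),
  B = c("1/1", "1/2", "2/2", NA),
  C = c("2/2", "1/1", "1/2", "1/2")
)
res <- close_pairs_sketch(geno, max_mismatch = 1)
print(res)
```

With this toy data only the A and B pair is kept, since every pair involving C mismatches at three loci. The early return in `pair_mismatch` is what makes a small count threshold cheap: most non-duplicate pairs are rejected after a few loci.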


eriqande/CKMRsim documentation built on Aug. 2, 2024, 7:23 a.m.