find_matching_samples: Find matching samples to a single representative

find_matching_samples    R Documentation

Find matching samples to a single representative

Description

Given a specification of pairs of genotypes that must be from the same individual, this function identifies connected components and returns a tibble with two columns: indiv, and aliases, a list column that holds, for each indiv, all the other names it is known by.

Usage

find_matching_samples(genotypes, return_clusters = TRUE, ...)

Arguments

genotypes

A tibble like coho_genotypes that has

...

Parameters passed on to rubias::close_matching_samples. Intended for min_frac_non_miss and min_frac_matching.

return_clusters

Defaults to TRUE. When TRUE, the function makes a graph of the matching pairs and finds its connected components. You might want to set it to FALSE if you have a very permissive cutoff.
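As a rough illustration of the clustering step described above, the following base-R sketch groups samples into connected components with a small union-find. It is not the package's implementation, which may rely on a graph library; the sample IDs and the helper find_root are invented for this example.

```r
# Hypothetical matching pairs; the IDs are invented for illustration.
pairs <- data.frame(
  indiv_1 = c("fish_A", "fish_B", "fish_D"),
  indiv_2 = c("fish_B", "fish_C", "fish_E"),
  stringsAsFactors = FALSE
)

# Each sample starts out as its own parent (its own component).
ids <- sort(unique(c(pairs$indiv_1, pairs$indiv_2)))
parent <- setNames(ids, ids)

# Hypothetical helper: follow parent links until reaching a root.
find_root <- function(x) {
  while (parent[[x]] != x) x <- parent[[x]]
  x
}

# Merge the components of the two members of every pair.
for (i in seq_len(nrow(pairs))) {
  r1 <- find_root(pairs$indiv_1[i])
  r2 <- find_root(pairs$indiv_2[i])
  if (r1 != r2) parent[[r2]] <- r1
}

# Samples that share a root belong to the same cluster.
roots <- vapply(ids, find_root, character(1))
clusters <- unname(split(ids, roots))
print(clusters)
```

Here fish_A, fish_B and fish_C end up in one cluster even though fish_A and fish_C were never directly paired, which is why a permissive cutoff can chain many samples into a single large component.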

Value

Returns a list with three components as follows:

  • pairs: A tibble holding the matching pairs that were found. It has the following columns:

    • num_non_miss: number of loci missing in neither member of the pair

    • num_match: number of non-missing loci having the same genotype in each member of the pair.

    • indiv_1: the ID of the first member of the pair.

    • indiv_2:

  • clusters:

  • aliases:
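To make the num_non_miss and num_match columns concrete, here is a minimal base-R sketch that computes both counts for a single pair. The genotype strings are invented, and the one-string-per-locus encoding is an assumption made for illustration; it is not the layout of coho_genotypes or the code that rubias uses.

```r
# Hypothetical genotypes at five loci for two samples; NA marks a missing call.
geno_1 <- c("A/A", "A/G", NA,    "C/T", "G/G")
geno_2 <- c("A/A", "A/G", "T/T", NA,    "G/C")

# Loci that are missing in neither member of the pair.
both_called <- !is.na(geno_1) & !is.na(geno_2)
num_non_miss <- sum(both_called)

# Among those loci, the ones with identical genotypes.
num_match <- sum(geno_1[both_called] == geno_2[both_called])

# Fraction of jointly non-missing loci that match (assumed to be the
# quantity that min_frac_matching is compared against).
frac_matching <- num_match / num_non_miss
print(c(num_non_miss = num_non_miss, num_match = num_match))
```

In this toy pair, three loci are called in both samples and two of them match, so the pair would pass a min_frac_matching cutoff of 0.6 but not one of 0.8, under the assumption stated in the comment.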

Examples

# There are not actually any matching samples in coho_genotypes
# but we will just create some pairs that match by cranking
# the min_frac_matching down to 80%
find_matching_samples(coho_genotypes, min_frac_matching = 0.80)

eriqande/HatcheryPedAgree documentation built on Sept. 21, 2023, 7:24 p.m.