dups | R Documentation |
Find indices of possible sample duplications between two aSnpStats objects
dups(x, y = NULL, tol = ncol(x)/50, type = c("hethom", "all"),
stopatone = TRUE)
x |
aSnpStats object |
y |
aSnpStats object |
tol |
maximum number of mismatched genotypes allowed for duplicate samples; the default, ncol(x)/50, is 2% of the SNPs in x |
type |
by default (type="hethom"), dups compares only homozygotes vs heterozygotes, so that differently labelled alleles do not produce mismatches. Set type="all" to make the two kinds of homozygous genotype count as a mismatch with each other |
stopatone |
if TRUE (the default), assume each sample in x can have at most one match in y, and vice versa. This is faster, and is safe provided x and y themselves contain no internal duplicates. Set it to FALSE if you want to catch multiple matches. |
Each pair of samples from x and y is compared in turn. If the number of mismatches among non-missing genotypes exceeds tol, the pair is assumed not to be duplicates, and counting proceeds to the next pair. If the total number of mismatches among non-missing genotypes is less than tol, the indices of the sample pair are stored and returned together with the number of mismatches and the number of non-missing genotypes compared.
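The comparison described above can be sketched in plain R. This is only an illustration, not the package's implementation: find_dups_sketch is a hypothetical helper, it works on ordinary integer matrices rather than aSnpStats objects, and it assumes genotypes are coded 0 = missing, 1 and 3 = the two homozygotes, 2 = heterozygote. For simplicity it counts all mismatches for a pair instead of stopping as soon as tol is exceeded.

```r
# Hypothetical helper: a sketch of the pairwise comparison, under the
# assumed coding 0 = missing, 1 / 3 = homozygotes, 2 = heterozygote.
# gx, gy: integer matrices with samples in rows and SNPs in columns.
find_dups_sketch <- function(gx, gy, tol = ncol(gx) / 50,
                             type = c("hethom", "all"), stopatone = TRUE) {
  type <- match.arg(type)
  stopifnot(ncol(gx) == ncol(gy))
  if (type == "hethom") {
    # Collapse both homozygotes to one code so allele labelling is irrelevant.
    gx[gx == 3L] <- 1L
    gy[gy == 3L] <- 1L
  }
  hits <- list()
  y_used <- logical(nrow(gy))
  for (i in seq_len(nrow(gx))) {
    for (j in seq_len(nrow(gy))) {
      if (stopatone && y_used[j]) next       # y sample already matched
      ok <- gx[i, ] != 0L & gy[j, ] != 0L    # non-missing in both samples
      mism <- sum(gx[i, ok] != gy[j, ok])    # mismatches among those
      if (mism < tol) {
        hits[[length(hits) + 1L]] <- c(i, j, mism, sum(ok))
        y_used[j] <- TRUE
        if (stopatone) break                 # at most one match per x sample
      }
    }
  }
  out <- do.call(rbind, hits)
  if (is.null(out)) out <- matrix(numeric(0), ncol = 4)
  colnames(out) <- c("index.x", "index.y", "mismatches", "comparisons")
  out
}

# Toy data (simulated, no missing values): rows 6:10 of gx equal rows 1:5 of gy.
set.seed(1)
g  <- matrix(sample(1:3, 15 * 500, replace = TRUE), nrow = 15)
gx <- g[1:10, ]
gy <- g[6:15, ]
res <- find_dups_sketch(gx, gy)
res
```

With 500 SNPs the default tol is 10, so only the five identical pairs (zero mismatches over 500 comparisons) are reported; unrelated random samples differ at far more than 10 sites.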
a matrix, with four columns: index of dup in x, index of dup in y, number of mismatches, number of comparisons
Chris Wallace
## example data where samples 6:10 in x are the same as 1:5 in y
x <- example.data(1:10,1:500)
y <- example.data(6:15,1:500)
dups(x,y)