Given individual genotypes of two SNPs, the function checks whether the two sets of genotypes are identical or completely opposite. A margin of error of 0.5% is allowed by default.

```
is.identical(x, y, allow = .005)
```

| Argument | Description |
|---|---|
| `x, y` | two column vectors in the genotypes array, created by |
| `allow` | allowed margin of error |

Test whether two SNPs have identical genotypes across a set of individuals. SNPs are considered identical if the proportion of individuals whose genotypes differ stays within the allowed error margin (0.5% by default).
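To make the margin concrete, the sketch below computes the mismatch proportion for two short genotype vectors and compares it with two candidate margins. It assumes the comparison is a plain proportion of differing positions and that a proportion exactly equal to the margin still passes; this is consistent with the example outputs on this page but is not confirmed by the package source. `mismatch_rate` is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (not a package function):
# proportion of individuals whose genotypes differ between two SNPs
mismatch_rate <- function(x, y) mean(x != y)

# Recoded genotypes (0, 1, 2) for 20 individuals
x <- c(0, 1, 2, 1, 0, 2, 1, 1, 0, 2, 0, 1, 2, 1, 0, 2, 1, 1, 0, 2)
y <- x
y[20] <- 0   # x[20] is 2, so exactly one of 20 genotypes now differs

rate <- mismatch_rate(x, y)   # 1/20 = 0.05
rate <= 0.005                 # FALSE: exceeds a 0.5% margin
rate <= 0.05                  # TRUE: within a 5% margin
```

With only 20 individuals, a 0.5% margin tolerates no mismatches at all, so a single differing genotype is enough to fail the check under this assumption.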

In addition to identical SNPs, the function treats SNPs whose genotypes are entirely opposite, within the error margin, as redundant. Thus, two SNPs are declared highly correlated if their genotypes are all the same (0-0, 1-1, and 2-2) or all opposite (0-2, 1-1, and 2-0) within the specified error margin.
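The two conditions can be sketched as a pair of mismatch checks, one against `y` and one against its flipped coding `2 - y`. This is an illustrative re-implementation under stated assumptions, not the package's code: `is_identical_sketch` is a hypothetical name, genotypes are assumed to be coded 0, 1, 2, missing values are assumed to be ignored, and a mismatch proportion equal to `allow` is assumed to pass.

```r
# Hypothetical re-implementation for illustration; not the package's code.
# Assumes 0/1/2 genotype coding and that missing values are ignored.
is_identical_sketch <- function(x, y, allow = 0.005) {
  stopifnot(length(x) == length(y))
  # Identical pattern: 0-0, 1-1, 2-2
  same <- mean(x != y, na.rm = TRUE) <= allow
  # Opposite pattern: 0-2, 1-1, 2-0 (flip y's coding before comparing)
  opposite <- mean(x != (2 - y), na.rm = TRUE) <= allow
  same || opposite
}

x <- c(0, 1, 2, 2, 1, 0, 0, 2)
res_same     <- is_identical_sketch(x, x)                           # TRUE
res_opposite <- is_identical_sketch(x, 2 - x)                       # TRUE
res_diff     <- is_identical_sketch(x, c(0, 0, 0, 0, 1, 1, 1, 1))   # FALSE
```

Flipping with `2 - y` leaves heterozygous calls (1) unchanged and swaps the two homozygous codes, which is why 1-1 appears in both the identical and the opposite pattern.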

Returns a logical value: TRUE if the two SNPs are identical (or entirely opposite) within the error margin, FALSE otherwise.

`GetHCS`, `snpRecode`, `toArray`

```
## Simulate random allele designations for 100 bi-allelic SNPs
set.seed(2016)
desig <- array(sample(c('A','C','G','T'), size = 200, replace = TRUE), dim = c(100, 2))
## Simulate random SNP genotypes for 20 individuals - put them in array format
## '-' indicates an unknown base
ga <- array(0, dim = c(20, 100))
for(i in 1:20)
  for(j in 1:100)
    ga[i, j] <- paste(sample(c(desig[j,], "-"), 2, prob = c(.47, .47, .06), replace = TRUE), collapse = '')
## Recode the matrix, place recoded genotypes in ga.r
desig <- data.frame(AlleleA_Forward = factor(desig[,1]), AlleleB_Forward = factor(desig[,2]))
ga.r <- array(5, dim = c(20, 100))
for(i in 1:100) ga.r[,i] <- snpRecode(ga[,i], desig[i,])
## Check whether the first 2 SNPs are identical,
## allowing a margin of error of 0.5%
is.identical(ga.r[,1], ga.r[,2], allow = .005)
# [1] FALSE
## Create an instance of exactly identical SNP genotypes
ga.r <- cbind(ga.r, ga.r[,1]) # SNP #1 and #101 are exactly identical
is.identical(ga.r[,1], ga.r[,101], allow = 0)
# [1] TRUE
## Create an instance of identical SNP genotypes with a 5% error
ga.r <- cbind(ga.r, ga.r[,1]) # SNP #1 and #102 are 100% identical
ga.r[20,102] <- 2 # a different genotype, to make SNP #1 & #102 only 95% identical
is.identical(ga.r[,1], ga.r[,102]) # use default allow of .005
# [1] FALSE
is.identical(ga.r[,1], ga.r[,102], allow = .05) # allow for a 5% margin of error
# [1] TRUE
```
