Given individual genotypes of a set of SNPs, the function checks for the existence of redundant SNPs to exclude. A margin of error of 0.5% is allowed by default.
ga.r | Matrix of genotypes created by
Exclude | Numeric indices or column names of
allow | Allowed margin of error; default is 0.005.
Tests for similar SNP genotypes across a set of individuals. SNPs are considered identical if the number of differing genotypes in the population tested remains below the allowed error margin (0.5% by default). For example, if Exclude <- 1:100 and SNP #1 is similar to SNP #25, then Exclude[25] is flagged for exclusion, whereas Exclude[1] is not.
In addition to identical SNPs, the function also flags as redundant any SNP genotypes that are entirely opposite within the error margin. Thus, SNPs are declared highly correlated if their genotypes are all the same (0-0, 1-1, and 2-2) or all opposite (0-2, 1-1, and 2-0) within the specified error margin.
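The pairwise rule described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `is_redundant_pair` is a hypothetical helper, genotypes are assumed to be coded 0/1/2 with no missing values, and the margin is assumed to be inclusive and to apply to the share of individuals that disagree.

```r
# Hypothetical helper, not part of the package: TRUE when two SNP genotype
# vectors (coded 0/1/2, no missing values assumed) are redundant.
is_redundant_pair <- function(x, y, allow = 0.005) {
  stopifnot(length(x) == length(y))
  same     <- mean(x != y)      # share of individuals breaking 0-0, 1-1, 2-2
  opposite <- mean(x != 2 - y)  # share of individuals breaking 0-2, 1-1, 2-0
  # Assumption: the margin is inclusive and is a share of individuals
  min(same, opposite) <= allow
}

snp_a <- c(0, 1, 2, 2, 1, 0)
is_redundant_pair(snp_a, snp_a)                # identical genotypes
is_redundant_pair(snp_a, 2 - snp_a)            # entirely opposite genotypes
is_redundant_pair(snp_a, c(0, 0, 0, 1, 1, 2))  # unrelated genotypes
```

The first two calls return TRUE and the third returns FALSE, since half or more of the individuals disagree under both the "same" and the "opposite" comparison.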
If Exclude contains SNP names, a character vector of excluded SNPs is returned; if it contains integer values, a numeric vector of excluded SNPs is returned.
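To illustrate how the return type follows the type of Exclude, here is a rough sketch of a scan that flags the later SNP of each redundant pair. `flag_redundant` is a hypothetical stand-in and not the package's `GetHCS`. It assumes 0/1/2 coding without missing values, an inclusive margin, and comparison of each SNP against every earlier entry of the vector.

```r
# Hypothetical sketch, not the package's GetHCS(): flag every SNP in `exclude`
# that is redundant with an earlier SNP in `exclude`.
flag_redundant <- function(ga, exclude = seq_len(ncol(ga)), allow = 0.005) {
  flagged <- rep(FALSE, length(exclude))
  for (k in seq_along(exclude)[-1]) {
    y <- ga[, exclude[k]]
    for (m in seq_len(k - 1)) {
      x <- ga[, exclude[m]]
      # Redundant if (nearly) all genotypes are the same or all are opposite
      if (min(mean(x != y), mean(x != 2 - y)) <= allow) {
        flagged[k] <- TRUE
        break
      }
    }
  }
  exclude[flagged]  # subsetting keeps the type of `exclude`
}

# Toy matrix: rs3 is the exact opposite of rs1
ga_toy <- cbind(rs1 = c(0, 1, 2, 2), rs2 = c(0, 0, 1, 2), rs3 = c(2, 1, 0, 0))
flag_redundant(ga_toy)                              # integer index of rs3
flag_redundant(ga_toy, exclude = colnames(ga_toy))  # character name "rs3"
```

Because the result is obtained by subsetting `exclude` itself, names in give names out and indices in give indices out.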
is.identical, snpRecode, toArray
## Simulate random allele designations for 100 bi-allelic SNPs
set.seed(2016)
desig <- array(sample(c('A','C','G','T'), size = 200, replace = TRUE), dim = c(100, 2))
## Simulate random SNP genotypes for 20 individuals - put them in array format
## '-' indicates an unknown base
ga <- array(0, dim=c(20, 100))
for(i in 1:20)
  for(j in 1:100)
    ga[i, j] <- paste(sample(c(desig[j,], "-"), 2, prob = c(.47, .47, .06), replace = TRUE), collapse = '')
## Recode the matrix, place recoded genotypes in ga.r
desig <- data.frame(AlleleA_Forward = factor(desig[,1]), AlleleB_Forward = factor(desig[,2]))
ga.r <- array(5, dim=c(20, 100))
for(i in 1:100) ga.r[,i] <- snpRecode(ga[,i], desig[i,])
## Check all SNP genotypes in ga.r for similarity across individuals
## Allow for a margin of error of 0.5%
GetHCS(ga.r)
#[1] 42 91 # SNPs 42 & 91 are similar to earlier SNPs in the vector, 'Exclude'
## Check SNP genotypes from 1 to 50 for similarity across individuals
GetHCS(ga.r, Exclude=1:50)
#[1] 42