Main function to detect mislabeled samples using a perturbation strategy.
phenotype |
phenotype data: an nTrait-by-nSample matrix |
genotype |
genotype data: an nMarker-by-nSample matrix with two alleles coded as 0 and 1 (or A and B), or three alleles coded as 0, 0.5 and 1 (or A, H and B), where 0.5 (or H) represents the heterozygous allele. |
fileName |
output file name. If NULL (default), the output files are given names starting with "test". |
thres |
probability threshold used to decide whether a sample is mislabeled based on the permutation result (Default = 0.9). |
optGT |
If TRUE, the optimal genotype is recovered from the given phenotype. |
optGTplot |
If TRUE, it produces a plot of the genotype in two colors: green and red indicate that the original genotype of a sample (column) at a given marker (row) is correct or incorrect, respectively. |
optGT.thres |
threshold used to decide whether the original genotype is correct |
permu |
If TRUE, permutation is performed to estimate the likelihood of each sample being mislabeled. |
n.permu |
The number of permutations to be performed. |
wls.score.permu |
A vector whose elements are WLS scores from permutation, which can be obtained using the function permutation, e.g. wls.score.permu <- permutation(phenotype, genotype, n.permu=1000, process=TRUE, fileName="test", t.thres=3) |
process |
If TRUE, it prints which step has been finished. Default = TRUE. |
t.thres |
threshold for deciding which QTLs are significant (t.test); the significant QTLs are used to detect mislabeled samples |
GT.ref |
reference genotype data from a large collection of strains. This is used to search for the best-matched genotype for the identified mislabeled samples. Default = NULL. If GT.ref is NULL, the original input genotype data will be used to search for the best-matched genotype for the identified mislabeled samples. |
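The input shapes described in the arguments above can be illustrated with a small base-R sketch. The toy data, the object names (gt_letters, code_map, gt_numeric, ph) and the recoding step are assumptions made for illustration only; they are not part of the package, which according to the argument description may also accept the letter coding directly.

```r
# Hypothetical toy genotype: 4 markers (rows) x 5 samples (columns),
# written with the letter coding A / H / B.
gt_letters <- matrix(
  c("A", "B", "H", "A", "B",
    "A", "A", "H", "B", "B",
    "B", "B", "A", "H", "A",
    "A", "H", "B", "B", "A"),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("marker", 1:4), paste0("sample", 1:5))
)

# Recode letters to the numeric coding 0 / 0.5 / 1 (H = heterozygous).
code_map <- c(A = 0, H = 0.5, B = 1)
gt_numeric <- matrix(
  unname(code_map[as.vector(gt_letters)]),
  nrow = nrow(gt_letters),
  dimnames = dimnames(gt_letters)
)

# Hypothetical phenotype: 3 traits (rows) x the same 5 samples (columns).
set.seed(1)
ph <- matrix(
  rnorm(3 * 5),
  nrow = 3,
  dimnames = list(paste0("trait", 1:3), colnames(gt_numeric))
)

# Sanity checks: samples must line up column by column in both matrices,
# and the genotype may only contain the documented codes.
stopifnot(
  identical(colnames(ph), colnames(gt_numeric)),
  all(gt_numeric %in% c(0, 0.5, 1))
)
```

Checking that the sample columns of both matrices are in the same order before calling reGenotyper is a cheap safeguard, since both inputs are indexed by sample along the columns.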
An object of class wls. A list with elements:
wls.score |
a vector with length being the number of samples; each element gives the score for the sample being mislabeled |
wls.names |
the names of the samples that are detected as mislabeled using the Z score method
gt.opt |
the optimal genotype recovered from the given phenotype data
wls.pValue |
p value for each sample using permutation, only when |
wls.score.permu |
a vector of length n.permu. Each element represents the score of a randomly selected sample with permuted genotype, only when
thres |
the probability threshold used to decide whether a sample is mislabeled based on the permutation result
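To make the relationship between per-sample scores, permutation scores and thres more concrete, here is a minimal base-R sketch of one plausible way to compare them. It is an assumption for illustration, not the package's documented computation: the scores are simulated, and the names wls_score, null_score, prob_above_null and flagged are hypothetical.

```r
set.seed(42)

# Hypothetical per-sample scores (standing in for wls.score);
# the last two samples have clearly elevated scores.
wls_score <- c(s1 = 0.2, s2 = -0.5, s3 = 0.1, s4 = 0.4, s5 = 4.0, s6 = 5.2)

# Hypothetical null scores (standing in for wls.score.permu),
# simulated here instead of being obtained by permuting genotypes.
null_score <- rnorm(1000)

# For each sample, the fraction of null scores that fall below its score.
prob_above_null <- vapply(
  wls_score,
  function(s) mean(null_score < s),
  numeric(1)
)

# Flag samples whose score exceeds more than `thres` of the null scores.
thres <- 0.9
flagged <- names(wls_score)[prob_above_null > thres]
flagged
```

In this sketch only samples whose score lies in the upper tail of the simulated null distribution are flagged; with real data the null scores would come from the permutation step rather than from rnorm.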
Yang Li <yang.li@rug.nl>
Li Y. et al, reGenotyper: detecting mislabeled samples in genetic data (submitted)
optimalGT, permutation, tMatFunction, genotype, phenotype
library(reGenotyper)
#load example genotype and phenotype data
data(genotype)
data(phenotype)
### For this test dataset 5 permutations are enough. In a real case at least a few hundred
### permutations are needed.
wlsObject <- reGenotyper(phenotype, genotype, fileName = "test", thres = 0.9, optGT = TRUE,
optGTplot = FALSE, optGT.thres = 0, permu = TRUE, n.permu = 5, wls.score.permu = NULL,
process = TRUE, t.thres = 1.5, GT.ref=NULL)
### Inspect the output
wlsObject
plot(wlsObject)
### previous line takes around 30s to execute, you can also load the result:
data(wlsObject)