KAML-package | R Documentation |
KAML was originally designed to predict phenotypic values from genome- or chromosome-wide SNPs, both for simple traits that are controlled by a limited number of major markers and for complex traits that are influenced by many minor polygenes. In brief, KAML incorporates pseudo QTNs as fixed effects and a trait-specific K matrix as a random effect in a mixed linear model. Both the pseudo QTNs and the trait-specific K matrix are optimized using a parallel-accelerated machine learning strategy.
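As a rough illustration of the model described above, a mixed linear model with pseudo QTNs as fixed effects and a kinship-based random effect can be written in generic notation as follows. The symbols are assumptions chosen for this sketch and are not taken from the package: y is the phenotype vector, X holds the covariates with effects \beta, Q holds the pseudo-QTN genotypes with effects \gamma, u is the random genetic effect whose covariance is proportional to the trait-specific K matrix, and e is the residual.

```latex
y = X\beta + Q\gamma + u + e, \qquad
u \sim N(0,\, K\sigma_u^2), \qquad
e \sim N(0,\, I\sigma_e^2)
```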
KAML(pfile="", gfile="", kfile=NULL, dcovfile=NULL, qcovfile=NULL,
pheno=1, SNP.weight=NULL, GWAS.model=c("MLM","GLM", "RR"), GWAS.npc=NULL,
prior.QTN=NULL, prior.model=c("QTN+K", "QTN", "K"),
vc.method=c("brent", "he", "emma"),
Top.perc=c(1e-4, 1e-3, 1e-2, 1e-1), Top.num=15,
Logx=c(1.01, 1.11, exp(1), 10), qtn.model=c("MR", "SR", "BF"),
BF.threshold=NULL, binary=FALSE, bin.size=1000000, max.nQTN=TRUE,
sample.num=2, SNP.filter=NULL, crv.num=5, cor.threshold=0.3,
count.threshold=0.9, step=NULL,
bisection.loop=10, ref.gwas=TRUE,
theSeed=666666, file.output=TRUE, cpu=10
)
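A minimal sketch of a call, using only arguments that appear in the usage above. The file names are hypothetical placeholders, and the sketch assumes that gfile is the shared prefix of the three genotype files, as the gfile argument description suggests. The call is guarded so that the snippet does nothing harmful when the package or the files are missing.

```r
# Hypothetical input names; replace them with your own files.
pheno_file  <- "example.Pheno.txt"
geno_prefix <- "example"

# Assumption: gfile is the common prefix of these three genotype files.
geno_files <- paste0(geno_prefix, c(".geno.desc", ".geno.bin", ".map"))
inputs <- c(pheno_file, geno_files)

if (requireNamespace("KAML", quietly = TRUE) && all(file.exists(inputs))) {
  # Only arguments from the usage are set; all others keep their defaults.
  fit <- KAML::KAML(pfile = pheno_file, gfile = geno_prefix, pheno = 1, cpu = 2)
} else {
  message("KAML or the input files are not available; skipping the run.")
}
```

No assumption is made here about the structure of the returned object.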
pfile |
phenotype file with one column per trait; a name must be provided for each column (NA values are allowed) |
gfile |
genotype files, including "gfile.geno.desc", "gfile.geno.bin" and "gfile.map" |
kfile |
n*n, optional, a user-provided kinship file for all individuals |
dcovfile |
n*x, optional, a user-provided file of discrete covariates |
qcovfile |
n*x, optional, a user-provided file of quantitative covariates |
pheno |
the phenotype column to use from the phenotype file (default 1) |
SNP.weight |
user-provided weights for all SNPs |
GWAS.model |
the model to use for GWAS (only "GLM" and "MLM" can be selected at present) |
GWAS.npc |
the number of PCs to add as covariates to control for population structure |
prior.QTN |
the prior QTNs to add as covariates; if prior QTNs are provided, KAML will not optimize the QTNs or the model during cross-validation |
prior.model |
the prior model to use with the prior.QTN that are added as covariates |
vc.method |
the method for variance component estimation ("brent", "he", "emma", "ai") |
Top.perc |
a vector; in each iteration, a subset of top SNPs is amplified when calculating kinship |
Top.num |
a number; in each iteration, a subset of top SNPs is used as covariates |
Logx |
a vector of bases for the logarithm |
qtn.model |
the strategy for selecting pseudo QTNs, one of c("MR", "SR", "BF") |
BF.threshold |
the threshold for the BF method |
binary |
whether the phenotype is case-control |
bin.size |
the size of each bin |
max.nQTN |
whether to limit the maximum number of Top.num |
sample.num |
the number of samplings used in cross-validation |
SNP.filter |
SNPs whose P-value is below this threshold will be deleted |
crv.num |
the number of folds used in cross-validation |
cor.threshold |
if a top SNP that is used as a covariate is highly correlated with others, it will be deleted |
count.threshold |
if the number of times a SNP is selected across all iterations is >= sample.num*crv.num*count.threshold, then it will be treated as a covariate in the final prediction model |
step |
to control the memory usage |
bisection.loop |
the maximum number of iterations of the bisection algorithm |
ref.gwas |
whether to run GWAS on the reference population (if not, KAML will merge the GWAS results of all cross-validation runs by taking their mean) |
theSeed |
the random seed |
file.output |
whether to write the predicted values to a file |
cpu |
the number of CPUs to use for calculation |
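The selection rule given for count.threshold can be worked through with the default values from the usage. This is only a sketch of the stated inequality, not the package's internal code; the SNP names and selection counts are made up for illustration.

```r
# Defaults taken from the usage section.
sample.num      <- 2
crv.num         <- 5
count.threshold <- 0.9

# Total number of cross-validation iterations and the resulting cutoff.
n_iter <- sample.num * crv.num      # 10 iterations
cutoff <- n_iter * count.threshold  # a SNP must be selected at least 9 times

# Hypothetical counts of how often each SNP was selected across iterations.
counts <- c(snp_a = 10, snp_b = 9, snp_c = 4)

# SNPs meeting the rule count >= sample.num * crv.num * count.threshold.
kept <- names(counts)[counts >= cutoff]
print(kept)
```

With these made-up counts, snp_a and snp_b meet the cutoff and snp_c does not.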
Package: | KAML |
Type: | Package |
Version: | 1.2.0 |
Date: | 2021-11-04 |
License: | GPL(>=3) |
Lilin Yin, Haohao Zhang and Xiaolei Liu
Maintainer:
Lilin Yin <ylilin@163.com>
Xiaolei Liu <xiaoleiliu@mail.hzau.edu.cn>
For more information, see: https://github.com/YinLiLin/KAML