association.test | R Documentation |
Association tests between phenotype and SNPs.
association.test(x, Y = x@ped$pheno, X = matrix(1, nrow(x)),
method = c("lm", "lmm"), response = c("quantitative", "binary"),
test = c("score", "wald", "lrt"), K, eigenK, beg = 1,
end = ncol(x), p = 0, tol = .Machine$double.eps^0.25, ...)
x |
A |
Y |
The phenotype vector. Default is the column ( |
X |
A covariable matrix. The default is a column vector of ones, to include an intercept in the model |
method |
Method to use: |
response |
Is |
test |
Which test to use. For binary phenotypes, |
K |
A Genetic Relationship Matrix (as produced by |
eigenK |
Eigen decomposition of the Genetic Relationship Matrix (as produced by the function |
beg |
Index of the first SNP tested for association |
end |
Index of the last SNP tested for association |
p |
Number of Principal Components to include in the model with fixed effect (for |
tol |
Parameter for the likelihood maximization (as in |
... |
Additional parameters for |
Tests the association between the phenotype and requested SNPs in x
.
If method = "lm"
and response = "quantitative"
are used, a simple linear regression
is performed to test each SNP in the considered interval. Precisely, the following model is
considered for each SNP,
Y = (X|PC)\alpha + G\beta + \varepsilon
with \varepsilon \sim N(0,\sigma^2 I_n)
,
G
the genotype vector of the SNP,
X
the covariates matrix, and PC
the matrix of the first p
principal components.
A Wald test is performed, independently of the value of test
.
Ifmethod = "lm"
and response = "binary"
, a similar model is used for a logistic
regression (Wald test).
If method = "lmm"
and response = "quantitative"
, the following model in considered for each SNP
Y = (X|PC)\alpha + G\beta + \omega + \varepsilon
with \omega \sim N(0,\tau K)
and \varepsilon \sim N(0,\sigma^2 I_n)
.
G
is the genotype vector of the SNP, K
is a Genetic Relationship Matrix (GRM)
X
the covariates matrix, and PC
the matrix of the first p
principal components.
If test = "score"
, all parameters are estimated with the same procedure as in
lmm.aireml
and the argument K
is used to specify the GRM matrix (or a list of GRM
matrices for inclusion of several random effects in the model). If p
is positive, the paramater eigenK
needs to be given as well.
For Wald and LRT tests the procedure used is the same as in lmm.diago
and eigenK
is used to
specify the GRM matrix.
If method = "lmm"
and response = "binary"
, the following model in considered for each SNP
\mbox{logit}(P[Y=1| X, G, \omega]) = X\alpha + G\beta + \omega
with \omega \sim N(0,\tau K)
.
G
is the genotype vector of the SNP, K
is a Genetic Relationship Matrix (GRM),
X
the covariable matrix. A score test is performed, independently of the value of test
.
All parameters under null model are estimated with the same procedure as in logistic.mm.aireml
.
In case of convergence problems of the null problem, the user can try several starting values (in particular
with parameter tau
, trying e.g. tau = 0.1
or another value).
It is possible to give a list of matrices in parameter K
for inclusion of several random effects in the model.
If p
is positive, the paramater eigenK
needs to be given as well.
Note: this function is not multithreaded. Wald test with Linear Mixed Models are computationally intensive,
to run a GWAS with such tests consider using association.test.parallel
in package gaston.utils
(on github). Association tests with dosages can be done with association.test.dosage
and
association.test.dosage.parallel
in the same package.
A data frame, giving for each considered SNP, its position, id, alleles, and
some of the following columns depending on the values of method
and test
:
score |
Score statistic for each SNP |
h2 |
Estimated value of |
beta |
Estimated value of |
sd |
Estimated standard deviation of the |
LRT |
Value of the Likelihood Ratio Test |
p |
The corresponding p-value |
qqplot.pvalues
, manhattan
, lmm.diago
,
lmm.aireml
, logistic.mm.aireml
, optimize
# Load data
data(TTN)
x <- as.bed.matrix(TTN.gen, TTN.fam, TTN.bim)
standardize(x) <- "p"
# simulate quantitative phenotype with effect of SNP #631
set.seed(1)
y <- x %*% c(rep(0,630),0.5,rep(0,ncol(x)-631)) + rnorm(nrow(x))
# association test with linear model
test <- association.test(x, y, method="lm", response = "quanti")
# a p-values qq plot
qqplot.pvalues(test)
# a small Manhattan plot
# hihlighting the link between p-values and LD with SNP #631
lds <- LD(x, 631, c(1,ncol(x)))
manhattan(test, col = rgb(lds,0,0), pch = 20)
# use y to simulate a binary phenotype
y1 <- as.numeric(y > mean(y))
# logistic regression
t_binary <- association.test(x, y1, method = "lm", response = "binary")
# another small Manhattan plot
manhattan(t_binary, col = rgb(lds,0,0), pch = 20)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.