opts_chunk$set(comment = NA, results = 'markup', tidy = F, message = F, warning = F, echo = T, cache = T, fig.width = 6, fig.height = 6)
library(devtools) load_all("~/git/variani/matlm/") load_all("~/git/variani/qq/")
library(pander) library(ggplot2) theme_set(theme_light())
N <- 2e3 M <- 2e3 seed <- 1 rho <- 0.9
simpred_uncor <- matlm_sim_randpred(seed = seed, N = N, M = M)
assoc_uncor <- with(simpred_uncor, matlm(form, dat, pred = pred))
qq_plot(assoc_uncor$tab$pval)
simpred_cor <- matlm_sim_randpred(seed = seed, N = N, M = M, rho = rho)
assoc_cor <- with(simpred_cor, matlm(form, dat, pred = pred))
qq_plot(assoc_cor$tab$pval)
The covariance matrix is pre-defined in matlm_sim_randpred
function
and has a simple form:
rho
(equal here to r rho
).cmat <- matlm_sim_randpred(seed = seed, N = N, M = M, rho = rho, ret = "mat")
# number of predictors M # dimenstions of matrix dim(cmat) # a sub-matrix cmat[1:5, 1:5]
It can be shown [@Joo2016] that the covariance matrix among test statistics (t-test) $s_i$ is the correlation matrix among predictors $x_i$:
$cov(s_i, s_j) = cor(x_i, x_j)$
This basic relationship is true for the simplest linear regression model:
$y = \mu + \beta x_i + e$
$e \sim \mathcal{N}(0, \sigma_e^2)$
In other cases, e.g. related observations [@Joo2016], some modifications are required.
C <- matrix(rho, M, M) diag(C) <- 1 ch <- chol(C) ch_inv <- solve(ch)
s_assoc <- assoc_cor$tab$zscore s_corrected <- as.numeric(s_assoc %*% ch_inv) pvals_corrected <- pchisq(s_corrected^2, 1, lower.tail = FALSE)
qq_plot(pvals_corrected)
[@Conneely2007] introduced $P_{act}$:
To calculate $P_{perm}$, we first created 1,000 permutations of the original data by randomly shuffling individual genotype vectors while leaving the trait data and any covariates intact. In this way, the permuted samples simulated the null hypothesis of no association but maintained the original correlation between genotypes, between traits, and between traits and covariates. We tested each of these 1,000 samples for association and estimated $P_{perm}$ as the proportion of samples with a $P_{min}$ value as low as that observed in the original data
data(pacts) tab <- with(pacts, within(tab, { alpha <- alpha pval_BF <- alpha / M N <- N N_effective <- round(N * (pact[1] / pact), 0) }))
pander(tab)
In our simulations, we computed $P_{act}$ as an alternative to the Bonferroni correction,
while varying the value of rho
.
r pacts$N
r pacts$M
r pacts$L
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.