qtlrelPerChr | R Documentation |
Given I genotypes, P SNPs, Q covariates (e.g. replicates) and N=I*Q phenotypes, the whole likelihood is: y = W alpha + Z X beta + epsilon, where y is Nx1, W is NxQ, Z is NxI, X is IxP and u = X beta. The QTLRel package (Cheng et al, BMC Genetics, 2011) decomposes the inference into an ad hoc procedure of two steps. First, the variance components and fixed effects are estimated: y = W alpha + Z u + epsilon. Second the allele effects are tested SNP per SNP: for all p in 1,..,P, y = W alpha + Z x_p beta_p + Z u + epsilon, using the fixed effects and variance components estimated in the first step. As the SNPs can be in linkage disequilibrium, it is advised (Yang et al, Nature Genetics, 2014) to perform the procedure chromosome per chromosome, which is the goal of this function; but I advise to also run it with chr.ids=NULL.
qtlrelPerChr(
y,
X,
snp.coords,
thresh = 0.01,
chr.ids = NULL,
W = NULL,
Z = NULL,
method.A = "vanraden1",
verbose = 1
)
y |
vector or one-column matrix of phenotypes (the order is important and should be in agreement with the other arguments X, W and Z) |
X |
matrix of bi-allelic SNP genotypes encoded in allele doses in 0,1,2, with genotypes in rows and SNPs in columns |
snp.coords |
data.frame with SNP identifiers as row names and two columns named "chr" and "coord" (or "pos") |
thresh |
threshold on minor allele frequencies below which SNPs are ignored via |
chr.ids |
vector of chromosome identifiers to analyze (if NULL, the regular QTLRel procedure is launched, i.e. all chromosomes are used to estimate the variance components) |
W |
incidence matrix of covariates (should not contain the column of 1's for the intercept; use |
Z |
incidence matrix relating phenotypes to genotypes (if nrow(y) and nrow(X) are different, diagonal otherwise; use |
method.A |
method to estimate the additive relationships (see |
verbose |
verbosity level (0/1) |
a list with three data.frames as components, variance.components, fixed.effects and scan
Timothee Flutre
## Not run: ## simulate data
set.seed(1859)
I <- 200
P <- 2000
Q <- 3
N <- Q * I
dat <- data.frame(ind=as.factor(rep(paste0("ind", 1:I), times=Q)),
year=as.factor(rep(paste0(2003:(2003+Q-1)), each=I)))
W <- stats::model.matrix(~ year, dat)
alpha <- rnorm(n=Q, mean=50, sd=30)
X <- simulGenosDose(nb.genos=I, nb.snps=P)
beta <- rnorm(n=P, mean=0, sd=2)
Z <- stats::model.matrix(~ ind - 1, dat)
dat$response <- as.vector(W %*% alpha + Z %*% X %*% beta + rnorm(N))
snp.coords <- data.frame(chr=sample(paste0("chr",1:10), P, TRUE),
coord=sample.int(10^6, P))
rownames(snp.coords) <- colnames(X)
## perform the inference
library(QTLRel)
res <- qtlrelPerChr(dat$response, X, snp.coords, 0.01, "chr1", W=W[,-1], Z=Z)
## res <- qtlrelPerChr(dat$response, X, snp.coords, 0.01, NULL, W=W[,-1], Z=Z)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.