Frequentist linear mixed model

Description

Maximum likelihood estimates for the parameters of a linear mixed model.

Usage

1
ghap.lmm(fixed,random,weights=NULL,env.eff=FALSE,data,K,verbose=TRUE)

Arguments

fixed

Formula describing the fixed effects part of the model, e.g. y ~ a + b + c ... If the model does not include any covariate simply state the response variable with an intercept, i.e. y ~ 1.

random

A character value with name of the column containing labels for the random effects.

weights

A numeric vector with residual weights. If not supplied, all residual weights are set to 1.

env.eff

A logical value indicating if permanent environmental effects should be included (default=FALSE).

data

A dataframe containing the data.

K

A covariance matrix for random effects.

verbose

A logical value specfying whether log messages should be printed (default = TRUE).

Details

The function uses a frequentist framework to fit the following linear mixed model:

\mathbf{y} = \mathbf{Xb} + \mathbf{Zu} + \mathbf{Zp} + \mathbf{e}

where \mathbf{X} is a matrix relating \mathbf{y} to the vector of fixed effects \mathbf{b}, \mathbf{Z} is an incidence matrix relating \mathbf{y} to random effects \mathbf{u} and \mathbf{p}, and \mathbf{e} is the vector of residuals. The likelihood of the data is assumed:

\mathbf{y} \mid \mathbf{b},\mathbf{u},\mathbf{p},σ_{u}^{2},σ_{p}^{2},σ_{e}^2 \sim N(\mathbf{Xb},\mathbf{V})

where \mathbf{V} = \mathbf{ZKZ}'σ_{u}^2 + \mathbf{ZZ}'σ_{p}^2 + \mathbf{W}σ_{e}^2, \mathbf{K} is a covariance matrix for \mathbf{u}, σ_{u}^{2} and σ_{p}^{2} are the variances of \mathbf{u} and \mathbf{p}, respectively, \mathbf{W} is a residual covariance matrix and σ_{e}^{2} is the residual variance. The current implementation assumes \mathbf{W} = diag(w_i). More details about the maximization algorithm can be found in our vignette.

Value

The returned GHap.blmm object is a list with the following items:

b

A numeric vector containing the BLUE of fixed effects.

u

A numeric vector containing BLUP of correlated random effects.

p

A numeric vector containing the BLUP of permanent environmental effects. This vector is suppressed if env.eff=FALSE.

varu

A numeric value for the maximum likelihood estimate of the variance of correlated random effects.

varp

A numeric value for the maximum likelihood estimate of the variance of permanent environmental effects. This value is suppressed if env.eff=FALSE.

vare

A numeric value for the maximum likelihood estimate of the variance of residual variance.

h2

A numeric value for the maximum likelihood estimate of the variance explained by correlated random effects only.

H2

A numeric value for the maximum likelihood estimate of the variance explained by random effects. This value is suppressed if env.eff=FALSE.

k

A numeric vector containing the solutions for \mathbf{K}^{-1}\mathbf{\hat{u}}. This vector is used by the ghap.blup function.

y

A numeric vector containing the records used to fit the model.

weights

A numeric vector containing the residual weights used to fit the model.

residuals

A numeric vector containing residuals computed based on the BLUE and BLUP solutions.

pdev

Deviance evaluated at the BLUE and BLUP solutions.

Author(s)

Yuri Tani Utsunomiya <ytutsunomiya@gmail.com>

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
# #### DO NOT RUN IF NOT NECESSARY ###
# 
# # Copy the example data in the current working directory
# ghap.makefile()
# 
# # Load data
# phase <- ghap.loadphase("human.samples", "human.markers", "human.phase")
# 
# # Subset data - randomly select 3000 markers with maf > 0.02
# maf <- ghap.maf(phase, ncores = 2)
# set.seed(1988)
# markers <- sample(phase$marker[maf > 0.02], 3000, replace = FALSE)
# phase <- ghap.subsetphase(phase, unique(phase$id), markers)
# rm(maf,markers)
# 
# # Generate block coordinates based on windows of 10 markers, sliding 5 marker at a time
# blocks <- ghap.blockgen(phase, 10, 5, "marker")
# 
# # Generate matrix of haplotype genotypes
# ghap.haplotyping(phase, blocks, batchsize = 100, ncores = 2, freq = 0.05, outfile = "example")
# 
# # Load haplotype genotypes
# haplo <- ghap.loadhaplo("example.hapsamples", "example.hapalleles", "example.hapgenotypes")
# 
# # Compute kinship matrix
# K <- ghap.kinship(haplo, batchsize = 100)
# 
# # Quantitative trait with 50% heritability
# # One major haplotype accounting for 30% of the genetic variance
# sim <- ghap.simpheno(haplo = haplo, K = K, h2 = 0.5, g2 = 0.3, major = 1000,seed=1988)
# 
# 
# ### RUN ###
# 
# #Continuous model
# model <- ghap.lmm(fixed = phenotype ~ 1, random = "individual", data = sim$data, K = K)
# model$h2
# plot(model$u,sim$u, ylab="True Breeding Value", xlab="Estimated Breeding Value")
# cor(model$u,sim$u)