View source: R/EMM_functions_cpp.R

EM3.general | R Documentation |

This function solves the following multi-kernel linear mixed effects model
using `MMEst`

function in 'MM4LMM' package,
`lmm.aireml`

or `lmm.diago`

functions in 'gaston' package,
or `EM3.cpp`

function in 'RAINBOWR' package.

`y = X \beta + \sum _{l=1} ^ {L} Z _ {l} u _ {l} + \epsilon`

where `Var[y] = \sum _{l=1} ^ {L} Z _ {l} K _ {l} Z _ {l}' \sigma _ {l} ^ 2 + I \sigma _ {e} ^ {2}`

.

```
EM3.general(
y,
X0 = NULL,
ZETA,
eigen.G = NULL,
package = "gaston",
tol = NULL,
n.core = 1,
optimizer = "nlminb",
REML = TRUE,
pred = TRUE,
return.u.always = TRUE,
return.u.each = TRUE,
return.Hinv = TRUE,
recheck.RAINBOWR = TRUE,
var.ratio.range = c(1e-09, 1e+07)
)
```

`y` |
A |

`X0` |
A |

`ZETA` |
A list of variance matrices and its design matrices of random effects. You can use more than one kernel matrix. For example, ZETA = list(A = list(Z = Z.A, K = K.A), D = list(Z = Z.D, K = K.D)) (A for additive, D for dominance) Please set names of lists "Z" and "K"! |

`eigen.G` |
A list with - $values
Eigen values - $vectors
Eigen vectors
The result of the eigen decompsition of |

`package` |
Package name to be used in this function. We only offer the following three packages: "RAINBOWR", "MM4LMM" and "gaston". Default package is 'gaston'. |

`tol` |
The tolerance for detecting linear dependencies in the columns of G = ZKZ'. Eigen vectors whose eigen values are less than "tol" argument will be omitted from results. If tol is NULL, top 'n' eigen values will be effective. |

`n.core` |
Setting n.core > 1 will enable parallel execution on a machine with multiple cores. (‘n.core' will be replaced by 1 for 'package = ’gaston'') |

`optimizer` |
The function used in the optimization process. We offer "optim", "optimx", and "nlminb" functions. This argument is only valid when ‘package = ’RAINBOWR''. |

`REML` |
You can choose which method you will use, "REML" or "ML". If REML = TRUE, you will perform "REML", and if REML = FALSE, you will perform "ML". |

`pred` |
If TRUE, the fitting values of y is returned. |

`return.u.always` |
When using the "gaston" package with missing values or
using the "MM4LMM" package (with/without missings), computing BLUP will take
some time in addition to solving the mixed-effects model. You can choose
whether BLUP ('u'; |

`return.u.each` |
If TRUE, the function also computes each BLUP corresponding to different kernels (when solving multi-kernel mixed-effects model). It takes additional time compared to the one with 'return.u.each = FALSE' when using packages other than 'RAINBOWR'. |

`return.Hinv` |
If TRUE, |

`recheck.RAINBOWR` |
When you use the package other than 'RAINBOWR' and the ratio of variance components is out of the range of 'var.ratio.range', the function will solve the mixed-effects model again with 'RAINBOWR' package, if 'recheck.RAINBOWR = TRUE'. |

`var.ratio.range` |
The range of variance components to check that the results by the package other than RAINBOWR is correct or not when 'recheck.RAINBOWR = TRUE'. |

- $y.pred
The fitting values of y

`y = X\beta + Zu`

- $Vu
Estimator for

`\sigma^2_u`

, all of the genetic variance- $Ve
Estimator for

`\sigma^2_e`

- $beta
BLUE(

`\beta`

)- $u
BLUP(Sum of

`Zu`

)- $u.each
BLUP(Each

`u`

)- $weights
The proportion of each genetic variance (corresponding to each kernel of ZETA) to Vu

- $LL
Maximized log-likelihood (full or restricted, depending on method)

- $Vinv
The inverse of

`V = Vu \times ZKZ' + Ve \times I`

- $Hinv
The inverse of

`H = ZKZ' + \lambda I`

Kang, H.M. et al. (2008) Efficient Control of Population Structure in Model Organism Association Mapping. Genetics. 178(3): 1709-1723.

Zhou, X. and Stephens, M. (2012) Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 44(7): 821-824.

Johnson, D. L., & Thompson, R. (1995). Restricted maximum likelihood estimation of variance components for univariate animal models using sparse matrix techniques and average information. Journal of dairy science, 78(2), 449-456.

Hunter, D. R., & Lange, K. (2004). A tutorial on MM algorithms. The American Statistician, 58(1), 30-37.

Zhou, H., Hu, L., Zhou, J., & Lange, K. (2015). MM algorithms for variance components models. arXiv preprint arXiv:1509.07426.

Gilmour, A. R., Thompson, R., & Cullis, B. R. (1995), Average information REML: an efficient algorithm for variance parameter estimation in linear mixed models, Biometrics, 1440-1450.

`MMEst`

, `lmm.aireml`

, `lmm.diago`

```
### Import RAINBOWR
require(RAINBOWR)
### Load example datasets
data("Rice_Zhao_etal")
Rice_geno_score <- Rice_Zhao_etal$genoScore
Rice_geno_map <- Rice_Zhao_etal$genoMap
Rice_pheno <- Rice_Zhao_etal$pheno
### View each dataset
See(Rice_geno_score)
See(Rice_geno_map)
See(Rice_pheno)
### Select one trait for example
trait.name <- "Flowering.time.at.Arkansas"
y <- as.matrix(Rice_pheno[, trait.name, drop = FALSE])
### Remove SNPs whose MAF <= 0.05
x.0 <- t(Rice_geno_score)
MAF.cut.res <- MAF.cut(x.0 = x.0, map.0 = Rice_geno_map)
x <- MAF.cut.res$x
map <- MAF.cut.res$map
### Estimate additive genomic relationship matrix (GRM) & epistatic relationship matrix
K.A <- calcGRM(genoMat = x)
K.AA <- K.A * K.A ### additive x additive epistatic effects
### Modify data
Z <- design.Z(pheno.labels = rownames(y),
geno.names = rownames(K.A)) ### design matrix for random effects
pheno.mat <- y[rownames(Z), , drop = FALSE]
ZETA <- list(A = list(Z = Z, K = K.A),
AA = list(Z = Z, K = K.AA))
### Solve multi-kernel linear mixed effects model using gaston package (2 random efects)
EM3.gaston.res <- EM3.general(y = pheno.mat, X0 = NULL, ZETA = ZETA,
package = "gaston", return.u.always = TRUE,
pred = TRUE, return.u.each = TRUE,
return.Hinv = TRUE)
(Vu <- EM3.gaston.res$Vu) ### estimated genetic variance
(Ve <- EM3.gaston.res$Ve) ### estimated residual variance
(weights <- EM3.gaston.res$weights) ### estimated proportion of two genetic variances
(herit <- Vu * weights / (Vu + Ve)) ### genomic heritability (additive, additive x additive)
(beta <- EM3.gaston.res$beta) ### Here, this is an intercept.
u.each <- EM3.gaston.res$u.each ### estimated genotypic values (additive, additive x additive)
See(u.each)
### Perform genomic prediction with 10-fold cross validation using gaston package (multi-kernel)
noNA <- !is.na(c(pheno.mat)) ### NA (missing) in the phenotype data
phenoNoNA <- pheno.mat[noNA, , drop = FALSE] ### remove NA
ZETANoNA <- ZETA
ZETANoNA <- lapply(X = ZETANoNA, FUN = function (List) {
List$Z <- List$Z[noNA, ]
return(List)
}) ### remove NA
nFold <- 10 ### # of folds
nLine <- nrow(phenoNoNA)
idCV <- sample(1:nLine %% nFold) ### assign random ids for cross-validation
idCV[idCV == 0] <- nFold
yPred <- rep(NA, nLine)
for (noCV in 1:nFold) {
print(paste0("Fold: ", noCV))
yTrain <- phenoNoNA
yTrain[idCV == noCV, ] <- NA ### prepare test data
EM3.gaston.resCV <- EM3.general(y = yTrain, X0 = NULL, ZETA = ZETANoNA,
package = "gaston", return.u.always = TRUE,
pred = TRUE, return.u.each = TRUE,
return.Hinv = TRUE) ### prediction
yTest <- EM3.gaston.resCV$y.pred ### predicted values
yPred[idCV == noCV] <- yTest[idCV == noCV]
}
### Plot the results
plotRange <- range(phenoNoNA, yPred)
plot(x = phenoNoNA, y = yPred,xlim = plotRange, ylim = plotRange,
xlab = "Observed values", ylab = "Predicted values",
main = "Results of Genomic Prediction (multi-kernel)",
cex.lab = 1.5, cex.main = 1.5, cex.axis = 1.3)
abline(a = 0, b = 1, col = 2, lwd = 2, lty = 2)
R2 <- cor(x = phenoNoNA[, 1], y = yPred) ^ 2
text(x = plotRange[2] - 10,
y = plotRange[1] + 10,
paste0("R2 = ", round(R2, 3)),
cex = 1.5)
```

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.