NewtonRaphson Algorithm
Description
This function is used internally in the function mmer
when MORE than 1 variance component needs to be estimated through the use of the NewtonRaphson (NR22) algorithm.
Usage
1 2 3 4 5 
Arguments
y 
a numeric vector for the response variable 
X 
an incidence matrix for fixed effects. 
ZETA 
an incidence matrix for random effects. This can be for one or more random effects. This NEEDS TO BE PROVIDED AS A LIST STRUCTURE. For example Z=list(list(Z=Z1, K=K1), list(Z=Z2, K=K2), list(Z=Z3, K=K3)) makes a 2 level list for 3 random effects. The general idea is that each random effect with or without its variancecovariance structure is a list, i.e. list(Z=Z1, K=K1) where Z is the incidence matrix and K the varcov matrix. When moving to more than one random effect we need to make several lists that need to be inside another list. What we call a 2level list, i.e. list(Z=Z1, K=K1) and list(Z=Z2, K=K2) would need to be put in the form; list(list(Z=Z1, K=K1),list(Z=Z1, K=K1)), which as can be seen, is a list of lists (2level list). 
R 
a matrix for variancecovariance structures for the residuals, i.e. for longitudinal data. if not passed is assumed an identity matrix. 
draw 
a TRUE/FALSE value indicating if a plot of updated values for the variance components and the likelihood should be drawn or not. The default is TRUE. COMPUTATION TIME IS SMALLER IF YOU DON'T PLOT SETTING draw=FALSE 
REML 
a TRUE/FALSE value indicating if restricted maximum likelihood should be used instead of ML. The default is TRUE. 
silent 
a TRUE/FALSE value indicating if the function should draw the progress bar or iterations performed while working or should not be displayed. 
iters 
a scalar value indicating how many iterations have to be performed if the EM is performed. There is no rule of tumb for the number of iterations. The default value is 100 iterations or EM steps. 
constraint 
a TRUE/FALSE value indicating if the program should use the boundary constraint when one or more variance component is close to the zero boundary. The default is TRUE but needs to be used carefully. It works ideally when few variance components are close to the boundary but when there are too many variance components close to zero we highly recommend setting this parameter to FALSE since is more likely to get the right value of the variance components in this way. 
init 
vector of initial values for the variance components. By default this is NULL and variance components are estimated by the method selected, but in case the user want to provide initial values this argument is functional. 
sherman 
a TRUE/FALSE value indicating if ShermanMorrisonWoodbury formula (Seber, 2003, p. 467) should be used when estimating variance components in order to perform faster when a mixed model with no covariance structure using the average information algorithm is fitted. The default is FALSE since this software was designed for unreplicated data (altough can fit models with replicated data but slower than lme4). 
che 
a TRUE/FALSE value indicating if list structure provided by the user is correct to fix it. The default is TRUE but is turned off to FALSE within the mmer function which would imply a double check. 
MTG2 
a TRUE/FALSE value indicating if an eigen decomposition for the additive relationship matrix should be performed or not. This is based on Lee (2015). The limitations of this methos are: 1) can only be applied to one relationship matrix 2) The system needs to be squared and no missing data is allowed (then missing data is imputed with the median). The default is FALSE to avoid the user get into trouble but experimented users can take advantage from this feature to fit big models, i.e. 5000 individuals in 555 seconds = 9 minutes in a MacBook 4GB RAM. 
Fishers 
a TRUE/FALSE value indicating if the program should calculate at the final step and return the inverse of the Fishers Information Matrix. 
gss 
a TRUE/FALSE value indicating if a genomic selection is being fitted just for using certain constraints. When is FALSE (default) the program can make some EM steps to find initial values for variance components when the starting values are to far from the real values causing the likelihood to have a strange behavior and dropping dramatically When TRUE the program does not try EM steps even when far away from the likelihood because in big markerbased models can make the process quite slow. 
forced 
a vector of numeric values for variance components including error if the user wants to force the values of the variance components. On the meantime only works for forcing all of them and not a subset of them. The default is NULL, meaning that variance components will be estimated by REML/ML. 
identity 
Logical variable, includes the identity as the final matrix of the covariance structure. Default is TRUE 
kernel 
Compute the log likelihood based on a reduced observation TY where T has this kernel. Default value of NULL assumes that the kernal matches the fixed effects model matrix X corresponding to REML. Setting kernel=0 gives the ordinary likelihood and kernel=1 gives the one dimensional subspace of constant vectors. See examples for more details. 
start 
Specify the variance components at which the
NewtonRaphson algorithm starts. Default value is

taper 
The proportion of each step to take. A vector of values from 0 to 1 of length maxcyc. Default value takes smaller steps initially. 
verbose 
Controls level of time output, takes values 0, 1 or 2, Default is 0, level 1 gives parameter estimates and value of log likelihood at each stage. 
gamVals 
When k=2, the marginal log likelihood based on the
residual configuration statistic (see Tunnicliffe Wilson(1989)), is
evaluated first at 
maxcyc 
Maximum number of cycles allowed. Default value is 50. A warning is output to the screen if this is reached before convergence. 
tol 
Convergence criteria. If the change in residual log
likelihood for one cycle is less than 
Details
This algorithm is based on Tunnicliffe (1989), it is based on REML. This handles models of the form:
.
y = Xb + Zu + e
.
b ~ N[b.hat, 0] ............zero variance because is a fixed term
u ~ N[0, K*sigma(u)] .......where: K*sigma(u) = G
e ~ N[0, I*sigma(e)] .......where: I*sigma(e) = R
y ~ N[Xb, var(Zu+e)] ......where;
var(y) = var(Zu+e) = ZGZ+R = V which is the phenotypic variance
.
The function allows the user to specify the incidence matrices with their respective variancecovariance matrix in a 2 level list structure. For example imagine a mixed model with the following design:
.
fixed = only intercept.....................b ~ N[b.hat, 0]
random = GCA1 + GCA2 + SCA.................u ~ N[0, G]
.
where G is:
.
K*sigma(gca1).....................0..........................0.........
.............0.............S*sigma(gca2).....................0......... = G
.............0....................0......................W*sigma(sca)..
.
The likelihood function optimized in this algorithm is:
.
logL = 0.5 * (log(  V  ) + log(  X'VX  ) + y'Py
.
where:   refers to the derminant of a matrix
.
Value
If all parameters are correctly indicated the program will return a list with the following information:
 $Vu
a scalar value for the variance component estimated
 $Ve
a scalar value for the error variance estimated
 $V.inv
a matrix with the inverse of the phenotypic variance V = ZGZ+R, V^1
 $u.hat
a vector with BLUPs for random effects
 $Var.u.hat
a vector with variances for BLUPs
 $PEV.u.hat
a vector with predicted error variance for BLUPs
 $beta.hat
a vector for BLUEs of fixed effects
 $Var.beta.hat
a vector with variances for BLUEs
 $X
incidence matrix for fixed effects, if not passed is assumed to only include the intercept
 $Z
incidence matrix for random effects, if not passed is assumed to be a diagonal matrix
 $K
the varcov matrix for the random effect fitted in Z
 $ll
the loglikelihood value for obtained when optimizing the likelihood function when using ML or REML
References
Tunnicliffe W. 1989. On the use of marginal likelihood in time series model estimation. JRSS 51(1):1527.
See Also
The core functions of the package mmer
and mmer2
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45  ####=========================================####
#### For CRAN time limitations most lines in the
#### examples are silenced with one '#' mark,
#### remove them and run the examples
####=========================================####
####=========================================####
#### breeding values with 3 variance components
####=========================================####
####=========================================####
## Import phenotypic data on inbred performance
## Full data
####=========================================####
data(cornHybrid)
hybrid2 < cornHybrid$hybrid # extract cross data
A < cornHybrid$K # extract the varcov K
y < hybrid2$Yield
X1 < model.matrix(~ Location, data = hybrid2);dim(X1)
Z1 < model.matrix(~ GCA1 1, data = hybrid2);dim(Z1)
Z2 < model.matrix(~ GCA2 1, data = hybrid2);dim(Z2)
Z3 < model.matrix(~ SCA 1, data = hybrid2);dim(Z3)
####=========================================####
#### Realized IBS relationships for set of parents 1
####=========================================####
K1 < A[levels(hybrid2$GCA1), levels(hybrid2$GCA1)]; dim(K1)
####=========================================####
#### Realized IBS relationships for set of parents 2
####=========================================####
K2 < A[levels(hybrid2$GCA2), levels(hybrid2$GCA2)]; dim(K2)
####=========================================####
#### Realized IBS relationships for cross
#### (as the Kronecker product of K1 and K2)
####=========================================####
S < kronecker(K1, K2) ; dim(S)
rownames(S) < colnames(S) < levels(hybrid2$SCA)
ETA < list(list(Z=Z1, K=K1), list(Z=Z2, K=K2), list(Z=Z3, K=S))
####=========================================####
#### run the next line, it was ommited for CRAN time limitations
####=========================================####
#ans < NR22(y=y, ZETA=ETA)
#ans$var.comp
