em: Expectation maximization algorithm to impute missing GxE...
In nsantantonio/Bilinear: Fits bilinear models for multi environment trial data


This function will impute missing cells in the GxE table using an expectation maximization algorithm.

em(Y, model, tol = 1e-04, maxiter = 100, k = NULL, fast = TRUE, Ytrue = NULL, plotMSE = FALSE, verbose = FALSE, ...)

`Y`	matrix containing numeric values of cell means with genotypes on rows and environments on columns
`model`	character vector of length 1. Bilinear model to be fit. Options are "AMMI", "GGE", "SREG", "EGE", "GREG". "GGE" and "SREG" are equivalent, as are "EGE" and "GREG".
`tol`	scalar. Convergence tolerance threshold, defined as the sum of the absolute differences in cell means between iterations i and i-1, scaled by the standard deviation of the values in Y.
`maxiter`	integer. Maximum number of iterations.
`k`	number of principal components (PCs) to use for imputation. Default is NULL, in which case k will be determined from the imputed data using the parametric bootstrap test.
`fast`	logical or integer. If FALSE or 0, k will be determined at each iteration (slow). If fast is non-zero, k will be estimated at each iteration <= max(2, fast), and the last value of k will be used for the remaining iterations.
`Ytrue`	Same as Y but with known, non-missing values. This allows the user to evaluate the accuracy of imputation.
`plotMSE`	logical. Should the mean square error (MSE) be plotted?
`verbose`	logical. Should details be printed?
`...`	Additional arguments.

Missing values in the table of genotypes and environments are imputed using an expectation maximization algorithm. The algorithm exits and returns the imputed matrix once a tolerance threshold or maximum number of iterations is reached. This function is generally meant to be used by bilinear when missing cells are found, but the user can also use it to determine the imputation accuracy by providing the true values to 'Ytrue'.
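The iteration described above can be illustrated with a minimal sketch. This is not the Bilinear implementation: `em_sketch` is a hypothetical name, the sketch assumes an AMMI-like model with a fixed number of PCs `k`, and it initializes missing cells with the mean of the observed values, which is only one of several reasonable starting points.

```r
# Minimal EM-style imputation sketch for an AMMI-like model.
# em_sketch is a hypothetical function, not part of the Bilinear package.
em_sketch <- function(Y, k = 1, tol = 1e-4, maxiter = 100) {
  stopifnot(is.matrix(Y), k >= 1)
  miss <- is.na(Y)
  Yimp <- Y
  # Assumed starting point: fill missing cells with the observed mean
  Yimp[miss] <- mean(Y, na.rm = TRUE)
  scale_sd <- sd(as.vector(Y), na.rm = TRUE)
  for (iter in seq_len(maxiter)) {
    # Additive part: grand mean + genotype (row) + environment (column) effects
    mu <- mean(Yimp)
    g <- rowMeans(Yimp) - mu
    e <- colMeans(Yimp) - mu
    additive <- mu + outer(g, e, "+")
    # Multiplicative part: rank-k approximation of the interaction residuals
    s <- svd(Yimp - additive, nu = k, nv = k)
    interaction <- s$u %*% diag(s$d[seq_len(k)], nrow = k) %*% t(s$v)
    fitted <- additive + interaction
    # Scaled sum of absolute changes in the imputed cells
    change <- sum(abs(fitted[miss] - Yimp[miss])) / scale_sd
    # Only the missing cells are updated; observed cells stay untouched
    Yimp[miss] <- fitted[miss]
    if (change < tol) break
  }
  Yimp
}

# Small simulated table (6 genotypes x 4 environments) with two missing cells
set.seed(1)
Ydemo <- outer(1:6, 1:4) + matrix(rnorm(24, sd = 0.1), 6, 4)
Ymiss <- Ydemo
Ymiss[c(3, 14)] <- NA
Yhat <- em_sketch(Ymiss, k = 1)
```

The loop stops at whichever comes first, the tolerance threshold or `maxiter`, mirroring the exit rule described above.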

If 'k' is set to an integer, then this number of PCs will be used for imputation. Otherwise, 'k' will be determined from the model fit using the 'test' argument provided to bilinear.

If 'fast' is set to TRUE, then the test will only be done for the first 2 iterations. If an integer is provided to 'fast', 'k' will be determined for the first 'fast' iterations.
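As a rough illustration of this schedule, the sketch below follows the `max(2, fast)` rule stated in the description of the `fast` argument. `estimate_k_at` is a hypothetical helper written for this example and is not a function in the package.

```r
# Hypothetical helper: should k be re-estimated at iteration `iter`?
# Follows the documented rule "iteration <= max(2, fast)"; not part of Bilinear.
estimate_k_at <- function(iter, fast = TRUE) {
  fast <- as.integer(fast)
  if (fast == 0L) return(TRUE)  # FALSE or 0: test at every iteration
  iter <= max(2L, fast)         # otherwise only during the early iterations
}

sched_true  <- sapply(1:5, estimate_k_at, fast = TRUE)   # first 2 iterations
sched_false <- sapply(1:5, estimate_k_at, fast = FALSE)  # every iteration
sched_four  <- sapply(1:5, estimate_k_at, fast = 4)      # first 4 iterations
```

After the last iteration at which the test is run, the most recent value of 'k' would simply be carried forward.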

If a complete matrix of true values is provided, the algorithm will calculate the mean square error (MSE). Additionally, if 'plotMSE' is set to TRUE, the MSE of each iteration will be plotted as the algorithm proceeds.
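A sketch of how such an accuracy measure could be computed is given below. `imputation_mse` is a hypothetical helper, and the choice to average only over the cells that were missing in 'Y' is an assumption; the documentation does not state which cells enter the MSE.

```r
# Hypothetical helper, not part of Bilinear.
# Assumption: the MSE is taken over the originally missing cells only.
imputation_mse <- function(Yimp, Ytrue, Y) {
  miss <- is.na(Y)
  mean((Yimp[miss] - Ytrue[miss])^2)
}

# Toy 2 x 3 example with two missing cells
Ytrue_demo <- matrix(as.numeric(1:6), nrow = 2)
Y_demo <- Ytrue_demo
Y_demo[c(1, 6)] <- NA
Yimp_demo <- Ytrue_demo
Yimp_demo[1] <- 2   # imputed 2 where the true value is 1 (squared error 1)
Yimp_demo[6] <- 6   # imputed 6 where the true value is 6 (squared error 0)
mse_demo <- imputation_mse(Yimp_demo, Ytrue_demo, Y_demo)
```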

If 'verbose' is true, details will be printed to stdout.

Matrix with missing cells replaced by imputed values.

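library(Bilinear)
data(soyMeanMat)

set.seed(1)  # make the randomly chosen missing cells reproducible
nMiss <- 10
Ytrue <- soyMeanMat
Y <- soyMeanMat
Y[sample(length(Y), nMiss)] <- NA  # set nMiss random cells to missing

# Fixed number of PCs for imputation
em(Y, model = "AMMI", tol = 1e-5, k = 1, maxiter = 20, Ytrue = Ytrue, plotMSE = TRUE)
em(Y, model = "AMMI", tol = 1e-5, k = 2, maxiter = 20, Ytrue = Ytrue, plotMSE = TRUE)
# k determined at every iteration (slow)
em(Y, model = "AMMI", tol = 1e-5, fast = FALSE, maxiter = 20, Ytrue = Ytrue, plotMSE = TRUE)
# k determined during the first 2 iterations only
em(Y, model = "AMMI", tol = 1e-5, fast = 2, maxiter = 20, Ytrue = Ytrue, plotMSE = TRUE)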