remlf90: Inference with REMLF90
In famuvie/breedR: Statistical Methods for Forest Genetic Resources Analysts

remlf90

R Documentation

Inference with REMLF90

Description

Fits a Linear Mixed Model by Restricted Maximum Likelihood

Usage

remlf90(
  fixed,
  random = NULL,
  genetic = NULL,
  spatial = NULL,
  generic = NULL,
  data,
  var.ini = NULL,
  method = c("ai", "em"),
  breedR.bin = breedR.getOption("breedR.bin"),
  progsf90.options = NULL,
  weights = NULL,
  debug = FALSE
)

Arguments

`fixed`	an object of class formula (or one that can be coerced to that class): a symbolic description of the fixed effects of the model to be fitted. The details of model specification are given under 'Details'.
`random`	if not `NULL`, an object of class formula with the unstructured random effects.
`genetic`	if not `NULL`, a list with relevant parameters for an additive genetic effect; see 'Details'.
`spatial`	if not `NULL`, a list with relevant parameters for a spatial random effect; see 'Details'.
`generic`	if not `NULL`, a named list with an `incidence` matrix and either a `covariance` or a `precision` matrix; see 'Details'.
`data`	a data frame with variables and observations
`var.ini`	if not `NULL`, a named list with one item for each random component in the model. See 'Details'.
`method`	either 'ai' or 'em' for Average-Information or Expectation-Maximization REML respectively
`breedR.bin`	character. The local directory where the package binaries are stored, or any of 'remote' or 'submit' for remote computing. See 'Details'.
`progsf90.options`	character. Passed directly to OPTIONS field in PROGSF90. See available options for REMLF90 and for AIREMLF90. Option `sol se` is passed always and cannot be removed. No checks are performed, handle with care.
`weights`	numeric. A vector of weights for the residual variance.
`debug`	logical. If `TRUE`, the input files for blupf90 programs and their output are shown, but results are not parsed.

Details

If either genetic or spatial are not NULL, the model residuals are assumed to have an additive genetic effects or a spatially structured random effect, respectively. In those cases, genetic and spatial must be lists with named relevant parameters.

The generic component implements random effects with custom incidence and covariance (or precision) matrices. There can be any number of them, stored in a named list with custom unique names that will be used to identify and label the results. Each effect in the list must have an incidence argument and either a covariance or a precision matrix with conforming and suitable dimensions. Optionally, an initial variance for the REML algorithm can be specified in a third argument var.ini.

Genetic effect

The available models for the genetic effect are add_animal and competition. add_animal stands for an additive animal model with a given pedigree. competition includes the direct additive genetic effect and also a competition additive genetic effect possibly correlated with the former.

The minimum elements in the list of the genetic component are:

model a string, either add_animal or competition
pedigree either an object of class pedigree (see build_pedigree) or a data.frame with exactly three columns identifying the individual, his father and his mother respectively
id either a vector of codes or the name of the variable with the individual identifier in the data

Optional common components are:

var.ini a positive initial value for the variance component(s). For a competition model, the same initial value is used for both effects, with a negative initial correlation between them of half this value.

Finally, for model competition there are further mandatory and optional elements:

mandatory elements
- coord a matrix, list or data.frame with two columns for the rows and columns of the observations, respectively. This element is necessary even if duplicated in a spatial component.
- competition_decay a positive number. The intensity of competition is weighted by the distance according to 1/d^\alpha. This element specifies the exponent \alpha to be used. Typically 1 or 2.
optional elements
- pec Permanent Environmental Competition effect. If present, this must be a named list with elements present which is either TRUE or FALSE and (optionally) var.ini specifying the initial variance for this effect.

The Permanent Environmental Competition (pec) effect is actually non-genetic in nature. However, it was included as an option to the (genetic) competition effect as it is usually used in conjunction with it.

Spatial effects

The available models for the spatial effect are splines and AR1. splines uses a two-dimensional tensor product of B-splines to represent the smooth spatially structured effect (Cappa and Cantet, 2007). AR1 uses a kronecker product of autoregressive models for the rows and columns (Dutkowski et al., 2002).

In both cases, the minimum necessary components in the list are

model a string, either splines or AR
coord a matrix, list or data.frame with two columns for the rows and columns respectively.

Optional common components are

var.ini a positive initial value for the variance component

Finally, optional model-dependent components are

For model splines
- n.knots a vector of two integers with the number of internal knots for the rows and columns of the spline design
For model AR
- rho a vector of two numbers strictly between -1 and 1 with the autoregressive parameters for the rows and columns respectively. Alternatively, a matrix or data.frame with two columns where every row contain a combination of autoregressive parameters to be tried.

The internal knots cover the region with observations at regular intervals (in each dimension). For the splines design, three additional knots are automatically added before the first internal knot, and other tree after the last one, in each dimension. As a result, if n.knots = c(n1, n2), then the final number of parameters of the splines model is (n1 + 2)(n2 + 2).

If n.knots is omitted, a sensible number of knots for each dimension is computed based on the number of observations in the experiment. See the internal function determine.n.knots. This is the default function that computes the default number of knots, but you can provide an alternative default function through breedR.setOption.

Due to limitations of the REML backend, we can only fit models with fixed autoregressive parameters. remlf90() will fit as many models as rows in rho, and return the results of the most likely. It will also return the list of log-likelihoods for each tried combination of autoregressive parameters, in the component rho of the reml object. This is useful for visualization, and further refinement of the search for appropriate parameters.

If any of the values in either column of rho is NA, then a default set of values for the corresponding dimension will be set. See breedR.getOption('ar.eval'), for the current defaults. You can set your own defaults with breedR.setOption. Each will be combined with every other value in the other column.

Omitting the specification of rho is equivalent to rho = c(NA, NA).

Intercept

An intercept is automatically introduced in the model provided the user doesn't explicitly prevents it by using 0 or -1 in the fixed formula (as conventional in R), and there are no other categorical covariates in fixed. The latter condition is actually a limitation of (ai)remlf90 backends, which would in any case return an estimate for each level of the categorical covariates while returning 0 for the intercept. It does not allow alternative parameterizations.

Initial variances

Initial variance components can also be specified through an additional argument var.ini. You can either use default initial values for the variance components (see ?breedR.options) or specify custom values for each and all variance components in the model. In this case, var.ini must be a named list with one element for each term in random with matching names, plus one last element named residual for the initial residual variance. Furthermore if there are genetic or spatial effects, they must as well include a numeric element var.ini with the initial variance component specification for the corresponding effect.

Inference method

AI-REML is usually faster than EM-REML, and it provides more results. Namely, standard errors of the variance components estimates, and covariances as well. On the other hand, is less robust than EM-REML and it usually gives extreme results when used with the splines spatial model (as in spatial = list(model ='splines', ...)).

Even when an effect accounts for no variance at all, EM-REML will always estimate a positive variance which will be determined by the starting value. If AI-REML does not converge but EM-REML does with the same dataset and model, re-run EM-REML with a small starting value for the effect. If the estimate does not change, it is likely that there is no variance.

Remote computing

If breedR.bin = 'remote', the REML program will be run remotely and the results will be automatically transferred back automatically. While if breedR.bin = 'submit' the job will be submitted to the server, and the job-id and other relevant information about the model will be returned instantly. The returned object can be used to retrieve the results or check the status of the job. Several jobs can be submitted in parallel. See ?remote to learn how to configure breedR for remote computing, and how to manage submitted jobs.

Value

An object of class 'remlf90' that can be further questioned by fixef, ranef, fitted, etc.

References

progsf90 wiki page: http://nce.ads.uga.edu/wiki/doku.php

E. P. Cappa and R. J. C. Cantet (2007). Bayesian estimation of a surface to account for a spatial trend using penalized splines in an individual-tree mixed model. Canadian Journal of Forest Research 37(12):2677-2688.

G. W. Dutkowski, J. Costa e Silva, A. R. Gilmour, G. A. López (2002). Spatial analysis methods for forest genetic trials. Canadian Journal of Forest Research 32(12):2201-2214.

Examples

## Linear model
n <- 1e3
dat <- transform(data.frame(x = runif(n)),
                 y = 1 + 2*x + rnorm(n, sd = sqrt(3)))
res.lm <- remlf90(fixed = y ~ x, data = dat)
summary(res.lm)

## Linear Mixed model
f3 = factor(sample(letters[1:3], n, replace = TRUE))
dat <- transform(dat,
                 f3 = f3,
                 y = y + (-1:1)[f3])
res.lmm <- remlf90(fixed  = y ~ x,
                   random = ~ f3,
                   data   = dat)

## Generic model (used to manually fit the previous model)
inc.mat <- model.matrix(~ 0 + f3, dat)
cov.mat <- diag(3)
res.lmm2 <- remlf90(fixed  = y ~ x,
                    generic = list(f3 = list(inc.mat,
                                             cov.mat)),
                    data   = dat)
all.equal(res.lmm, res.lmm2, check.attributes = FALSE)  # TRUE
               
## Animal model
ped <- build_pedigree(c('self', 'dad', 'mum'),
                      data = as.data.frame(m1))
res.am <- remlf90(fixed   = phe_X ~ sex,
                  genetic = list(model    = 'add_animal',
                                 pedigree = ped,
                                 id       = 'self'),
                  data    = as.data.frame(m1))
                  
## Not run: 
## Same model with specification of initial variances
res.am <- remlf90(fixed   = phe_X ~ sex,
                  genetic = list(model    = 'add_animal',
                                 pedigree = ped,
                                 id       = 'self',
                                 var.ini  = 1),
                  data    = as.data.frame(m1),
                  var.ini = list(resid = 1))
 
## Animal-spatial models
gen.globulus <- list(model    = 'add_animal',
                     pedigree = globulus[, 1:3],
                     id       = 'self')
res.bm <- remlf90(fixed   = phe_X ~ gg,
                  genetic = gen.globulus,
                  spatial = list(model = 'blocks', 
                                 coord = globulus[, c('x','y')],
                                 id    = 'bl'),
                  data    = globulus)
                  
res.am <- remlf90(fixed   = phe_X ~ gg,
                  genetic = gen.globulus,
                  spatial = list(model = 'AR', 
                                 coord = globulus[, c('x','y')],
                                 rho   = c(.85, .8)),
                  data    = globulus)

res.sm <- remlf90(fixed   = phe_X ~ gg,
                  genetic = gen.globulus,
                  spatial = list(model = 'splines', 
                                 coord = globulus[, c('x','y')],
                                 n.knots = c(5, 5)),
                  data    = globulus,
                  method  = 'em')   # Necessary for splines models!!!


## Competition models

# This may take some minutes...
# and need to be fitted with 'em'
res.cm <- remlf90(fixed   = phe_X ~ 1,
                 genetic = list(model = 'competition',
                                pedigree = globulus[, 1:3],
                                id = 'self',
                                coord = globulus[, c('x','y')],
                                competition_decay = 1,
                                pec = list(present = TRUE)),
                 method = 'em',
                 data = globulus)

## End(Not run)

famuvie/breedR documentation built on Aug. 6, 2024, 9:10 p.m.

famuvie/breedR index

README.md Additive Genetic Models in Mixed Populations BreedR Overview General and Specific Combining Abilities Handling pedigrees

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

famuvie/breedR
Statistical Methods for Forest Genetic Resources Analysts

remlf90: Inference with REMLF90
In famuvie/breedR: Statistical Methods for Forest Genetic Resources Analysts

Inference with REMLF90

Description

Usage

Arguments

Details

Genetic effect

Spatial effects

Intercept

Initial variances

Inference method

Remote computing

Value

References

See Also

Examples

Related to remlf90 in famuvie/breedR...

R Package Documentation

Browse R Packages

We want your feedback!

famuvie/breedR Statistical Methods for Forest Genetic Resources Analysts

remlf90: Inference with REMLF90 In famuvie/breedR: Statistical Methods for Forest Genetic Resources Analysts

Inference with REMLF90

Description

Usage

Arguments

Details

Genetic effect

Spatial effects

Intercept

Initial variances

Inference method

Remote computing

Value

References

See Also

Examples

Related to remlf90 in famuvie/breedR...

R Package Documentation

Browse R Packages

We want your feedback!

famuvie/breedR
Statistical Methods for Forest Genetic Resources Analysts

remlf90: Inference with REMLF90
In famuvie/breedR: Statistical Methods for Forest Genetic Resources Analysts