remlf90: Inference with REMLF90

Description Usage Arguments Details Value References See Also Examples

View source: R/remlf90-class.R

Description

Fits a Linear Mixed Model by Restricted Maximum Likelihood

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
remlf90(
  fixed,
  random = NULL,
  genetic = NULL,
  spatial = NULL,
  generic = NULL,
  data,
  var.ini = NULL,
  method = c("ai", "em"),
  breedR.bin = breedR.getOption("breedR.bin"),
  progsf90.options = NULL,
  weights = NULL,
  debug = FALSE
)

Arguments

fixed

an object of class formula (or one that can be coerced to that class): a symbolic description of the fixed effects of the model to be fitted. The details of model specification are given under 'Details'.

random

if not NULL, an object of class formula with the unstructured random effects.

genetic

if not NULL, a list with relevant parameters for an additive genetic effect; see 'Details'.

spatial

if not NULL, a list with relevant parameters for a spatial random effect; see 'Details'.

generic

if not NULL, a named list with an incidence matrix and either a covariance or a precision matrix; see 'Details'.

data

a data frame with variables and observations

var.ini

if not NULL, a named list with one item for each random component in the model. See 'Details'.

method

either 'ai' or 'em' for Average-Information or Expectation-Maximization REML respectively

breedR.bin

character. The local directory where the package binaries are stored, or any of 'remote' or 'submit' for remote computing. See 'Details'.

progsf90.options

character. Passed directly to OPTIONS field in PROGSF90. See available options for REMLF90 and for AIREMLF90. Option sol se is passed always and cannot be removed. No checks are performed, handle with care.

weights

numeric. A vector of weights for the residual variance.

debug

logical. If TRUE, the input files for blupf90 programs and their output are shown, but results are not parsed.

Details

If either genetic or spatial are not NULL, the model residuals are assumed to have an additive genetic effects or a spatially structured random effect, respectively. In those cases, genetic and spatial must be lists with named relevant parameters.

The generic component implements random effects with custom incidence and covariance (or precision) matrices. There can be any number of them, stored in a named list with custom unique names that will be used to identify and label the results. Each effect in the list must have an incidence argument and either a covariance or a precision matrix with conforming and suitable dimensions. Optionally, an initial variance for the REML algorithm can be specified in a third argument var.ini.

Genetic effect

The available models for the genetic effect are add_animal and competition. add_animal stands for an additive animal model with a given pedigree. competition includes the direct additive genetic effect and also a competition additive genetic effect possibly correlated with the former.

The minimum elements in the list of the genetic component are:

Optional common components are:

Finally, for model competition there are further mandatory and optional elements:

The Permanent Environmental Competition (pec) effect is actually non-genetic in nature. However, it was included as an option to the (genetic) competition effect as it is usually used in conjunction with it.

Spatial effects

The available models for the spatial effect are splines and AR1. splines uses a two-dimensional tensor product of B-splines to represent the smooth spatially structured effect (Cappa and Cantet, 2007). AR1 uses a kronecker product of autoregressive models for the rows and columns (Dutkowski et al., 2002).

In both cases, the minimum necessary components in the list are

Optional common components are

Finally, optional model-dependent components are

The internal knots cover the region with observations at regular intervals (in each dimension). For the splines design, three additional knots are automatically added before the first internal knot, and other tree after the last one, in each dimension. As a result, if n.knots = c(n1, n2), then the final number of parameters of the splines model is (n1 + 2)(n2 + 2).

If n.knots is omitted, a sensible number of knots for each dimension is computed based on the number of observations in the experiment. See the internal function determine.n.knots. This is the default function that computes the default number of knots, but you can provide an alternative default function through breedR.setOption.

Due to limitations of the REML backend, we can only fit models with fixed autoregressive parameters. remlf90() will fit as many models as rows in rho, and return the results of the most likely. It will also return the list of log-likelihoods for each tried combination of autoregressive parameters, in the component rho of the reml object. This is useful for visualization, and further refinement of the search for appropriate parameters.

If any of the values in either column of rho is NA, then a default set of values for the corresponding dimension will be set. See breedR.getOption('ar.eval'), for the current defaults. You can set your own defaults with breedR.setOption. Each will be combined with every other value in the other column.

Omitting the specification of rho is equivalent to rho = c(NA, NA).

Intercept

An intercept is automatically introduced in the model provided the user doesn't explicitly prevents it by using 0 or -1 in the fixed formula (as conventional in R), and there are no other categorical covariates in fixed. The latter condition is actually a limitation of (ai)remlf90 backends, which would in any case return an estimate for each level of the categorical covariates while returning 0 for the intercept. It does not allow alternative parameterizations.

Initial variances

Initial variance components can also be specified through an additional argument var.ini. You can either use default initial values for the variance components (see ?breedR.options) or specify custom values for each and all variance components in the model. In this case, var.ini must be a named list with one element for each term in random with matching names, plus one last element named residual for the initial residual variance. Furthermore if there are genetic or spatial effects, they must as well include a numeric element var.ini with the initial variance component specification for the corresponding effect.

Inference method

AI-REML is usually faster than EM-REML, and it provides more results. Namely, standard errors of the variance components estimates, and covariances as well. On the other hand, is less robust than EM-REML and it usually gives extreme results when used with the splines spatial model (as in spatial = list(model ='splines', ...)).

Even when an effect accounts for no variance at all, EM-REML will always estimate a positive variance which will be determined by the starting value. If AI-REML does not converge but EM-REML does with the same dataset and model, re-run EM-REML with a small starting value for the effect. If the estimate does not change, it is likely that there is no variance.

Remote computing

If breedR.bin = 'remote', the REML program will be run remotely and the results will be automatically transferred back automatically. While if breedR.bin = 'submit' the job will be submitted to the server, and the job-id and other relevant information about the model will be returned instantly. The returned object can be used to retrieve the results or check the status of the job. Several jobs can be submitted in parallel. See ?remote to learn how to configure breedR for remote computing, and how to manage submitted jobs.

Value

An object of class 'remlf90' that can be further questioned by fixef, ranef, fitted, etc.

References

progsf90 wiki page: http://nce.ads.uga.edu/wiki/doku.php

E. P. Cappa and R. J. C. Cantet (2007). Bayesian estimation of a surface to account for a spatial trend using penalized splines in an individual-tree mixed model. Canadian Journal of Forest Research 37(12):2677-2688.

G. W. Dutkowski, J. Costa e Silva, A. R. Gilmour, G. A. López (2002). Spatial analysis methods for forest genetic trials. Canadian Journal of Forest Research 32(12):2201-2214.

See Also

pedigree

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
## Linear model
n <- 1e3
dat <- transform(data.frame(x = runif(n)),
                 y = 1 + 2*x + rnorm(n, sd = sqrt(3)))
res.lm <- remlf90(fixed = y ~ x, data = dat)
summary(res.lm)

## Linear Mixed model
f3 = factor(sample(letters[1:3], n, replace = TRUE))
dat <- transform(dat,
                 f3 = f3,
                 y = y + (-1:1)[f3])
res.lmm <- remlf90(fixed  = y ~ x,
                   random = ~ f3,
                   data   = dat)

## Generic model (used to manually fit the previous model)
inc.mat <- model.matrix(~ 0 + f3, dat)
cov.mat <- diag(3)
res.lmm2 <- remlf90(fixed  = y ~ x,
                    generic = list(f3 = list(inc.mat,
                                             cov.mat)),
                    data   = dat)
all.equal(res.lmm, res.lmm2, check.attributes = FALSE)  # TRUE
               
## Animal model
ped <- build_pedigree(c('self', 'dad', 'mum'),
                      data = as.data.frame(m1))
res.am <- remlf90(fixed   = phe_X ~ sex,
                  genetic = list(model    = 'add_animal',
                                 pedigree = ped,
                                 id       = 'self'),
                  data    = as.data.frame(m1))
                  
## Not run: 
## Same model with specification of initial variances
res.am <- remlf90(fixed   = phe_X ~ sex,
                  genetic = list(model    = 'add_animal',
                                 pedigree = ped,
                                 id       = 'self',
                                 var.ini  = 1),
                  data    = as.data.frame(m1),
                  var.ini = list(resid = 1))
 
## Animal-spatial models
gen.globulus <- list(model    = 'add_animal',
                     pedigree = globulus[, 1:3],
                     id       = 'self')
res.bm <- remlf90(fixed   = phe_X ~ gg,
                  genetic = gen.globulus,
                  spatial = list(model = 'blocks', 
                                 coord = globulus[, c('x','y')],
                                 id    = 'bl'),
                  data    = globulus)
                  
res.am <- remlf90(fixed   = phe_X ~ gg,
                  genetic = gen.globulus,
                  spatial = list(model = 'AR', 
                                 coord = globulus[, c('x','y')],
                                 rho   = c(.85, .8)),
                  data    = globulus)

res.sm <- remlf90(fixed   = phe_X ~ gg,
                  genetic = gen.globulus,
                  spatial = list(model = 'splines', 
                                 coord = globulus[, c('x','y')],
                                 n.knots = c(5, 5)),
                  data    = globulus,
                  method  = 'em')   # Necessary for splines models!!!


## Competition models

# This may take some minutes...
# and need to be fitted with 'em'
res.cm <- remlf90(fixed   = phe_X ~ 1,
                 genetic = list(model = 'competition',
                                pedigree = globulus[, 1:3],
                                id = 'self',
                                coord = globulus[, c('x','y')],
                                competition_decay = 1,
                                pec = list(present = TRUE)),
                 method = 'em',
                 data = globulus)

## End(Not run)

famuvie/breedR documentation built on Sept. 6, 2021, 4:50 a.m.