View source: R/remlf90-class.R
remlf90 | R Documentation |
Fits a Linear Mixed Model by Restricted Maximum Likelihood
remlf90(
fixed,
random = NULL,
genetic = NULL,
spatial = NULL,
generic = NULL,
data,
var.ini = NULL,
method = c("ai", "em"),
breedR.bin = breedR.getOption("breedR.bin"),
progsf90.options = NULL,
weights = NULL,
debug = FALSE
)
fixed |
an object of class formula (or one that can be coerced to that class): a symbolic description of the fixed effects of the model to be fitted. The details of model specification are given under 'Details'. |
random |
if not |
genetic |
if not |
spatial |
if not |
generic |
if not |
data |
a data frame with variables and observations |
var.ini |
if not |
method |
either 'ai' or 'em' for Average-Information or Expectation-Maximization REML respectively |
breedR.bin |
character. The local directory where the package binaries are stored, or any of 'remote' or 'submit' for remote computing. See 'Details'. |
progsf90.options |
character. Passed directly to OPTIONS field in
PROGSF90. See available options for
REMLF90
and for
AIREMLF90.
Option |
weights |
numeric. A vector of weights for the residual variance. |
debug |
logical. If |
If either genetic
or spatial
are not NULL
, the
model residuals are assumed to have an additive genetic effects or a
spatially structured random effect, respectively. In those cases,
genetic
and spatial
must be lists with named relevant
parameters.
The generic
component implements random effects with custom
incidence and covariance (or precision) matrices. There can be any number
of them, stored in a named list with custom unique names that will be used
to identify and label the results. Each effect in the list must have an
incidence
argument and either a covariance
or a
precision
matrix with conforming and suitable dimensions.
Optionally, an initial variance for the REML algorithm can be specified in
a third argument var.ini
.
The available models for the genetic effect
are add_animal
and competition
. add_animal
stands for
an additive animal model with a given pedigree. competition
includes
the direct additive genetic effect and also a competition
additive genetic effect possibly correlated with the former.
The minimum elements in the list of the genetic
component are:
model
a string, either add_animal
or
competition
pedigree
either an object of class
pedigree
(see build_pedigree
) or a data.frame
with exactly three columns identifying the individual, his father and his
mother respectively
id
either a vector of codes or the name of
the variable with the individual identifier in the data
Optional common components are:
var.ini
a positive initial value for the variance
component(s). For a competition
model, the same initial value is
used for both effects, with a negative initial correlation between them of
half this value.
Finally, for model competition
there are further mandatory and
optional elements:
mandatory elements
coord
a matrix,
list or data.frame with two columns for the rows and columns of the
observations, respectively. This element is necessary even if duplicated in
a spatial
component.
competition_decay
a positive
number. The intensity of competition is weighted by the distance according
to 1/d^\alpha
. This element specifies the exponent \alpha
to be
used. Typically 1
or 2
.
optional elements
pec
Permanent Environmental
Competition effect. If present, this must be a named list with elements
present
which is either TRUE
or FALSE
and (optionally)
var.ini
specifying the initial variance for this effect.
The Permanent Environmental Competition (pec
) effect is actually
non-genetic in nature. However, it was included as an option to the
(genetic) competition effect as it is usually used in conjunction with it.
The available models for the spatial effect
are splines
and AR1
. splines
uses a two-dimensional
tensor product of B-splines to represent the smooth spatially structured
effect (Cappa and Cantet, 2007). AR1
uses a kronecker product of
autoregressive models for the rows and columns (Dutkowski et al., 2002).
In both cases, the minimum necessary components in the list are
model
a string, either splines
or AR
coord
a matrix, list or data.frame with two columns for the rows and
columns respectively.
Optional common components are
var.ini
a positive
initial value for the variance component
Finally, optional model-dependent components are
For model
splines
n.knots
a vector of two integers with
the number of internal knots for the rows and columns of the spline
design
For model AR
rho
a vector of
two numbers strictly between -1 and 1 with the autoregressive parameters
for the rows and columns respectively. Alternatively, a matrix or
data.frame with two columns where every row contain a combination of
autoregressive parameters to be tried.
The internal knots cover the region with observations at regular
intervals (in each dimension). For the splines design, three additional
knots are automatically added before the first internal knot, and other
tree after the last one, in each dimension. As a result, if n.knots =
c(n1, n2)
, then the final number of parameters of the splines model is
(n1 + 2)(n2 + 2)
.
If n.knots
is omitted, a sensible number of knots for each dimension
is computed based on the number of observations in the experiment. See the
internal function determine.n.knots
. This is the default
function that computes the default number of knots, but you can provide an
alternative default function through breedR.setOption
.
Due to limitations of the REML backend, we can only fit models with
fixed autoregressive parameters. remlf90()
will fit as many
models as rows in rho
, and return the results of the most likely. It
will also return the list of log-likelihoods for each tried combination of
autoregressive parameters, in the component rho
of the reml
object. This is useful for visualization, and further refinement of the
search for appropriate parameters.
If any of the values in either column of rho
is NA
, then a
default set of values for the corresponding dimension will be set. See
breedR.getOption('ar.eval')
, for the current defaults. You can set
your own defaults with breedR.setOption
. Each will be
combined with every other value in the other column.
Omitting the specification of rho
is equivalent to rho = c(NA,
NA)
.
An intercept is automatically introduced in the
model provided the user doesn't explicitly prevents it by using 0
or
-1
in the fixed
formula (as conventional in R
),
and there are no other categorical covariates in fixed
. The
latter condition is actually a limitation of (ai)remlf90 backends, which
would in any case return an estimate for each level of the categorical
covariates while returning 0 for the intercept. It does not allow
alternative parameterizations.
Initial variance components can also be
specified through an additional argument var.ini
. You can either use
default initial values for the variance components (see
?breedR.options
) or specify custom values for each and
all variance components in the model. In this case, var.ini
must be a named list with one element for each term in random
with
matching names, plus one last element named residual
for the initial
residual variance. Furthermore if there are genetic
or
spatial
effects, they must as well include a numeric element
var.ini
with the initial variance component specification for the
corresponding effect.
AI-REML is usually faster than EM-REML, and
it provides more results. Namely, standard errors of the variance
components estimates, and covariances as well. On the other hand, is less
robust than EM-REML and it usually gives extreme results when used with the
splines spatial model (as in spatial = list(model ='splines', ...)
).
Even when an effect accounts for no variance at all, EM-REML will always estimate a positive variance which will be determined by the starting value. If AI-REML does not converge but EM-REML does with the same dataset and model, re-run EM-REML with a small starting value for the effect. If the estimate does not change, it is likely that there is no variance.
If breedR.bin = 'remote'
, the REML
program will be run remotely and the results will be automatically
transferred back automatically. While if breedR.bin = 'submit'
the
job will be submitted to the server, and the job-id and other relevant
information about the model will be returned instantly. The returned object
can be used to retrieve the results or check the status of the job. Several
jobs can be submitted in parallel. See ?remote
to learn how to
configure breedR for remote computing, and how to manage submitted jobs.
An object of class 'remlf90' that can be further questioned by
fixef
, ranef
, fitted
, etc.
progsf90 wiki page: http://nce.ads.uga.edu/wiki/doku.php
E. P. Cappa and R. J. C. Cantet (2007). Bayesian estimation of a surface to account for a spatial trend using penalized splines in an individual-tree mixed model. Canadian Journal of Forest Research 37(12):2677-2688.
G. W. Dutkowski, J. Costa e Silva, A. R. Gilmour, G. A. López (2002). Spatial analysis methods for forest genetic trials. Canadian Journal of Forest Research 32(12):2201-2214.
pedigree
## Linear model
n <- 1e3
dat <- transform(data.frame(x = runif(n)),
y = 1 + 2*x + rnorm(n, sd = sqrt(3)))
res.lm <- remlf90(fixed = y ~ x, data = dat)
summary(res.lm)
## Linear Mixed model
f3 = factor(sample(letters[1:3], n, replace = TRUE))
dat <- transform(dat,
f3 = f3,
y = y + (-1:1)[f3])
res.lmm <- remlf90(fixed = y ~ x,
random = ~ f3,
data = dat)
## Generic model (used to manually fit the previous model)
inc.mat <- model.matrix(~ 0 + f3, dat)
cov.mat <- diag(3)
res.lmm2 <- remlf90(fixed = y ~ x,
generic = list(f3 = list(inc.mat,
cov.mat)),
data = dat)
all.equal(res.lmm, res.lmm2, check.attributes = FALSE) # TRUE
## Animal model
ped <- build_pedigree(c('self', 'dad', 'mum'),
data = as.data.frame(m1))
res.am <- remlf90(fixed = phe_X ~ sex,
genetic = list(model = 'add_animal',
pedigree = ped,
id = 'self'),
data = as.data.frame(m1))
## Not run:
## Same model with specification of initial variances
res.am <- remlf90(fixed = phe_X ~ sex,
genetic = list(model = 'add_animal',
pedigree = ped,
id = 'self',
var.ini = 1),
data = as.data.frame(m1),
var.ini = list(resid = 1))
## Animal-spatial models
gen.globulus <- list(model = 'add_animal',
pedigree = globulus[, 1:3],
id = 'self')
res.bm <- remlf90(fixed = phe_X ~ gg,
genetic = gen.globulus,
spatial = list(model = 'blocks',
coord = globulus[, c('x','y')],
id = 'bl'),
data = globulus)
res.am <- remlf90(fixed = phe_X ~ gg,
genetic = gen.globulus,
spatial = list(model = 'AR',
coord = globulus[, c('x','y')],
rho = c(.85, .8)),
data = globulus)
res.sm <- remlf90(fixed = phe_X ~ gg,
genetic = gen.globulus,
spatial = list(model = 'splines',
coord = globulus[, c('x','y')],
n.knots = c(5, 5)),
data = globulus,
method = 'em') # Necessary for splines models!!!
## Competition models
# This may take some minutes...
# and need to be fitted with 'em'
res.cm <- remlf90(fixed = phe_X ~ 1,
genetic = list(model = 'competition',
pedigree = globulus[, 1:3],
id = 'self',
coord = globulus[, c('x','y')],
competition_decay = 1,
pec = list(present = TRUE)),
method = 'em',
data = globulus)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.