qrrvglm.control

View source: R/qrrvglm.control.q

Description

Algorithmic constants and parameters for a constrained quadratic
ordination (CQO), by fitting a quadratic reduced-rank vector
generalized linear model (QRR-VGLM), are set using this function.
It is the control function for cqo().

Usage

qrrvglm.control(Rank = 1, Bestof = if (length(Cinit)) 1 else 10,
checkwz = TRUE, Cinit = NULL, Crow1positive = TRUE,
epsilon = 1.0e-06, EqualTolerances = NULL, eq.tolerances = TRUE,
Etamat.colmax = 10, FastAlgorithm = TRUE, GradientFunction = TRUE,
Hstep = 0.001, isd.latvar = rep_len(c(2, 1, rep_len(0.5, Rank)),
Rank), iKvector = 0.1, iShape = 0.1, ITolerances = NULL,
I.tolerances = FALSE, maxitl = 40, imethod = 1,
Maxit.optim = 250, MUXfactor = rep_len(7, Rank),
noRRR = ~ 1, Norrr = NA, optim.maxit = 20,
Parscale = if (I.tolerances) 0.001 else 1.0,
sd.Cinit = 0.02, SmallNo = 5.0e-13, trace = TRUE,
Use.Init.Poisson.QO = TRUE,
wzepsilon = .Machine$double.eps^0.75, ...)

Arguments

In the following, R is the Rank, M is the number of linear predictors,
and S is the number of responses (species). Thus M = S for binomial and
Poisson responses, and M = 2S for the negative binomial and 2-parameter
gamma distributions.

Rank
    The numerical rank R of the model.

Bestof
    Integer. The best of Bestof fitted models is returned; fitting
    several models increases the chance of obtaining the global
    solution (see the Warning below).

checkwz
    Logical indicating whether the diagonal elements of the working
    weight matrices should be checked to see whether they are
    sufficiently positive, i.e., greater than wzepsilon.

Cinit
    Optional initial value for the matrix C.

Crow1positive
    Logical vector of length Rank: are the elements of the first row
    of C constrained to be positive?

epsilon
    Positive numeric. Used to test for convergence for GLMs fitted in
    C. Larger values mean a loosening of the convergence criterion.
    If an error code of 3 is reported, try increasing this value.

eq.tolerances
    Logical indicating whether each (quadratic) predictor will have
    equal tolerances. Setting eq.tolerances = TRUE fits a common
    tolerance matrix to all species but, unlike I.tolerances = TRUE,
    does not force bell-shaped curves onto the data (the common
    estimated tolerance matrix need not be positive-definite); see
    Details.

EqualTolerances
    Defunct argument. Use eq.tolerances instead.

Etamat.colmax
    Positive integer, no smaller than Rank.

FastAlgorithm
    Logical. Whether a new fast algorithm is to be used. The fast
    algorithm results in a large speed increase compared to Yee
    (2004). Some details of the fast algorithm are found in
    Appendix A of Yee (2006).

GradientFunction
    Logical. Whether optim()'s gr argument (gradient information) is
    used by the optimizer.

Hstep
    Positive value. Used as the step size in the finite difference
    approximation to the derivatives used by optim.

isd.latvar
    Initial standard deviations for the latent variables (site
    scores). Numeric, positive and of length Rank (recycled if
    necessary).

iKvector, iShape
    Numeric, recycled to length S if necessary; used as initial
    values for the k parameter of negbinomial and the shape parameter
    of gamma2, respectively.

I.tolerances
    Logical. If TRUE then all the tolerance matrices are constrained
    to be the order-R identity matrix, which forces bell-shaped
    curves/surfaces onto all species; see Details.

ITolerances
    Defunct argument. Use I.tolerances instead.

maxitl
    Maximum number of times the optimizer is called or restarted.
    Most users should ignore this argument.

imethod
    Method of initialization. A positive integer 1 or 2 or 3 etc.,
    depending on the VGAM family function.

Maxit.optim
    Positive integer. Number of iterations given to the function
    optim at each of the optim.maxit iterations.

MUXfactor
    Multiplication factor for detecting large offset values.
    Numeric, positive and of length Rank (recycled if necessary).

optim.maxit
    Positive integer. Number of times optim is invoked.

noRRR
    Formula giving terms that are not to be included in the
    reduced-rank regression (or formation of the latent variables),
    i.e., those terms that belong to x_1.

Norrr
    Defunct. Please use noRRR instead.

Parscale
    Numerical and positive-valued vector used to scale the elements
    of C during optimization; see Details.

sd.Cinit
    Standard deviation of the initial values for the elements of C.

trace
    Logical indicating if output should be produced for each
    iteration. The default is TRUE.

SmallNo
    Positive numeric between .Machine$double.eps and 0.0001.

Use.Init.Poisson.QO
    Logical. If TRUE then the .Init.Poisson.QO() algorithm is used to
    obtain initial values for C.

wzepsilon
    Small positive number used to test whether the diagonals of the
    working weight matrices are sufficiently positive.

...
    Ignored at present.

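As a quick illustration of how these arguments are used, the control
list can be generated directly to inspect its defaults, or (more
commonly) individual arguments can be passed to cqo(), which forwards
them to qrrvglm.control(). A minimal sketch; the commented-out call
uses placeholder response and data names:

library("VGAM")

## Inspect the defaults the control function would supply to cqo():
ctrl <- qrrvglm.control(Rank = 1)
ctrl$Bestof       # 10, because no Cinit was supplied
ctrl$isd.latvar   # default initial sd of the latent variable

## In practice the arguments are usually given to cqo() directly, e.g.,
## cqo(cbind(y1, y2, y3) ~ x1 + x2, poissonff, data = mydata,
##     Bestof = 20, isd.latvar = 1.5, trace = FALSE)
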
Details

Recall that the central formula for CQO is

    \eta = B_1^T x_1 + A \nu + \sum_{m=1}^{M} (\nu^T D_m \nu) e_m

where x_1 is a vector (usually just a 1 for an intercept), x_2 is a
vector of environmental variables, \nu = C^T x_2 is an R-vector of
latent variables, and e_m is a vector of 0s but with a 1 in the m-th
position.

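To connect this notation with a fitted model, the quantities above can
be extracted with VGAM's accessor functions; a sketch, assuming fit is
an object already returned by cqo():

## Sketch: recovering the pieces of the formula from a fitted object 'fit'.
Cmat <- concoef(fit)   # canonical coefficients C (one column per latent variable)
nu   <- latvar(fit)    # site scores nu = C^T x_2, one row per site
Amat <- Coef(fit)@A    # the matrix A above
Tols <- Tol(fit)       # tolerance matrices, derived from the D_m
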
QRR-VGLMs are an extension of RR-VGLMs and allow for maximum
likelihood solutions to constrained quadratic ordination (CQO) models.

Having I.tolerances = TRUE means all the tolerance matrices are the
order-R identity matrix, i.e., it forces bell-shaped curves/surfaces
on all species. This results in a more difficult optimization problem
(especially for 2-parameter models such as the negative binomial and
gamma) because of overflow errors, and it appears there are more local
solutions. To help avoid the overflow errors, scaling C by the factor
Parscale can help enormously. Even better, scaling C by specifying
isd.latvar is more understandable to humans. If failure to converge
occurs, try adjusting Parscale, or better, setting
eq.tolerances = TRUE (and hope that the estimated tolerance matrix is
positive-definite). To fit an equal-tolerances model, it is best to
first try setting I.tolerances = TRUE and varying isd.latvar and/or
MUXfactor if it fails to converge. If it still fails to converge after
many attempts, try setting eq.tolerances = TRUE; however, this will
usually be a lot slower because it requires a lot more memory.

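A sketch of this strategy in code; the species, environmental
variables, and tuning values below are placeholders, not
recommendations:

## First try an I.tolerances = TRUE fit, tuning isd.latvar/MUXfactor if needed.
fit.it <- cqo(cbind(spp1, spp2, spp3) ~ env1 + env2, poissonff,
              data = mydata, I.tolerances = TRUE,
              isd.latvar = 1.0, MUXfactor = 4)

## If that repeatedly fails to converge, fall back to the slower,
## more memory-hungry equal-tolerances fit:
fit.eq <- cqo(cbind(spp1, spp2, spp3) ~ env1 + env2, poissonff,
              data = mydata, eq.tolerances = TRUE, I.tolerances = FALSE)
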
With an R > 1 model, the latent variables are always uncorrelated,
i.e., the variance-covariance matrix of the site scores is a diagonal
matrix.

If eq.tolerances = TRUE is used and the common estimated tolerance
matrix is positive-definite, then that model is effectively the same
as the I.tolerances = TRUE model (the two are transformations of each
other). In general, I.tolerances = TRUE is numerically more unstable
and presents a more difficult problem to optimize; the arguments
isd.latvar and/or MUXfactor often must be assigned some good value(s)
(possibly found by trial and error) in order for convergence to occur.
Setting I.tolerances = TRUE forces a bell-shaped curve or surface onto
all the species data, therefore this option should be used with
deliberation. If unsuitable, the resulting fit may be very misleading.
Usually it is a good idea for the user to set eq.tolerances = FALSE
to see which species appear to have a bell-shaped curve or surface.
Improvements to the fit can often be achieved using transformations,
e.g., nitrogen concentration to log nitrogen concentration.

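One way to inspect this is sketched below: fit with
eq.tolerances = FALSE and check whether each species' estimated
tolerance matrix is positive-definite (bell-shaped) or not. It is
assumed here that Tol() returns an array with one order-R matrix per
species, and fit.uneq is a placeholder for such a fit:

## Which species appear bell-shaped? (positive-definite tolerance matrix)
tols <- Tol(fit.uneq)   # assumed to be an R x R x S array
apply(tols, 3, function(Tm) all(eigen(Tm, symmetric = TRUE)$values > 0))
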
Fitting a CAO model (see cao) first is a good idea for pre-examining
the data and checking whether it is appropriate to fit a CQO model.

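A minimal sketch of such a preliminary CAO fit; the names and the
nonlinear degrees of freedom are placeholders/illustrative:

## Exploratory rank-1 CAO fit before committing to a CQO model.
ao1 <- cao(cbind(spp1, spp2, spp3) ~ env1 + env2, poissonff,
           data = mydata, Rank = 1, df1.nl = 2)
plot(ao1)   # per-species fitted curves against the latent variable
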
Value

A list with components matching the input names.

Warning

The default value of Bestof is a bare minimum for many datasets;
therefore it will be necessary to increase its value to increase the
chances of obtaining the global solution.

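For example, a sketch with placeholder names (the value 20 is
arbitrary):

## More random starts via Bestof improve the chance that the best of the
## fitted models is the global solution (at the cost of extra run time).
fit <- cqo(cbind(spp1, spp2, spp3) ~ env1 + env2, poissonff,
           data = mydata, Bestof = 20, trace = FALSE)
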
Note

When I.tolerances = TRUE it is a good idea to apply scale to all the
numerical variables that make up the latent variable, i.e., those of
x_2. This is to make them have mean 0, and hence avoid large offset
values which cause numerical problems.

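A one-line sketch of that preprocessing step (the column index is a
placeholder for the x_2 variables):

## Centre (and scale) the environmental variables to avoid large offsets.
mydata[, 1:6] <- scale(mydata[, 1:6])
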
This function has many arguments in common with rrvglm.control and
vglm.control.

It is usually a good idea to try fitting a model with
I.tolerances = TRUE first, and if convergence is unsuccessful, then
try eq.tolerances = TRUE and I.tolerances = FALSE.

Ordination diagrams with eq.tolerances = TRUE have a natural
interpretation, but with eq.tolerances = FALSE they are more
complicated and require, e.g., contours to be overlaid on the
ordination diagram (see lvplot.qrrvglm).

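A sketch of the basic plotting calls; fit1 and fit2 are placeholders
for rank-1 and rank-2 cqo() fits respectively:

## Ordination graphics for fitted CQO objects.
lvplot(fit2)   # rank-2 ordination diagram of the site scores
persp(fit1)    # rank-1 fitted response curves along the latent variable
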
In the example below, an equal-tolerances CQO model is fitted to the
hunting spiders data. Because I.tolerances = TRUE, it is a good idea
to center all the x_2 variables first. Upon fitting the model, the
actual standard deviation of the site scores is computed. Ideally,
the isd.latvar argument should have had this value for the best
chance of getting good initial values. For comparison, the model is
refitted with that value, and it should run faster and more reliably.

Author(s)

Thomas W. Yee

References

Yee, T. W. (2004). A new technique for maximum-likelihood canonical
Gaussian ordination. Ecological Monographs, 74, 685–701.

Yee, T. W. (2006). Constrained additive ordination. Ecology, 87,
203–213.

See Also

cqo, rcqo, Coef.qrrvglm, Coef.qrrvglm-class, optim, binomialff,
poissonff, negbinomial, gamma2.

Examples

## Not run: # Poisson CQO with equal tolerances
set.seed(111) # This leads to the global solution
hspider[,1:6] <- scale(hspider[,1:6]) # Good when I.tolerances = TRUE
p1 <- cqo(cbind(Alopacce, Alopcune, Alopfabr,
Arctlute, Arctperi, Auloalbi,
Pardlugu, Pardmont, Pardnigr,
Pardpull, Trocterr, Zoraspin) ~
WaterCon + BareSand + FallTwig +
CoveMoss + CoveHerb + ReflLux,
poissonff, data = hspider, eq.tolerances = TRUE)
sort(deviance(p1, history = TRUE)) # Iteration history
(isd.latvar <- apply(latvar(p1), 2, sd)) # Approx isd.latvar
# Refit the model with better initial values
set.seed(111) # This leads to the global solution
p1 <- cqo(cbind(Alopacce, Alopcune, Alopfabr,
Arctlute, Arctperi, Auloalbi,
Pardlugu, Pardmont, Pardnigr,
Pardpull, Trocterr, Zoraspin) ~
WaterCon + BareSand + FallTwig +
CoveMoss + CoveHerb + ReflLux,
I.tolerances = TRUE, poissonff, data = hspider,
isd.latvar = isd.latvar) # Note this
sort(deviance(p1, history = TRUE)) # Iteration history
## End(Not run)