anlvm.fit | R Documentation |
Fits an Auxiliary Nonlinear Variance Model (ANLVM) to estimate the error variances of a heteroskedastic linear regression model.
anlvm.fit(
mainlm,
g,
M = NULL,
cluster = FALSE,
varselect = c("none", "hettest", "cv.linear", "cv.cluster", "qgcv.linear",
"qgcv.cluster"),
nclust = c("elbow.swd", "elbow.mwd", "elbow.both"),
clustering = NULL,
param.init = function(q) stats::runif(n = q, min = -5, max = 5),
maxgridrows = 20L,
nconvstop = 3L,
zerosallowed = FALSE,
maxitql = 100L,
tolql = 1e-08,
nestedql = FALSE,
reduce2homosked = TRUE,
cvoption = c("testsetols", "partitionres"),
nfolds = 5L,
...
)
mainlm |
Either an object of |
g |
A numeric-valued function of one variable, or a character denoting
the name of such a function. |
M |
An |
cluster |
A logical; should the design matrix X be replaced with an
|
varselect |
Either a character indicating how variable selection should
be conducted, or an integer vector giving indices of columns of the
predictor matrix (
|
nclust |
A character indicating which elbow method to use to select
the number of clusters (ignored if |
clustering |
A list object of class |
param.init |
Specifies the initial values of the parameter vector to
use in the Gauss-Newton fitting algorithm. This can either be a function
for generating the initial values from a probability distribution, a
list containing named objects corresponding to the arguments of
|
maxgridrows |
An integer indicating the maximum number of initial
values of the parameter vector to try, in case of |
nconvstop |
An integer indicating how many times the quasi-likelihood
estimation algorithm should converge before the grid search across
different initial parameter values is truncated. Defaults to 3L. |
zerosallowed |
A logical indicating whether 0 values are acceptable
in the initial values of the parameter vector. Defaults to FALSE. |
maxitql |
An integer specifying the maximum number of iterations to
run in the Gauss-Newton algorithm for quasi-likelihood estimation.
Defaults to 100L. |
tolql |
A double specifying the convergence criterion for the
Gauss-Newton algorithm; defaults to 1e-08. |
nestedql |
A logical indicating whether to use the nested updating step
suggested in Seber and Wild (2003). Defaults to FALSE. |
reduce2homosked |
A logical indicating whether the homoskedastic
error variance estimator |
cvoption |
A character, either |
nfolds |
An integer specifying the number of folds |
... |
Other arguments that can be passed to (non-exported) helper functions, namely:
|
The ANLVM model equation is
$$e_i^2 = \sum_{k=1}^{n} g(X_{k\cdot}'\gamma)\, m_{ik}^2 + u_i,$$
where $e_i$ is the $i$th Ordinary Least Squares residual, $X_{k\cdot}$ is a vector corresponding to the $k$th row of the $n\times p$ design matrix $X$, $m_{ik}^2$ is the square of the $(i,k)$th element of the annihilator matrix $M=I-X(X'X)^{-1}X'$, $u_i$ is a random error term, $\gamma$ is a $p$-vector of unknown parameters, and $g(\cdot)$ is a continuous, differentiable function that need not be linear in $\gamma$, but must be expressible as a function of the linear predictor $X_{k\cdot}'\gamma$.
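The right-hand side of the model equation can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the parameter vector `gamma` and the choice of `g` are arbitrary values assumed here for demonstration, and `omega` reads $g(X_{k\cdot}'\gamma)$ as the error variance of observation $k$.

```r
# Design matrix with an intercept column, built from a built-in data set
X <- cbind(1, as.matrix(mtcars[, c("wt", "qsec")]))
n <- nrow(X)

# Annihilator matrix M = I - X (X'X)^{-1} X' and its elementwise square
M <- diag(n) - X %*% solve(crossprod(X), t(X))
Msq <- M^2

# Hypothetical choices, for illustration only
g <- function(x) x^2
gamma <- c(0.5, -0.1, 0.02)

# omega_k = g(X_k' gamma): error variance implied for observation k
omega <- g(drop(X %*% gamma))

# Model-implied mean of each squared OLS residual: sum_k omega_k * m_ik^2
mean_esq <- drop(Msq %*% omega)
```

Because $M$ is symmetric and idempotent, each row of `Msq` sums to the corresponding diagonal element of `M`, so a constant `omega` reproduces the familiar homoskedastic result that the expected squared residual is proportional to $m_{ii}$.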
This method has been developed as part of the author's doctoral research
project.
The parameter vector $\gamma$ is estimated using the maximum quasi-likelihood method as described in section 2.3 of Seber and Wild (2003). The optimisation problem is solved numerically using a Gauss-Newton algorithm.
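To make the Gauss-Newton idea concrete, here is a simplified sketch that minimises the plain (unweighted) sum of squared deviations between the squared residuals and the model-implied means. It is an assumption-laden stand-in: the package's actual quasi-likelihood objective, its nested updating step, and its grid search over initial values are not reproduced, and the function name `gauss_newton_anlvm`, its arguments, and the synthetic data are all hypothetical.

```r
# Plain Gauss-Newton for min over gamma of sum_i (y_i - mu_i(gamma))^2,
# where mu(gamma) = Msq %*% g(X %*% gamma).
# Hypothetical helper for illustration; not part of the skedastic package.
gauss_newton_anlvm <- function(y, X, Msq, g, gprime, gamma0,
                               maxit = 100L, tol = 1e-8) {
  gamma <- gamma0
  converged <- FALSE
  iterations <- 0L
  for (it in seq_len(maxit)) {
    eta <- drop(X %*% gamma)          # linear predictor X_k' gamma
    mu <- drop(Msq %*% g(eta))        # model-implied mean of e_i^2
    J <- Msq %*% (gprime(eta) * X)    # Jacobian of mu with respect to gamma
    step <- qr.solve(J, y - mu)       # least-squares solution of J step = y - mu
    gamma <- gamma + step
    iterations <- it
    if (max(abs(step)) < tol) {       # stop once the update is negligible
      converged <- TRUE
      break
    }
  }
  mu <- drop(Msq %*% g(drop(X %*% gamma)))
  list(gamma = gamma, converged = converged, iterations = iterations,
       S = sum((y - mu)^2))
}

# Synthetic, noise-free check: y plays the role of the squared OLS residuals
x <- seq(-1, 1, length.out = 20)
X <- cbind(1, x)
M <- diag(nrow(X)) - X %*% solve(crossprod(X), t(X))
Msq <- M^2
gamma_true <- c(0.2, 0.5)
y <- drop(Msq %*% exp(drop(X %*% gamma_true)))

fit <- gauss_newton_anlvm(y, X, Msq, g = exp, gprime = exp,
                          gamma0 = c(0.1, 0.4))
```

With real data the iteration can fail to converge from a poor starting point, which is presumably why the function accepts several initial values via `param.init` and `maxgridrows`.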
For further discussion of feature selection and the methods for choosing the number of clusters to use with the clustering version of the model, see alvm.fit.
An object of class "anlvm.fit", containing the following:
coef.est, a vector of parameter estimates, $\hat{\gamma}$
var.est, a vector of estimates $\hat{\omega}$ of the error variances for all observations
method, either "cluster" or "functionalform", depending on whether cluster was set to TRUE
ols, the lm object corresponding to the original linear regression model
fitinfo, a list containing four named objects: g (the heteroskedastic function), Msq (the elementwise square of the annihilator matrix $M$), Z (the design matrix used in the ANLVM, after feature selection if applicable), and clustering (a list object with results of the clustering procedure, if applicable)
selectinfo, a list containing two named objects: varselect (the value of the eponymous argument) and selectedcols (a numeric vector with column indices of $X$ that were selected, with 1 denoting the intercept column)
qlinfo, a list containing nine named objects: converged (a logical, indicating whether the Gauss-Newton algorithm converged for at least one initial value of the parameter vector), iterations (the number of Gauss-Newton iterations used to obtain the parameter estimates returned), Smin (the minimum achieved value of the objective function used in the Gauss-Newton routine), and six arguments passed to the function (nested, param.init, maxgridrows, nconvstop, maxitql, and tolql)
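The components listed above can be inspected with ordinary list access. The sketch below assumes the skedastic package is installed and reuses the call from this page's own example; the guard with `requireNamespace` is added here only so the snippet does nothing when the package is unavailable.

```r
myanlvm <- NULL
if (requireNamespace("skedastic", quietly = TRUE)) {
  mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
  myanlvm <- skedastic::anlvm.fit(mtcars_lm, g = function(x) x^2,
                                  varselect = "qgcv.linear")
  print(myanlvm$coef.est)                 # parameter estimates
  print(head(myanlvm$var.est))            # first few error variance estimates
  print(myanlvm$method)                   # "cluster" or "functionalform"
  print(myanlvm$selectinfo$selectedcols)  # columns kept by variable selection
  print(myanlvm$qlinfo$converged)         # did Gauss-Newton converge?
}
```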
alvm.fit, avm.ci
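library(skedastic)
# Fit the main linear regression model by OLS
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
# Fit the ANLVM with a quadratic g and the "qgcv.linear" variable selection option
myanlvm <- anlvm.fit(mtcars_lm, g = function(x) x ^ 2,
  varselect = "qgcv.linear")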