| anlvm.fit | R Documentation |
Fits an Auxiliary Nonlinear Variance Model (ANLVM) to estimate the error variances of a heteroskedastic linear regression model.
anlvm.fit(
mainlm,
g,
M = NULL,
cluster = FALSE,
varselect = c("none", "hettest", "cv.linear", "cv.cluster", "qgcv.linear",
"qgcv.cluster"),
nclust = c("elbow.swd", "elbow.mwd", "elbow.both"),
clustering = NULL,
param.init = function(q) stats::runif(n = q, min = -5, max = 5),
maxgridrows = 20L,
nconvstop = 3L,
zerosallowed = FALSE,
maxitql = 100L,
tolql = 1e-08,
nestedql = FALSE,
reduce2homosked = TRUE,
cvoption = c("testsetols", "partitionres"),
nfolds = 5L,
...
)
mainlm |
Either an object of |
g |
A numeric-valued function of one variable, or a character denoting
the name of such a function. |
M |
An |
cluster |
A logical; should the design matrix X be replaced with an
|
varselect |
Either a character indicating how variable selection should
be conducted, or an integer vector giving indices of columns of the
predictor matrix (
|
nclust |
A character indicating which elbow method to use to select
the number of clusters (ignored if |
clustering |
A list object of class |
param.init |
Specifies the initial values of the parameter vector to
use in the Gauss-Newton fitting algorithm. This can either be a function
for generating the initial values from a probability distribution, a
list containing named objects corresponding to the arguments of
|
maxgridrows |
An integer indicating the maximum number of initial
values of the parameter vector to try, in case of |
nconvstop |
An integer indicating how many times the quasi-likelihood
estimation algorithm should converge before the grid search across
different initial parameter values is truncated. Defaults to |
zerosallowed |
A logical indicating whether 0 values are acceptable
in the initial values of the parameter vector. Defaults to |
maxitql |
An integer specifying the maximum number of iterations to
run in the Gauss-Newton algorithm for quasi-likelihood estimation.
Defaults to |
tolql |
A double specifying the convergence criterion for the
Gauss-Newton algorithm; defaults to |
nestedql |
A logical indicating whether to use the nested updating step
suggested in Seber and Wild (2003). Defaults to
|
reduce2homosked |
A logical indicating whether the homoskedastic
error variance estimator |
cvoption |
A character, either |
nfolds |
An integer specifying the number of folds |
... |
Other arguments that can be passed to (non-exported) helper functions, namely:
|
The ANLVM model equation is
e_i^2 = \sum_{k=1}^{n} g(X_{k\cdot}'\gamma) m_{ik}^2 + u_i,
where e_i is the ith Ordinary Least Squares residual,
X_{k\cdot} is a vector corresponding to the kth row of the
n\times p design matrix X, m_{ik}^2 is the
(i,k)th element of the annihilator matrix M=I-X(X'X)^{-1}X',
u_i is a random error term, \gamma is a p-vector of
unknown parameters, and g(\cdot) is a continuous, differentiable
function that need not be linear in \gamma, but must be expressible
as a function of the linear predictor X_{k\cdot}'\gamma.
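The pieces of the model equation can be assembled by hand in base R. The following is only an illustrative sketch, not the package's internal code: the trial parameter vector gamma_trial is a hypothetical value chosen for illustration, and g is taken to be the square function.

```r
# Illustrative sketch only; gamma_trial is a made-up value, not an estimate.
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
X <- model.matrix(mtcars_lm)     # n x p design matrix
n <- nrow(X)
e <- resid(mtcars_lm)            # OLS residuals e_i

# Annihilator matrix M = I - X (X'X)^{-1} X'
M <- diag(n) - X %*% solve(crossprod(X)) %*% t(X)
Msq <- M^2                       # elementwise squares m_ik^2

g <- function(x) x^2             # heteroskedastic function (assumed here)
gamma_trial <- rep(0.1, ncol(X)) # hypothetical parameter vector

# g(X_k' gamma) for each observation k
omega_trial <- g(drop(X %*% gamma_trial))

# Right-hand side of the model equation without u_i: sum_k g(.) m_ik^2
fitted_e2 <- drop(Msq %*% omega_trial)

# Implied error terms u_i at this trial value of gamma
u_trial <- e^2 - fitted_e2
```

Because M annihilates the column space of X, multiplying M by the response vector reproduces the OLS residuals, which is a quick check that M was built correctly.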
This method has been developed as part of the author's doctoral research
project.
The parameter vector \gamma is estimated using the maximum
quasi-likelihood method as described in section 2.3 of
Seber and Wild (2003). The optimisation problem is
solved numerically using a Gauss-Newton algorithm.
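To give a feel for the Gauss-Newton iteration, here is a simplified sketch in base R. It is not the package's quasi-likelihood routine: it minimises an unweighted sum of squares as a stand-in objective, adds step halving for stability, and uses a single hypothetical starting value. The helper names S, gprime, and gam are assumptions made for illustration.

```r
# Simplified stand-in: unweighted nonlinear least squares, NOT the
# quasi-likelihood objective used by anlvm.fit.
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
X <- model.matrix(mtcars_lm)
n <- nrow(X)
M <- diag(n) - X %*% solve(crossprod(X)) %*% t(X)
Msq <- M^2
e2 <- resid(mtcars_lm)^2

g <- function(x) x^2           # heteroskedastic function (assumed)
gprime <- function(x) 2 * x    # its derivative

# Objective: sum of squared differences between e_i^2 and the model mean
S <- function(gam) sum((e2 - drop(Msq %*% g(drop(X %*% gam))))^2)

gam <- rep(0.1, ncol(X))       # hypothetical starting value
S_start <- S(gam)

for (iter in seq_len(100L)) {
  eta <- drop(X %*% gam)
  r <- e2 - drop(Msq %*% g(eta))   # current residuals of the variance model
  J <- Msq %*% (gprime(eta) * X)   # Jacobian of the model mean w.r.t. gamma
  step <- lm.fit(J, r)$coefficients  # Gauss-Newton step via least squares
  step[is.na(step)] <- 0           # guard against rank deficiency

  # Step halving: shrink the step until the objective decreases
  lambda <- 1
  while (lambda > 1e-8 && !isTRUE(S(gam + lambda * step) < S(gam))) {
    lambda <- lambda / 2
  }
  if (lambda <= 1e-8) break        # no improving step found
  gam_new <- gam + lambda * step
  small_change <- max(abs(gam_new - gam)) < 1e-8
  gam <- gam_new
  if (small_change) break
}
```

Step halving guarantees that the objective never increases from one accepted iterate to the next, but it does not guarantee convergence to a global minimum, which is presumably why the function tries several starting values (see maxgridrows and nconvstop).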
For further discussion of feature selection and the methods for choosing the
number of clusters to use with the clustering version of the model, see
alvm.fit.
An object of class "anlvm.fit", containing the following:
coef.est, a vector of parameter estimates, \hat{\gamma}
var.est, a vector of estimates \hat{\omega} of the error
variances for all observations
method, either "cluster" or "functionalform",
depending on whether cluster was set to TRUE
ols, the lm object corresponding to the original linear
regression model
fitinfo, a list containing four named objects, g (the
heteroskedastic function), Msq (the elementwise-square of the
annihilator matrix M), Z (the design matrix used in the
ANLVM, after feature selection if applicable), and clustering
(a list object with results of the clustering procedure, if applicable).
selectinfo, a list containing two named objects,
varselect (the value of the eponymous argument), and
selectedcols (a numeric vector with column indices of X
that were selected, with 1 denoting the intercept column)
qlinfo, a list containing nine named objects: converged
(a logical, indicating whether the Gauss-Newton algorithm converged
for at least one initial value of the parameter vector),
iterations (the number of Gauss-Newton iterations used to
obtain the parameter estimates returned), Smin (the minimum
achieved value of the objective function used in the Gauss-Newton
routine), and six arguments passed to the function (nested,
param.init, maxgridrows, nconvstop,
maxitql, and tolql)
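The components listed above can be inspected with ordinary list indexing. The sketch below assumes the skedastic package is installed and skips the fit otherwise; the seed value is arbitrary and is set only because the default param.init draws random starting values.

```r
# Component names are taken from the list above.
expected <- c("coef.est", "var.est", "method", "ols",
              "fitinfo", "selectinfo", "qlinfo")

if (requireNamespace("skedastic", quietly = TRUE)) {
  set.seed(123)  # arbitrary seed; starting values are random by default
  mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
  fit <- skedastic::anlvm.fit(mtcars_lm, g = function(x) x^2,
                              varselect = "qgcv.linear")
  print(expected %in% names(fit))       # which documented components exist
  print(fit$coef.est)                   # parameter estimates
  print(head(fit$var.est))              # first few error variance estimates
  print(fit$qlinfo$converged)           # did Gauss-Newton converge?
  print(fit$selectinfo$selectedcols)    # columns kept by feature selection
}
```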
alvm.fit, avm.ci
# Fit the main linear model, then the ANLVM with a quadratic variance function
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
myanlvm <- anlvm.fit(mtcars_lm, g = function(x) x ^ 2,
  varselect = "qgcv.linear")