alvm.fit (R Documentation)
Fits an Auxiliary Linear Variance Model (ALVM) to estimate the error variances of a heteroskedastic linear regression model.
Usage

alvm.fit(
mainlm,
M = NULL,
model = c("cluster", "spline", "linear", "polynomial", "basic", "homoskedastic"),
varselect = c("none", "hettest", "cv.linear", "cv.cluster", "qgcv.linear",
"qgcv.cluster"),
lambda = c("foldcv", "qgcv"),
nclust = c("elbow.swd", "elbow.mwd", "elbow.both", "foldcv"),
clustering = NULL,
polypen = c("L2", "L1"),
d = 2L,
solver = c("auto", "quadprog", "quadprogXT", "roi", "osqp"),
tsk = NULL,
tsm = NULL,
constol = 1e-10,
cvoption = c("testsetols", "partitionres"),
nfolds = 5L,
reduce2homosked = TRUE,
...
)
Arguments

mainlm: Either an object of class "lm" (such as that produced by lm), or a list containing the response vector and design matrix of the linear regression model.

M: An annihilator matrix for the main linear model; if NULL (the default), it is computed from mainlm.

model: A character corresponding to the type of ALVM to be fitted: "cluster" (clustering ALVM), "spline" (thin-plate spline ALVM), "linear" (linear ALVM), "polynomial" (penalised polynomial ALVM), "basic" (basic ALVM), or "homoskedastic" (homoskedastic error variance estimator).

varselect: Either a character indicating how variable selection should be conducted, or an integer vector giving indices of columns of the predictor matrix to select. "none" (the default) performs no selection; "hettest" selects features using a heteroskedasticity test with each feature in turn serving as the deflator; "cv.linear" and "cv.cluster" perform best subset selection scored by K-fold cross-validation under the linear and clustering ALVMs, respectively; "qgcv.linear" and "qgcv.cluster" do likewise using quasi-generalised cross-validation.

lambda: Either a double of length 1 giving the value of the penalty hyperparameter \lambda, or a character specifying how it should be tuned: "foldcv" (K-fold cross-validation, the default) or "qgcv" (quasi-generalised cross-validation). Used only with the penalised polynomial and thin-plate spline ALVMs.

nclust: Either an integer of length 1 giving the number of clusters n_c, or a character specifying how it should be tuned: "elbow.swd" (elbow method with a sum of within-cluster distances criterion, the default), "elbow.mwd" (maximum within-cluster distance criterion), "elbow.both" (a combination of the two), or "foldcv" (K-fold cross-validation). Used only with the clustering ALVM.

clustering: A list object containing the results of a previously performed clustering of the observations; if NULL (the default), the clustering is computed within the function.

polypen: A character, either "L2" (the default) or "L1", indicating the norm to use in the penalty term of the penalised polynomial ALVM. The "L1" penalty yields a LASSO-type model that performs feature selection automatically.

d: An integer specifying the degree of polynomial to use in the penalised polynomial ALVM; defaults to 2L.

solver: A character indicating which Quadratic Programming solver function to use to estimate \gamma: "auto" (the default, in which case a solver is chosen automatically), "quadprog", "quadprogXT", "roi", or "osqp".

tsk: An integer corresponding to the basis dimension k of the thin-plate spline ALVM; if NULL (the default), a value is chosen automatically.

tsm: An integer corresponding to the order m of the penalty in the thin-plate spline ALVM; if NULL (the default), a value is chosen automatically.

constol: A double corresponding to the boundary value for the constraint on error variances. Of course, the error variances must be non-negative, but setting the constraint boundary to 0 can result in zero estimates that then result in infinite weights for Feasible Weighted Least Squares. The boundary value should thus be positive, but small enough not to bias estimation of very small variances. Defaults to 1e-10.

cvoption: A character, either "testsetols" (the default) or "partitionres", specifying how the K-fold cross-validation used for selection and tuning is implemented.

nfolds: An integer specifying the number of folds K to use in cross-validation; defaults to 5L.

reduce2homosked: A logical indicating whether the homoskedastic error variance estimator should be used when the variable selection procedure finds no evidence of heteroskedasticity; defaults to TRUE.

...: Other arguments that can be passed to (non-exported) helper functions.
Details

The ALVM model equation is

e \circ e = (M \circ M) L \gamma + u,

where e is the Ordinary Least Squares residual vector, M is the annihilator matrix M = I - X(X'X)^{-1}X', L is a linear predictor matrix, u is a random error vector, \gamma is a p-vector of unknown parameters, and \circ denotes the Hadamard (elementwise) product. The construction of L depends on the method used to model or estimate the assumed heteroskedastic function g(\cdot), a continuous, differentiable function that is linear in \gamma and by which the error variances \omega_i of the main linear model are related to the predictors X_{i\cdot}. This method has been developed as part of the author's doctoral research project.
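To make the notation concrete, the quantities in the model equation can be computed directly in base R. The following sketch (not part of the package) uses the mtcars model from the Examples section.

```r
# Quantities in the ALVM model equation, computed from first principles.
X <- model.matrix(mpg ~ wt + qsec + am, data = mtcars)
y <- mtcars$mpg
M <- diag(nrow(X)) - X %*% solve(crossprod(X)) %*% t(X)  # annihilator matrix
e <- drop(M %*% y)   # OLS residuals: e = My
Msq <- M * M         # Hadamard square, M o M
esq <- e * e         # response vector e o e of the auxiliary model
# e matches the residuals of the corresponding OLS fit:
all.equal(e, unname(residuals(lm(mpg ~ wt + qsec + am, data = mtcars))))
```
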
Depending on the model used, the estimation method is either Inequality-Constrained Least Squares or Inequality-Constrained Ridge Regression. Both are special cases of Quadratic Programming, so all of the models are fitted using Quadratic Programming.
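As an illustration of the Quadratic Programming formulation, the following hedged sketch fits an ALVM-style auxiliary model by Inequality-Constrained Least Squares using the quadprog package (one of the solvers named in the solver argument). The choice of L here, an intercept plus a single predictor, is an assumption for illustration only; alvm.fit constructs L according to the chosen model.

```r
# Hedged sketch: ICLS as a Quadratic Programming problem via quadprog.
# Minimise ||esq - (M o M) L gamma||^2 subject to L gamma >= constol,
# i.e. the fitted error variances are kept strictly positive.
library(quadprog)
X <- model.matrix(mpg ~ wt + qsec + am, data = mtcars)
y <- mtcars$mpg
M <- diag(nrow(X)) - X %*% solve(crossprod(X)) %*% t(X)
esq <- drop(M %*% y)^2                 # squared OLS residuals
L <- cbind(1, mtcars$wt)               # assumed linear predictor matrix
A <- (M * M) %*% L                     # design matrix of the auxiliary model
constol <- 1e-10
qp <- solve.QP(Dmat = crossprod(A),            # (1/2) g'Dg - d'g ...
               dvec = drop(crossprod(A, esq)), # ... equals least squares
               Amat = t(L),                    # constraints: L gamma >= constol
               bvec = rep(constol, nrow(L)))
gamma.hat <- qp$solution
omega.hat <- drop(L %*% gamma.hat)     # estimated error variances
```
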
Several techniques are available for feature selection within the model. The LASSO-type model handles feature selection via a shrinkage penalty; for this reason, if the user calls the polynomial model with an L_1-norm penalty, it is not necessary to specify a variable selection method, since selection is handled automatically. Another feature selection technique is to use a heteroskedasticity test that tests for heteroskedasticity linked to a particular predictor variable (the 'deflator'). This test can be conducted with each feature in turn serving as the deflator, and those features for which the null hypothesis of homoskedasticity is rejected at a specified significance level alpha are selected. A third technique is best subset selection, in which the model is fitted with all possible subsets of features; the models are scored in terms of some metric, and the best-performing subset is selected. The metric could be squared-error loss computed under K-fold cross-validation or under quasi-generalised cross-validation. (The quasi- prefix refers to the fact that generalised cross-validation is, properly speaking, only applicable to a linear fitting method, as defined by Hastie et al. (2009); ALVMs are not linear fitting methods due to the inequality constraint.) Since best subset selection requires fitting 2^{p-1} models (where p-1 is the number of candidate features), it is infeasible for large p. A greedy search can therefore be used as an alternative: one begins with a null model and adds the feature that leads to the best improvement in the metric, stopping when no new feature leads to an improvement.
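The enumeration step of best subset selection can be sketched in base R. The scoring function below is a placeholder (in-sample sum of squared errors from a linear fit to the squared residuals), not the cross-validation metric used by alvm.fit.

```r
# Hedged sketch of best subset selection over candidate features.
X <- model.matrix(mpg ~ wt + qsec + am, data = mtcars)
e <- residuals(lm(mpg ~ wt + qsec + am, data = mtcars))
candidates <- colnames(X)[-1]          # the p - 1 candidate features
subsets <- unlist(lapply(seq_along(candidates), function(k)
  combn(candidates, k, simplify = FALSE)), recursive = FALSE)
length(subsets)                        # 2^(p-1) - 1 = 7 non-empty subsets
score <- function(vars) {              # placeholder metric: in-sample SSE
  sum(residuals(lm(e^2 ~ X[, vars, drop = FALSE]))^2)
}
best <- subsets[[which.min(vapply(subsets, score, numeric(1)))]]
```
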
The polynomial and thin-plate spline ALVMs have a penalty hyperparameter \lambda that must either be specified or tuned; K-fold cross-validation or quasi-generalised cross-validation can be used for tuning. The clustering ALVM has a hyperparameter n_c, the number of clusters into which to group the observations (where error variances are assumed to be equal within each cluster). n_c can likewise be specified or tuned; the available tuning methods are an elbow method (using a sum of within-cluster distances criterion, a maximum within-cluster distance criterion, or a combination of the two) and K-fold cross-validation.
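The elbow idea for tuning n_c can be sketched with kmeans on the design matrix; the clustering procedure and distance criteria used inside the package may differ, so this is an illustration only.

```r
# Hedged sketch of an elbow criterion for the number of clusters n_c.
X <- model.matrix(mpg ~ wt + qsec + am, data = mtcars)
set.seed(1)
swd <- vapply(1:6, function(k)
  kmeans(scale(X[, -1]), centers = k, nstart = 10)$tot.withinss,
  numeric(1))
# The sum of within-cluster distances shrinks as n_c grows; choose n_c at
# the "elbow", where additional clusters yield little further reduction.
plot(1:6, swd, type = "b", xlab = "n_c", ylab = "SWD criterion")
```
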
Value

An object of class "alvm.fit", containing the following:

- coef.est, a vector of parameter estimates, \hat{\gamma}
- var.est, a vector of estimates \hat{\omega} of the error variances for all observations
- method, a character corresponding to the model argument
- ols, the lm object corresponding to the original linear regression model
- fitinfo, a list containing four named objects: Msq (the elementwise-square of the annihilator matrix M), L (the linear predictor matrix L), clustering (a list object with results of the clustering procedure), and gam.object, an object of class "gam" (see gamObject). The last two are set to NA unless the clustering ALVM or thin-plate spline ALVM is used, respectively
- hyperpar, a named list of hyperparameter values, lambda, nclust, tsk, and d, and tuning methods, lambdamethod and nclustmethod. Values corresponding to unused hyperparameters are set to NA
- selectinfo, a list containing two named objects, varselect (the value of the eponymous argument) and selectedcols (a numeric vector with column indices of X that were selected, with 1 denoting the intercept column)
- pentype, a character corresponding to the polypen argument
- solver, a character corresponding to the solver argument (or specifying the QP solver actually used, if solver was set to "auto")
- constol, a double corresponding to the constol argument
See Also

avm.ci
Examples

mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
myalvm <- alvm.fit(mtcars_lm, model = "cluster")
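A natural follow-on, sketched below, is to use the estimated error variances (the var.est component documented under Value) as weights for Feasible Weighted Least Squares; the package name skedastic is assumed here.

```r
# Continuing the example: FWLS using the estimated error variances.
library(skedastic)
mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
myalvm <- alvm.fit(mtcars_lm, model = "cluster")
fwls_lm <- lm(mpg ~ wt + qsec + am, data = mtcars,
              weights = 1 / myalvm$var.est)
summary(fwls_lm)
```
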