anlvm.fit: Auxiliary Nonlinear Variance Model

View source: R/anlvm.fit.R

anlvm.fitR Documentation

Auxiliary Nonlinear Variance Model

Description

Fits an Auxiliary Nonlinear Variance Model (ANLVM) to estimate the error variances of a heteroskedastic linear regression model.

Usage

anlvm.fit(
  mainlm,
  g,
  M = NULL,
  cluster = FALSE,
  varselect = c("none", "hettest", "cv.linear", "cv.cluster", "qgcv.linear",
    "qgcv.cluster"),
  nclust = c("elbow.swd", "elbow.mwd", "elbow.both"),
  clustering = NULL,
  param.init = function(q) stats::runif(n = q, min = -5, max = 5),
  maxgridrows = 20L,
  nconvstop = 3L,
  zerosallowed = FALSE,
  maxitql = 100L,
  tolql = 1e-08,
  nestedql = FALSE,
  reduce2homosked = TRUE,
  cvoption = c("testsetols", "partitionres"),
  nfolds = 5L,
  ...
)

Arguments

mainlm

Either an object of class "lm" (e.g., generated by lm), or a list of two objects: a response vector and a design matrix. The objects are assumed to be in that order, unless they are given the names "X" and "y" to distinguish them. The design matrix passed in a list must begin with a column of ones if an intercept is to be included in the linear model. The design matrix passed in a list should not contain factors, as all columns are treated 'as is'. For tests that use ordinary least squares residuals, one can also pass a vector of residuals in the list, which should either be the third object or be named "e".

g

A numeric-valued function of one variable, or a character denoting the name of such a function. "sq" is allowed as a way of denoting function(x) x ^ 2.

M

An n\times n annihilator matrix. If NULL (the default), this will be calculated from the mainlm object

cluster

A logical; should the design matrix X be replaced with an n\times n_c matrix of ones and zeroes, with a single one in each row, indicating assignments of the n observations to n_c clusters using an agglomerative hierarchical clustering algorithm. In this case, the dimensionality of \gamma is n_c and not p. Defaults to FALSE

varselect

Either a character indicating how variable selection should be conducted, or an integer vector giving indices of columns of the predictor matrix (model.matrix of mainlm) to select. The vector must include 1L for the intercept to be selected. If a character, it must be one of the following:

  • "none": No variable selection is conducted;

  • "hettest": Variable selection is conducted by applying a heteroskedasticity test with each feature in turn serving as the ‘deflator’ variable

  • "cv.linear": Variable selection is conducted by best subset selection on the auxiliary linear variance model (linear specification), using squared-error loss computed under K-fold cross-validation

  • "cv.cluster": Variable selection is conducted by best subset selection on the auxiliary linear variance model (clustering specification), using squared-error loss computed under K-fold cross-validation

  • "qgcv.linear": Variable selection is conducted by best subset selection on the auxiliary linear variance model (linear specification), using squared-error loss computed under quasi-generalised cross-validation

  • "qgcv.cluster": Variable selection is conducted by best subset selection on the auxiliary linear variance model (clustering specification), using squared-error loss computed under quasi-generalised cross-validation

nclust

A character indicating which elbow method to use to select the number of clusters (ignored if cluster is FALSE). Alternatively, an integer specifying the number of clusters

clustering

A list object of class "doclust". If set to NULL (the default), such an object is generated (ignored if cluster is FALSE)

param.init

Specifies the initial values of the parameter vector to use in the Gauss-Newton fitting algorithm. This can either be a function for generating the initial values from a probability distribution, a list containing named objects corresponding to the arguments of seq (specifying a sequence of scalar values that will be passed to expand.grid), or a numeric vector specifying a single initial parameter vector

maxgridrows

An integer indicating the maximum number of initial values of the parameter vector to try, in case of param.init being a function or a list used to generate a grid. Defaults to 20L.

nconvstop

An integer indicating how many times the quasi-likelihood estimation algorithm should converge before the grid search across different initial parameter values is truncated. Defaults to 3L. If nconvstop >= maxgridrows, no early stopping rule will be used.

zerosallowed

A logical indicating whether 0 values are acceptable in the initial values of the parameter vector. Defaults to FALSE.

maxitql

An integer specifying the maximum number of iterations to run in the Gauss-Newton algorithm for quasi-likelihood estimation. Defaults to 100L.

tolql

A double specifying the convergence criterion for the Gauss-Newton algorithm; defaults to 1e-8. The criterion is applied to the L_2 norm of the difference between parameter vectors in successive iterations.

nestedql

A logical indicating whether to use the nested updating step suggested in \insertCiteSeber03;textualskedastic. Defaults to FALSE due to the large computation time required.

reduce2homosked

A logical indicating whether the homoskedastic error variance estimator e'e/(n-p) should be used if the variable selection procedure does not select any variables. Defaults to TRUE.

cvoption

A character, either "testsetols" or "partitionres", indicating how to obtain the observed response values for each test fold when performing K-fold cross-validation on an ALVM. The default technique, "testsetols", entails fitting a linear regression model to the test fold of observations from the original response vector y and predictor matrix X. The squared residuals from this regression are the observed responses that are predicted from the trained model to compute the cross-validated squared error loss function. Under the other technique, "partitionres", the squared residuals from the full linear regression model are partitioned into training and test folds and the squared residuals in the test fold are the observed responses that are predicted for computation of the cross-validated loss.

nfolds

An integer specifying the number of folds K to use for cross-validation, if the \lambda and/or n_c hyperparameters are to be tuned using cross-validation. Defaults to 5L. One must ensure that each test fold contains at least p+1 observations if the "testsetols" technique is used with cross-validation, so that there are enough degrees of freedom to fit a linear model to the test fold.

...

Other arguments that can be passed to (non-exported) helper functions, namely:

  • greedy, a logical passed to the functions implementing best subset selection, indicating whether or not to use a greedy search rather than exhaustive search for the best subset. Defaults to FALSE, but coerced to TRUE unconditionally if p>9.

  • distmetric, a character specifying the distance metric to use in computing distance for the clustering algorithm. Corresponds to the method argument of dist and defaults to "euclidean"

  • linkage, a character specifying the linkage rule to use in agglomerative hierarchical clustering. Corresponds to the method argument of hclust and defaults to "complete"

  • alpha, a double specifying the significance level threshold to use when applying heteroskedasticity test for the purpose of feature selection in an ALVM; defaults to 0.1

  • testname, a character corresponding to the name of a function that performs a heteroskedasticity test. The function must either be one that takes a deflator argument or breusch_pagan. Defaults to evans_king

Details

The ANLVM model equation is

e_i^2=\displaystyle\sum_{k=1}^{n} g(X_{k\cdot}'\gamma) m_{ik}^2+u_i

, where e_i is the ith Ordinary Least Squares residual, X_{k\cdot} is a vector corresponding to the kth row of the n\times p design matrix X, m_{ik}^2 is the (i,k)th element of the annihilator matrix M=I-X(X'X)^{-1}X', u_i is a random error term, \gamma is a p-vector of unknown parameters, and g(\cdot) is a continuous, differentiable function that need not be linear in \gamma, but must be expressible as a function of the linear predictor X_{k\cdot}'\gamma. This method has been developed as part of the author's doctoral research project.

The parameter vector \gamma is estimated using the maximum quasi-likelihood method as described in section 2.3 of \insertCiteSeber03;textualskedastic. The optimisation problem is solved numerically using a Gauss-Newton algorithm.

For further discussion of feature selection and the methods for choosing the number of clusters to use with the clustering version of the model, see alvm.fit.

Value

An object of class "anlvm.fit", containing the following:

  • coef.est, a vector of parameter estimates, \hat{\gamma}

  • var.est, a vector of estimates \hat{\omega} of the error variances for all observations

  • method, either "cluster" or "functionalform", depending on whether cluster was set to TRUE

  • ols, the lm object corresponding to the original linear regression model

  • fitinfo, a list containing three named objects, g (the heteroskedastic function), Msq (the elementwise-square of the annihilator matrix M), Z (the design matrix used in the ANLVM, after feature selection if applicable), and clustering (a list object with results of the clustering procedure, if applicable).

  • selectinfo, a list containing two named objects, varselect (the value of the eponymous argument), and selectedcols (a numeric vector with column indices of X that were selected, with 1 denoting the intercept column)

  • qlinfo, a list containing nine named objects: converged (a logical, indicating whether the Gauss-Newton algorithm converged for at least one initial value of the parameter vector), iterations (the number of Gauss-Newton iterations used to obtain the parameter estimates returned), Smin (the minimum achieved value of the objective function used in the Gauss-Newton routine), and six arguments passed to the function (nested, param.init, maxgridrows, nconvstop, maxitql, and tolql)

References

\insertAllCited

See Also

alvm.fit, avm.ci

Examples

mtcars_lm <- lm(mpg ~ wt + qsec + am, data = mtcars)
myanlvm <- anlvm.fit(mtcars_lm, g = function(x) x ^ 2,
 varselect = "qgcv.linear")


skedastic documentation built on May 29, 2024, 12:20 p.m.