GCElm: Generalized Cross Entropy Linear Regression Models

View source: R/GCElm.r

GCElm    R Documentation

Generalized Cross Entropy Linear Regression Models

Description

Fitting generalized cross entropy (GCE) linear models

Usage

GCElm(formula, data, Z, v, nu, p0, w0, k.sigma = 3, weights, subset,
  na.action, control = list(), model = TRUE, method = "GCElm.fit",
  x = FALSE, y = TRUE, offset, contrasts = NULL, ...)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class); a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which GCElm is called.

Z

numeric, a (K x M) matrix of support spaces for the regression coefficients (including the intercept), where K is the number of coefficients and M is the number of points in each support space.

v

numeric, an optional argument representing a support space for error terms:

(a)

if missing, v is a (5 x 1) vector of equally spaced points in the interval [a, b];

(b)

if a scalar (e.g. H), v is an (H x 1) vector of equally spaced points in the interval [a, b];

(c)

can be a user-supplied vector;

(d)

can be a user-supplied matrix.

Note that in cases (a) and (b) the interval [a, b] is centered at zero, and a and b are calculated using the empirical three-sigma rule (Pukelsheim, 1994); the multiplier is set by k.sigma.

nu

numeric, an optional weight parameter representing the trade-off between prediction and precision.

p0

numeric, optional prior probabilities associated with the regression coefficients.

w0

numeric, optional prior probabilities associated with the error terms.

k.sigma

numeric, coefficient k in the k-sigma rule (default k=3).

weights

an optional vector of ‘prior weights’ to be used in the fitting process; should be NULL or a numeric vector.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs; the default is set by the na.action setting of options, and is na.fail if that is unset; the ‘factory-fresh’ default is na.omit; another possible value is NULL, no action; value na.exclude can be useful.

control

list, a list of parameters for controlling the fitting process; for GCElm.fit this is passed to GCElm.control.

model

a logical value indicating whether the model frame should be included as a component of the returned value.

method

the method to be used in fitting the model; the default method "GCElm.fit" uses limited-memory BFGS (L-BFGS); the alternative "model.frame" returns the model frame and does no fitting.

x

a logical value indicating whether the model matrix used in the fitting process should be returned as a component of the returned value.

y

a logical value indicating whether the response vector used in the fitting process should be returned as a component of the returned value.

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting; this should be NULL or a numeric vector of length equal to the number of cases; one or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used; see model.offset.

contrasts

an optional list; see the contrasts.arg of model.matrix.default.

...

for GCElm: arguments to be used to form the default control argument if it is not supplied directly; for weights: further arguments passed to or from other methods.
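The support-space arguments Z and v can be prepared with base R alone. The following is a minimal sketch, not the package's internal code: it assumes that the k-sigma bounds are taken from the sample standard deviation of the response, and the helper names make_Z and make_v are hypothetical.

```r
# Hypothetical helper: one symmetric support of M points per coefficient,
# stacked into a (K x M) matrix (K counts the intercept as well).
make_Z <- function(K, bound = 10, M = 5) {
  matrix(rep(seq(-bound, bound, length.out = M), K), nrow = K, byrow = TRUE)
}

# Hypothetical helper: H equally spaced points in [-k * sd(y), k * sd(y)].
# Assumption: the k-sigma rule is applied to the sample sd of the response.
make_v <- function(y, H = 5, k.sigma = 3) {
  b <- k.sigma * sd(y)
  seq(-b, b, length.out = H)
}

set.seed(1)
y <- rnorm(100)
Z <- make_Z(K = 3)
v <- make_v(y)

# With uniform probabilities on each support, the implied coefficients
# beta_k = sum_m Z[k, m] * p[k, m] sit at the centre of the support (zero here).
p <- matrix(1 / ncol(Z), nrow = nrow(Z), ncol = ncol(Z))
beta_uniform <- rowSums(Z * p)
```

Wider supports in Z shrink the estimates less strongly toward the support centre, so the bound should reflect plausible coefficient magnitudes for the data at hand.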

Details

Any details go here.

Value

A list with the following elements:

* lambda, estimated Lagrange multipliers

* beta, regression coefficients

* var_beta, variance-covariance matrix of the regression coefficients

* p, estimated probabilities associated with the regression coefficients

* w, estimated probabilities associated with the error terms

* e, estimated residuals

* Sp, the (signal) information of the whole system

* Sp_k, the (signal) information associated with the k-th regression coefficient

* H_p_w, value of the joint entropies of p and w at the final iteration

* dH, delta-H from the Entropy Concentration Theorem

* ER, entropy-ratio statistic

* Pseudo-R2, pseudo R-squared

* converged, convergence status (same as in the lbfgs function)
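In the maximum-entropy literature, information measures such as Sp and Sp_k are usually defined as normalized entropies of the estimated probabilities. The sketch below assumes that definition with uniform priors, S(p) = -sum(p log p) / (K log M); the package may compute these quantities differently, for example relative to p0. The helper norm_entropy is hypothetical.

```r
# Hypothetical helper: normalized entropy of a (K x M) probability matrix,
# overall (Sp) and per row (Sp_k). Values lie in [0, 1]; 1 means the
# probabilities are uniform (no information beyond the support space).
norm_entropy <- function(p) {
  p <- as.matrix(p)
  M <- ncol(p)
  # Treat 0 * log(0) as 0, the usual convention for entropy
  plogp <- ifelse(p > 0, p * log(p), 0)
  list(
    Sp   = -sum(plogp) / (nrow(p) * log(M)),
    Sp_k = -rowSums(plogp) / log(M)
  )
}

# One degenerate row (entropy 0) and one uniform row (entropy 1)
p_example <- rbind(c(1, 0, 0, 0, 0), rep(0.2, 5))
S <- norm_entropy(p_example)
```

Under this definition, values near 1 indicate that the data added little information for that coefficient, while values near 0 indicate that the estimate is concentrated on few support points.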

Author(s)

Marco Sandri, Enrico Ciavolino, Maurizio Carpita (gcemodels@gmail.com)

References

Golan (1996)

Examples

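library(GCEmodels)
set.seed(1234)
N <- 5000  # number of observations
K <- 10    # number of regressors (excluding the intercept)
X <- matrix(runif(N * K), nrow = N, ncol = K)
data <- data.frame(y = runif(N), X)
# Support space for each of the K + 1 coefficients (intercept included),
# symmetric around zero
Z <- matrix(rep(c(-10, -5, 0, 5, 10), K + 1), nrow = K + 1, byrow = TRUE)
GCEfit <- GCElm(y ~ ., data = data, Z = Z)
(beta <- GCEfit$beta)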

gcemodels/GCEmodels documentation built on Aug. 10, 2024, 1:58 a.m.