grcv: General Refitted Cross-Validation Estimator

grcv R Documentation

Description

grcv computes the estimate of the dispersion parameter using the general refitted cross-validation method.

Usage

grcv(object, type = c("BIC", "AIC"), nit = 10L, trace = FALSE,
     control = list(), ...)

Arguments

object

fitted dglars object.

type

the measure of goodness-of-fit used in Step 2 to select the two sets of variables (see section Details for more details). Default is type = "BIC".

control

a list of control parameters passed to the function dglars.

nit

integer specifying the number of times that the general refitted cross-validation method is repeated (see section Details for more details). Default is nit = 10L.

trace

flag used to print out information about the algorithm. Default is trace = FALSE.

...

further arguments passed to the functions AIC.dglars or BIC.dglars.

Details

The general refitted cross-validation (grcv) estimator (Pazira et al., 2018) is an estimator of the dispersion parameter of the exponential family based on the following four-step procedure:

Step  Description
1.    randomly split the data set D = (y, X) into two even datasets, denoted by D_1 and D_2.
2.    fit a dglars model to the dataset D_1 to select a set of variables A_1;
      fit a dglars model to the dataset D_2 to select a set of variables A_2.
3.    fit the glm model to the dataset D_1 using the variables that are in A_2, then estimate the
      dispersion parameter using the Pearson method; denote by \hat{\phi}_1(A_2) the resulting estimate;
      fit the glm model to the dataset D_2 using the variables that are in A_1, then estimate the
      dispersion parameter using the Pearson method; denote by \hat{\phi}_2(A_1) the resulting estimate.
4.    estimate \phi using the following estimator: \hat{\phi}_{grcv} = (\hat{\phi}_1(A_2) + \hat{\phi}_2(A_1)) / 2.
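
The four steps can be illustrated with a minimal base-R sketch of a single split. This is not the code used by grcv: the helper names screen_vars, pearson_phi and one_split are hypothetical, and the dglars-based selection of Step 2 is replaced here by a simple correlation screening, purely as an assumption to keep the example free of package dependencies.

```r
# Sketch of one refitted cross-validation split (Steps 1-4), base R only.
# ASSUMPTION: screen_vars() is a hypothetical stand-in for the dglars-based
# selection of Step 2; it keeps the k predictors most correlated with y.
screen_vars <- function(y, X, k = 3) {
  score <- abs(drop(cor(X, y)))
  sort(order(score, decreasing = TRUE)[seq_len(k)])
}

# Pearson estimate of the dispersion from a fitted glm: sum of squared
# Pearson residuals divided by the residual degrees of freedom.
pearson_phi <- function(fit) {
  sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
}

one_split <- function(y, X, family, k = 3) {
  n <- length(y)
  id1 <- sample(n, floor(n / 2))                        # Step 1: random even split
  id2 <- setdiff(seq_len(n), id1)
  A1 <- screen_vars(y[id1], X[id1, , drop = FALSE], k)  # Step 2 on D_1
  A2 <- screen_vars(y[id2], X[id2, , drop = FALSE], k)  # Step 2 on D_2
  # Step 3: refit on each half using the variables selected on the other half
  fit1 <- glm(y[id1] ~ X[id1, A2, drop = FALSE], family = family)
  fit2 <- glm(y[id2] ~ X[id2, A1, drop = FALSE], family = family)
  # Step 4: average the two Pearson estimates
  (pearson_phi(fit1) + pearson_phi(fit2)) / 2
}

set.seed(1)
n <- 200; p <- 20
X <- matrix(abs(rnorm(n * p)), n, p)
mu <- exp(1 + 2 * X[, 1])
shape <- 2                                 # true dispersion is 1 / shape = 0.5
y <- rgamma(n, shape = shape, scale = mu / shape)
phi_hat <- one_split(y, X, family = Gamma("log"))
phi_hat
```

Selecting the variables on one half and estimating the dispersion on the other keeps the selection step from reusing the observations that produce the residuals, which is the motivation for crossing A_1 and A_2 in Step 3.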

To reduce the random variability due to the splitting of the dataset (Step 1), the previous procedure is repeated nit times; the median of the resulting estimates is used as the final estimate of the dispersion parameter. In Step 2, the two sets of variables are selected using AIC.dglars or BIC.dglars, according to the argument type; in this step, the Pearson method is used to obtain a first estimate of the dispersion parameter. Furthermore, if the function glm does not converge, the function dglars is used to compute the maximum likelihood estimates.
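
The effect of repeating the procedure and taking the median can be sketched as follows. The functions split_estimate and repeat_median are hypothetical names introduced only for this illustration; split_estimate is a simplified stand-in for Steps 1-4 that skips variable selection and refits on a fixed set of columns, so it is an assumption-laden sketch rather than the grcv implementation.

```r
# Sketch: repeat a split-based estimator nit times and take the median.
# ASSUMPTION: split_estimate() is a hypothetical, simplified stand-in for
# Steps 1-4 (no variable selection; it refits on a fixed set of columns).
split_estimate <- function(y, X, family, vars) {
  n <- length(y)
  id1 <- sample(n, floor(n / 2))
  halves <- list(id1, setdiff(seq_len(n), id1))
  phi <- vapply(halves, function(id) {
    fit <- glm(y[id] ~ X[id, vars, drop = FALSE], family = family)
    sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
  }, numeric(1))
  mean(phi)                                # average of the two half estimates
}

repeat_median <- function(y, X, family, vars, nit = 10L) {
  est <- vapply(seq_len(nit),
                function(i) split_estimate(y, X, family, vars),
                numeric(1))
  median(est)                              # median over the nit random splits
}

set.seed(2)
n <- 200
X <- matrix(abs(rnorm(n * 5)), n, 5)
y <- rgamma(n, shape = 2, scale = exp(1 + 2 * X[, 1]) / 2)
single <- replicate(10, split_estimate(y, X, Gamma("log"), vars = 1))
phi_med <- repeat_median(y, X, Gamma("log"), vars = 1, nit = 10L)
range(single)   # spread of the estimates across individual random splits
phi_med         # median over nit = 10 splits
```

The median is less sensitive than the mean to an occasional unlucky split, for example one in which a half contains a few extreme responses.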

Value

grcv returns the estimate of the dispersion parameter.

Author(s)

Luigi Augugliaro and Hassan Pazira
Maintainer: Luigi Augugliaro <luigi.augugliaro@unipa.it>

References

Pazira H., Augugliaro L. and Wit E.C. (2018) Extended differential-geometric LARS for high-dimensional GLMs with general dispersion parameter, Statistics and Computing, Vol 28(4), 753-774. <doi:10.1007/s11222-017-9761-7>

See Also

phihat, AIC.dglars and BIC.dglars.

Examples

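############################
# y ~ Gamma
library(dglars)

set.seed(321)
n <- 100                              # number of observations
p <- 50                               # number of predictors
X <- matrix(abs(rnorm(n * p)), n, p)
eta <- 1 + 2 * X[, 1]                 # only the first predictor is active
# note: Gamma() applies the inverse link here, whereas the model below is
# fitted with the log link
mu <- drop(Gamma()$linkinv(eta))
shape <- 0.5
phi <- 1 / shape                      # true dispersion parameter
y <- rgamma(n, scale = mu / shape, shape = shape)
fit <- dglars(y ~ X, Gamma("log"))

phi                                   # true value, for comparison
grcv(fit, type = "AIC")
grcv(fit, type = "BIC")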

dglars documentation built on Oct. 10, 2023, 1:08 a.m.