Leave-one-out Cross-validation-derived Shrinkage

Description

Shrink regression coefficients using a shrinkage factor derived using leave-one-out cross-validation.

Usage

loocval(dataset, model, nreps = 1, sdm, int = TRUE, int.adj)

Arguments

dataset

a dataset for regression analysis. Data should be in the form of a matrix, with the outcome variable as the final column. Application of the datashape function beforehand is recommended, especially if categorical predictors are present. For regression with an intercept, a column vector of 1s should be added as the first column of the dataset (see Examples).

model

type of regression model. Either "linear" or "logistic".

nreps

the number of times to replicate the cross-validation process.

sdm

a shrinkage design matrix, used to apply the shrinkage factor(s) to the regression coefficients.

int

logical. If TRUE the model will include a regression intercept.

int.adj

logical. If TRUE the regression intercept will be re-estimated after shrinkage of the regression coefficients.

Details

This function applies leave-one-out cross-validation to a dataset in order to derive a shrinkage factor and apply it to the regression coefficients. One row of the data is used as a validation set, while the remaining data are used as a training set. Regression coefficients are estimated in the training set, and then a shrinkage factor is estimated using the validation set. This process is repeated so that each data row is used as the validation set once. The mean of the shrinkage factors is then applied to the original regression coefficients, and the regression intercept may be re-estimated. This process may be repeated nreps times, but each replicate should yield the same shrunken coefficients.

This process can currently be applied to linear or logistic regression models.
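The leave-one-out procedure described above can be sketched in base R for the linear case. This is a minimal illustration under stated assumptions, not the package's implementation: loo_shrink_sketch is a hypothetical helper, and the shrinkage factor is taken here as the slope of the observed outcome regressed on the pooled leave-one-out predictions, which may differ from how loocval estimates it internally.

```r
# Hypothetical helper (not part of the package): leave-one-out shrinkage
# for a linear model. `dat` uses the layout described under Arguments:
# a leading column of 1s, then the predictors, then the outcome.
loo_shrink_sketch <- function(dat) {
  n <- nrow(dat)
  p <- ncol(dat) - 1                 # number of coefficients, incl. intercept
  X <- dat[, 1:p, drop = FALSE]
  y <- dat[, p + 1]

  # Raw (pre-shrinkage) coefficients from the full data
  b_raw <- lm.fit(X, y)$coefficients

  # Prediction for each row from a model fitted without that row
  lp_loo <- numeric(n)
  for (i in seq_len(n)) {
    b_i <- lm.fit(X[-i, , drop = FALSE], y[-i])$coefficients
    lp_loo[i] <- sum(X[i, ] * b_i)
  }

  # Assumption: shrinkage factor = slope of the outcome regressed on the
  # pooled leave-one-out predictions
  lambda <- unname(coef(lm(y ~ lp_loo))[2])

  # Shrink the slopes only, then re-estimate the intercept so that the
  # mean prediction matches the mean outcome
  b_shrunk <- b_raw
  b_shrunk[-1] <- lambda * b_raw[-1]
  b_shrunk[1] <- mean(y) - sum(colMeans(X[, -1, drop = FALSE]) * b_shrunk[-1])

  list(raw.coeff = b_raw, shrunk.coeff = b_shrunk, lambda = lambda)
}

iris.data <- cbind(1, as.matrix(iris[, 1:4]))
fit <- loo_shrink_sketch(iris.data)
fit$lambda
```

Because leave-one-out splitting involves no random sampling, calling the sketch twice on the same data gives identical results, which is consistent with the remark that each replicate should yield the same shrunken coefficients.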

Value

loocval returns a list containing the following:

raw.coeff

the raw regression model coefficients, pre-shrinkage.

shrunk.coeff

the shrunken regression model coefficients.

lambda

the mean shrinkage factor over nreps cross-validation replicates.

nreps

the number of cross-validation replicates.

sdm

the shrinkage design matrix used to apply the shrinkage factor(s) to the regression coefficients.

Note

Warning: this method is not recommended for use in practice. Due to the high variance and inherent instability of leave-one-out methods, the value of the shrinkage factor may be extreme.

Examples

## Example 1: Linear regression using the iris dataset
## Leave-one-out cross-validation-derived shrinkage
data(iris)
iris.data <- as.matrix(iris[, 1:4])
iris.data <- cbind(1, iris.data)
sdm1 <- matrix(c(0, 1, 1, 1), nrow = 1)
set.seed(123)
loocval(dataset = iris.data, model = "linear", sdm = sdm1,
        int = TRUE, int.adj = TRUE)

## Example 2: logistic regression using a subset of the mtcars data
## Leave-one-out cross-validation-derived shrinkage
data(mtcars)
mtc.data <- cbind(1, datashape(mtcars, y = 8, x = c(1, 6, 9)))
head(mtc.data)
set.seed(123)
loocval(dataset = mtc.data, model = "logistic")