# splitval: Split-sample-derived Shrinkage After Estimation

In apricom: Tools for the a Priori Comparison of Regression Modelling Strategies

## Description

Shrink regression coefficients using a split-sample-derived shrinkage factor.

## Usage

```
splitval(dataset, model, nrounds, fract, sdm, int = TRUE, int.adj)
```

## Arguments

| Argument | Description |
| --- | --- |
| `dataset` | a dataset for regression analysis. Data should be in the form of a matrix, with the outcome variable as the final column. Applying the `datashape` function beforehand is recommended, especially if categorical predictors are present. For regression with an intercept, a column of 1s should be added as the first column of the dataset (see examples). |
| `model` | type of regression model. Either "linear" or "logistic". |
| `nrounds` | the number of times to replicate the sample-splitting process. |
| `fract` | the fraction of observations assigned to the training set. |
| `sdm` | a shrinkage design matrix. For examples, see `ols.shrink`. |
| `int` | logical. If TRUE, the model will include a regression intercept. |
| `int.adj` | logical. If TRUE, the regression intercept will be re-estimated after shrinkage of the regression coefficients. |

## Details

This function applies sample-splitting to a dataset in order to derive a shrinkage factor and apply it to the regression coefficients. In each round, the data are randomly split into two sets, a training set and a test set. Regression coefficients are estimated using the training set, and a shrinkage factor is then estimated using the test set. The mean of the `nrounds` shrinkage factors is then applied to the original regression coefficients, and the regression intercept may be re-estimated.
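The procedure described above can be sketched in base R for the linear case. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name `split_shrinkage_sketch` is hypothetical, the shrinkage factor is assumed to be the slope from regressing the test-set outcome on the linear predictor built from the training-set coefficients, the "original" coefficients are assumed to come from a fit to the full dataset, and the intercept is assumed to be re-estimated so that the mean fitted value matches the mean outcome. The element names in the returned list mirror the Value section only for readability.

```r
# Hypothetical helper: split-sample shrinkage for a linear model (base R only).
# x: numeric matrix of predictors (no intercept column); y: numeric outcome.
split_shrinkage_sketch <- function(x, y, nrounds = 100, fract = 0.75) {
  n <- nrow(x)
  n_train <- floor(fract * n)
  lambdas <- numeric(nrounds)
  for (i in seq_len(nrounds)) {
    train <- sample(n, n_train)  # random split into training and test rows
    # Estimate coefficients (intercept first) on the training set
    beta <- lm.fit(cbind(1, x[train, , drop = FALSE]), y[train])$coefficients
    # Linear predictor (slopes only) evaluated on the test set
    lp <- as.vector(x[-train, , drop = FALSE] %*% beta[-1])
    # Assumed shrinkage factor: calibration slope in the test set
    lambdas[i] <- coef(lm(y[-train] ~ lp))[2]
  }
  lambda <- mean(lambdas)
  # Assumed "original" coefficients: fit to the full dataset
  raw <- lm.fit(cbind(1, x), y)$coefficients
  shrunk_slopes <- lambda * raw[-1]
  # Assumed intercept re-estimation: match the mean outcome
  intercept <- mean(y) - sum(colMeans(x) * shrunk_slopes)
  list(raw.coeff = raw,
       shrunk.coeff = c(intercept, shrunk_slopes),
       lambda = lambda)
}

set.seed(321)
x <- as.matrix(iris[, 1:3])
y <- iris[, 4]
res <- split_shrinkage_sketch(x, y, nrounds = 100, fract = 0.75)
res$lambda
res$shrunk.coeff
```

Averaging the shrinkage factor over many random splits reduces the dependence of the result on any single partition of the data, which is why `nrounds` is typically set well above 1.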

This process can currently be applied to linear or logistic regression models.

## Value

`splitval` returns a list containing the following:

| Element | Description |
| --- | --- |
| `raw.coeff` | the raw regression model coefficients, pre-shrinkage. |
| `shrunk.coeff` | the shrunken regression model coefficients. |
| `lambda` | the mean shrinkage factor over Nrounds split-sample replicates. |
| `Nrounds` | the number of rounds of sample splitting. |
| `sdm` | the shrinkage design matrix used to apply the shrinkage factor(s) to the regression coefficients. |
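As a rough illustration of how these elements might relate, the sketch below applies a single shrinkage factor to raw coefficients through a one-row design matrix shaped like the `sdm1` used in the examples. It assumes that a 1 in the matrix marks a coefficient to be multiplied by the shrinkage factor and a 0 leaves it unchanged. That interpretation, the variable names, and all numbers are illustrative assumptions, not output from `splitval`.

```r
# Hypothetical values, for illustration only (not produced by splitval)
raw_coeff <- c(intercept = -0.25, x1 = -0.21, x2 = 0.22, x3 = 0.52)
lambda <- 0.98
sdm <- matrix(c(0, 1, 1, 1), nrow = 1)  # 0: leave unchanged, 1: shrink (assumed)

# Multiplier is lambda where sdm is 1 and 1 where sdm is 0
multiplier <- as.vector(1 + sdm * (lambda - 1))
shrunk_coeff <- raw_coeff * multiplier
shrunk_coeff
```

In this sketch the intercept is simply left unshrunk; when `int.adj = TRUE`, the intercept would instead be re-estimated after shrinkage, as described in the Arguments section.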

## Examples

```
## Example 1: Linear regression using the iris dataset
## Split-sample-derived shrinkage with 100 rounds of sample-splitting
data(iris)
iris.data <- as.matrix(iris[, 1:4])
iris.data <- cbind(1, iris.data)
sdm1 <- matrix(c(0, 1, 1, 1), nrow = 1)
set.seed(321)
splitval(dataset = iris.data, model = "linear", nrounds = 100,
         fract = 0.75, sdm = sdm1, int = TRUE, int.adj = TRUE)

## Example 2: logistic regression using a subset of the mtcars data
## Split-sample-derived shrinkage
data(mtcars)
mtc.data <- cbind(1, datashape(mtcars, y = 8, x = c(1, 6, 9)))
head(mtc.data)
set.seed(123)
splitval(dataset = mtc.data, model = "logistic",
         nrounds = 100, fract = 0.5)
```

apricom documentation built on May 2, 2019, 6:21 a.m.