cv.fwelnet: Cross-validation for fwelnet


View source: R/cv.fwelnet.R

Description

Performs k-fold cross-validation for fwelnet.

Usage

cv.fwelnet(
  x,
  y,
  z,
  family = c("gaussian", "binomial"),
  lambda = NULL,
  type.measure = c("mse", "deviance", "class", "auc", "mae"),
  nfolds = 10,
  foldid = NULL,
  keep = FALSE,
  verbose = FALSE,
  ...
)

Arguments

x

x matrix as in fwelnet.

y

y matrix as in fwelnet.

z

z matrix as in fwelnet.

family

Response type. Either "gaussian" (default) for linear regression or "binomial" for logistic regression.

lambda

A user supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence; supplying a value of lambda overrides this.

type.measure

Loss to use for cross-validation. Five options are currently available, though not every option applies to every model. The default is type.measure="deviance", which uses squared error for gaussian models (equivalent to type.measure="mse" there) and deviance for logistic regression. type.measure="class" applies to binomial logistic regression only, and gives misclassification error. type.measure="auc" is for two-class logistic regression only, and gives area under the ROC curve. type.measure="mse" or type.measure="mae" (mean absolute error) can be used by all models.

nfolds

Number of folds for CV (default is 10). Although nfolds can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is nfolds = 3.

foldid

An optional vector of values between 1 and nfolds identifying what fold each observation is in. If supplied, nfolds can be missing.

keep

If keep = TRUE, a prevalidated array is returned containing fitted values for each observation at each value of lambda. The fitted value for an observation is computed from a model fit with that observation and the rest of its fold omitted. Default is FALSE.

verbose

Print information as model is being fit? Default is FALSE.

...

Other arguments that can be passed to fwelnet.

Details

This function runs fwelnet nfolds+1 times: the first run obtains the lambda sequence, and the remaining nfolds runs compute the fit with each of the folds omitted. The error is accumulated, and the mean error and standard deviation over the folds are computed. Note that cv.fwelnet does NOT search for values of alpha. A specific value of alpha should be supplied.
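The fold loop described above can be sketched in base R. This is only an illustration of the procedure, not the package's code: the fitter ridge_coef is a hypothetical closed-form ridge regression standing in for fwelnet, the lambda sequence is fixed by hand instead of being taken from a full-data fit, and cvsd is assumed to be the standard deviation of the per-fold errors divided by sqrt(nfolds).

```r
set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(2, 5), rep(0, 15))
y <- as.numeric(x %*% beta + rnorm(n))

# Stand-in fitter (NOT fwelnet): closed-form ridge regression, no intercept.
ridge_coef <- function(x, y, lambda) {
  solve(crossprod(x) + lambda * diag(ncol(x)), crossprod(x, y))
}

lambda <- c(100, 10, 1, 0.1)   # one sequence shared by every fold
nfolds <- 5
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# One row per fold, one column per lambda value.
fold_err <- matrix(NA_real_, nfolds, length(lambda))
for (k in seq_len(nfolds)) {
  held_out <- foldid == k
  for (j in seq_along(lambda)) {
    # Fit with fold k omitted, then score on fold k only.
    b <- ridge_coef(x[!held_out, , drop = FALSE], y[!held_out], lambda[j])
    pred <- x[held_out, , drop = FALSE] %*% b
    fold_err[k, j] <- mean((y[held_out] - pred)^2)
  }
}

cvm  <- colMeans(fold_err)                      # mean CV error per lambda
cvsd <- apply(fold_err, 2, sd) / sqrt(nfolds)   # assumed standard-error estimate
```

Using a single lambda sequence for every fold is what makes the per-fold errors comparable column by column.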

Value

An object of class "cv.fwelnet", which is a list with the ingredients of the cross-validation fit.

glmfit

A fitted fwelnet object for the full data.

lambda

The values of lambda used in the fits.

nzero

The number of non-zero coefficients in the model glmfit.

fit.preval

If keep=TRUE, this is the array of prevalidated fits.

cvm

The mean cross-validated error: a vector of length length(lambda).

cvsd

Estimate of standard error of cvm.

cvlo

Lower curve = cvm - cvsd.

cvup

Upper curve = cvm + cvsd.

lambda.min

The value of lambda that gives minimum cvm.

lambda.1se

The largest value of lambda such that the CV error is within one standard error of the minimum.

foldid

If keep=TRUE, the fold assignments used.

name

Name of error measurement used for CV.

call

The call that produced this object.
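As a worked illustration of how cvlo, cvup, lambda.min and lambda.1se relate to cvm and cvsd, the sketch below applies the definitions above to made-up numbers. The values are invented for illustration, the variable names are hypothetical, and the rule assumes a loss that is minimized (it would not apply unchanged to type.measure="auc").

```r
# Toy values, not output from cv.fwelnet.
lambda <- c(1, 0.5, 0.25, 0.125, 0.0625)
cvm    <- c(5.0, 3.2, 2.5, 2.4, 2.9)
cvsd   <- c(0.4, 0.3, 0.2, 0.2, 0.3)

cvlo <- cvm - cvsd   # lower curve
cvup <- cvm + cvsd   # upper curve

# lambda.min: the lambda with the smallest mean CV error.
idx_min    <- which.min(cvm)
lambda_min <- lambda[idx_min]

# lambda.1se: the largest lambda whose error is within one
# standard error of the minimum, i.e. cvm <= (cvm + cvsd) at the minimum.
lambda_1se <- max(lambda[cvm <= cvup[idx_min]])
```

Here the minimum error is 2.4 at lambda = 0.125, the threshold is 2.4 + 0.2 = 2.6, and the largest lambda with error at or below that threshold is 0.25.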

Examples

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1)
y <- x %*% beta + rnorm(n)
z <- cbind(1, abs(beta) + rnorm(p))

cvfit1 <- cv.fwelnet(x, y, z)

# change no. of CV folds
cvfit2 <- cv.fwelnet(x, y, z, nfolds = 5)
# specify which observations are in each fold
foldid <- sample(rep(seq(5), length.out = length(y)))
cvfit3 <- cv.fwelnet(x, y, z, foldid = foldid)
# keep=TRUE to have pre-validated fits and foldid returned
cvfit4 <- cv.fwelnet(x, y, z, keep = TRUE)
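Because cv.fwelnet does not search over alpha (see Details), one possible way to compare a few alpha values is to call it once per value with the same foldid, so that every value is scored on identical folds. The following is a sketch under assumptions: it assumes alpha is accepted by fwelnet and forwarded through ..., the candidate grid alphas is arbitrary, and the helper names (alphas, cv_min, best_alpha) are hypothetical. The fitting step runs only if the fwelnet package is installed.

```r
set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
beta <- matrix(c(rep(2, 5), rep(0, 15)), ncol = 1)
y <- x %*% beta + rnorm(n)
z <- cbind(1, abs(beta) + rnorm(p))

alphas <- c(0.5, 0.75, 1)                           # arbitrary candidate grid
foldid <- sample(rep(seq(5), length.out = n))       # shared across all alphas

if (requireNamespace("fwelnet", quietly = TRUE)) {
  # Minimum mean CV error achieved along the lambda path for each alpha.
  cv_min <- sapply(alphas, function(a) {
    fit <- fwelnet::cv.fwelnet(x, y, z, alpha = a, foldid = foldid)
    min(fit$cvm)
  })
  best_alpha <- alphas[which.min(cv_min)]
}
```

Reusing foldid removes fold-to-fold randomness from the comparison, so differences in cv_min reflect the alpha values and not the split.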

kjytay/fwelnet documentation built on June 9, 2020, 1:39 p.m.