cross.val: Calculates cross-validated estimates of prediction error


Description

For a logistic model, calculates cross-validated estimates of specificity, sensitivity and percentage correctly classified. For a Gaussian model, calculates a cross-validated estimate of prediction error.

Usage

cross.val(f, nfold = 10, nrep = 20, ...)

## S3 method for class 'lm'
cross.val(f, nfold = 10, nrep = 20, ...)

## S3 method for class 'glm'
cross.val(f, nfold = 10, nrep = 20, ...)

## S3 method for class 'formula'
cross.val(f, nfold = 10, nrep = 20, family = gaussian,
     data, weights, subset, na.action, start = NULL, etastart,
     mustart, offset, control = list(...), model = TRUE,
     method = "glm.fit", x = FALSE, y = TRUE, contrasts = NULL, ...)
          

Arguments

f

an lm object, a glm object or a model formula

nfold

number of parts into which the data set is divided

nrep

number of random splits

family

a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.)

data

A data frame, list or environment containing the variables in the model.

weights

an optional vector of ‘prior weights’ to be used in the fitting process. Should be NULL or a numeric vector.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit. Another possible value is NULL, no action. Value na.exclude can be useful.

start

starting values for the parameters in the linear predictor.

etastart

starting values for the linear predictor.

mustart

starting values for the vector of means.

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset.

control

a list of parameters for controlling the fitting process. For glm.fit this is passed to glm.control.

model

a logical value indicating whether the model frame should be included as a component of the returned value.

method

the method to be used in fitting the model. The default method "glm.fit" uses iteratively reweighted least squares (IWLS): the alternative "model.frame" returns the model frame and does no fitting.

x, y

For glm: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value.

For glm.fit: x is a design matrix of dimension n * p, and y is a vector of observations of length n.

contrasts

an optional list. See the contrasts.arg of model.matrix.default.

...

additional arguments to be passed to the low-level regression fitting functions; see the lm and glm help files.
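The roles of nfold and nrep can be sketched in base R. This is only an illustration under stated assumptions: make_folds is a hypothetical helper, not part of R330, and the package may assign observations to parts differently.

```r
# Hypothetical helper (not part of R330): randomly assign n observations
# to nfold parts of near-equal size.
make_folds <- function(n, nfold = 10) {
  # rep_len() cycles through 1..nfold so part sizes differ by at most one;
  # sample() then shuffles the labels.
  sample(rep_len(seq_len(nfold), n))
}

set.seed(1)
n    <- 25
nrep <- 3

# One column of part labels per random split.
folds <- replicate(nrep, make_folds(n, nfold = 5))

dim(folds)          # 25 rows (observations), 3 columns (random splits)
table(folds[, 1])   # each of the 5 parts holds 5 observations
```

Each column is an independent random split, so repeating the procedure nrep times averages over the randomness of the partition.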

Details

The object returned depends on the class of the input.

Value

For a logistic model:

Mean Specificity

mean proportion of observations with response 0 that are correctly classified as 0 (the true negative rate)

Mean Sensitivity

mean proportion of observations with response 1 that are correctly classified as 1 (the true positive rate)

Mean Correctly classified

proportion correctly classified
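For reference, the three quantities can be computed from observed and predicted 0/1 classes as in the sketch below. class_rates is a hypothetical helper using the standard definitions; it is not the R330 code, and the cut-off used to turn fitted probabilities into classes is not shown here.

```r
# Hypothetical helper (not part of R330): standard classification rates
# from observed and predicted 0/1 labels.
class_rates <- function(observed, predicted) {
  tp <- sum(observed == 1 & predicted == 1)  # true positives
  tn <- sum(observed == 0 & predicted == 0)  # true negatives
  fp <- sum(observed == 0 & predicted == 1)  # false positives
  fn <- sum(observed == 1 & predicted == 0)  # false negatives
  c(specificity = tn / (tn + fp),
    sensitivity = tp / (tp + fn),
    correct     = (tp + tn) / length(observed))
}

observed  <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 0, 0, 0, 0, 1, 1)

rates <- class_rates(observed, predicted)
rates
```

In this toy data there are 3 true positives, 1 false negative, 4 true negatives and 2 false positives, so sensitivity is 3/4, specificity is 4/6 and the proportion correctly classified is 7/10.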

For a Gaussian model, an estimate of the root mean squared error is returned.
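A repeated nfold-fold estimate of root mean squared prediction error can be sketched in base R as follows. cv_rmse is a hypothetical helper written for this illustration; how cross.val itself pools errors across folds and repetitions is an assumption here, not something this page states.

```r
# Hypothetical helper (not the R330 implementation): repeated k-fold
# cross-validated RMSE for a linear model.
cv_rmse <- function(formula, data, nfold = 10, nrep = 20) {
  n        <- nrow(data)
  response <- all.vars(formula)[1]   # assumes a plain, untransformed response
  rep_mse  <- numeric(nrep)

  for (r in seq_len(nrep)) {
    # New random split of the rows into nfold parts.
    fold   <- sample(rep_len(seq_len(nfold), n))
    sq_err <- numeric(n)

    for (k in seq_len(nfold)) {
      test <- fold == k
      # Fit on the other parts, predict the held-out part.
      fit  <- lm(formula, data = data[!test, , drop = FALSE])
      pred <- predict(fit, newdata = data[test, , drop = FALSE])
      sq_err[test] <- (data[[response]][test] - pred)^2
    }
    rep_mse[r] <- mean(sq_err)
  }

  # Assumption: average the mean squared error over repetitions,
  # then take the square root.
  sqrt(mean(rep_mse))
}

set.seed(330)
rmse <- cv_rmse(mpg ~ wt + hp, data = mtcars, nfold = 8, nrep = 5)
rmse
```

The example uses the built-in mtcars data so that it runs without the package's own data sets.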

Note

This function is a generic that dispatches to a method based on the class of its first argument, e.g. cross.val.glm or cross.val.formula.
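The dispatch mechanism can be illustrated with a toy S3 generic. describe_input and its methods are hypothetical names invented for this sketch; they only mimic the lm, glm and formula methods listed under Usage.

```r
# Hypothetical toy generic (not part of R330), used only to show S3 dispatch.
describe_input <- function(f, ...) UseMethod("describe_input")

describe_input.lm      <- function(f, ...) "lm method"
describe_input.glm     <- function(f, ...) "glm method"
describe_input.formula <- function(f, ...) "formula method"

fit_lm  <- lm(mpg ~ wt, data = mtcars)
fit_glm <- glm(am ~ wt, family = binomial, data = mtcars)

describe_input(fit_lm)     # class "lm": the lm method is used
describe_input(fit_glm)    # class c("glm", "lm"): the glm method is tried first
describe_input(mpg ~ wt)   # class "formula": the formula method is used
```

Because a glm object also inherits from lm, a generic with both methods picks the glm one, which is why a logistic fit and a Gaussian fit can return different kinds of result.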

Author(s)

Alan Lee, Blair Robertson

References

Bradley Efron and Robert Tibshirani (1993). An Introduction to the Bootstrap. Chapman and Hall, London.

Examples

data(fatty.df)
fatty.lm <- lm(ffa~age+weight+skinfold, data=fatty.df)
cross.val(fatty.lm)
#
data(drug.df)
cross.val(DFREE ~ NDRUGTX + factor(IVHX) + AGE + TREAT, family=binomial,
     data=drug.df)

R330 documentation built on May 2, 2019, 2:12 p.m.