Description

CVmFold returns the m-fold cross-validation prediction error, computed under different divergences, for generalised linear models.
Usage

CVmFold(y, X, m, K, family, type, divergence, C0, W, increasing, trace, ...)
Arguments

y           a (n x 1) vector of the response variable.
X           a (n x p) matrix of predictors.
m           the number of folds.
K           the number of repetitions.
family      the family object for glm.
type        the type of prediction required for predict.glm.
divergence  the type of divergence.
C0          a cutoff value in (0,1).
W           a matrix of weights for classification errors (used if divergence = "classification").
increasing  a boolean characterising y: TRUE indicates that y takes integer values >= 1 with unit increments.
trace       if TRUE, progress information is printed during the computation.
...         additional arguments affecting the fitting method (see glm).
Details

This function computes the m-fold cross-validation (CV) error of a generalised linear model to assess the prediction error according to a specific divergence. It is called inside the InitialStep and GeneralStep functions, the two main functions of the Panning Algorithm.

In the case divergence = "classification", asymmetric classification errors can be obtained by setting the W matrix (rows: estimated y; columns: true y) (see the example below). For logistic regression (run with glm), the cutoff value C0 determines whether the prediction takes the value 0 (prediction <= C0) or 1 (prediction > C0). For multinomial regression, increasing = TRUE states that y >= 1 with unit increments (it makes CVmFold run faster).

Care should be taken as to how the estimated values of y are returned, and type should be chosen accordingly. See the example below on logistic regression.
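The interaction of the cutoff C0 and the weight matrix W described above can be sketched as follows. This is an illustration only: the variable names and the averaging step are assumptions for exposition, not the package's actual internals.

```r
# Illustrative sketch (not CVmFold's internals): how a cutoff C0 and an
# asymmetric weight matrix W (rows: estimated y; columns: true y) turn
# predicted probabilities into a weighted classification error.
C0 <- 0.5
W  <- matrix(c(0, 1.5, 0.5, 0), 2, 2)  # W[i+1, j+1] = cost of predicting i when the truth is j

prob  <- c(0.2, 0.7, 0.55, 0.4)        # hypothetical predicted probabilities
truth <- c(0L, 1L, 0L, 1L)             # true responses

pred <- as.integer(prob > C0)          # 0 if prediction <= C0, 1 if prediction > C0
# look up the cost of each (estimated, true) pair and average
err <- mean(W[cbind(pred + 1L, truth + 1L)])
err
```

Here predicting 1 when the truth is 0 costs 1.5, while predicting 0 when the truth is 1 costs only 0.5, which is the asymmetry set by W.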
Value

CVmFold returns a single numeric value: the estimated prediction error.
Author(s)

Samuel Orso <Samuel.Orso@unige.ch>
See Also

glm, family, predict.glm, InitialStep, GeneralStep
Examples

## Not run:
### Binary data
# load the data
library(MASS)
data("birthwt")
y <- birthwt$low
X <- as.matrix(birthwt)[,-1]
## logistic regression with glm()
# L1 error
set.seed(123)
CVmFold(y = y, X = X, family = binomial(link = 'logit'), divergence = "L1",
type = "response", trace = FALSE, control = list(maxit=100) )
# Squared error
set.seed(123)
CVmFold(y = y, X = X, family = binomial(link = 'logit'), divergence = "sq.error",
type = "response", trace = FALSE, control = list(maxit=100) )
# misclassification error
set.seed(123)
CVmFold(y = y, X = X, family = binomial(link = 'logit'), divergence = "classification",
type = "response", trace = FALSE, control = list(maxit=100) )
# asymmetric misclassification error
Weight <- matrix(c(0,1.5,0.5,0),2,2)
set.seed(123)
CVmFold(y = y, X = X, family = binomial(link = 'logit'), divergence = "classification",
W = Weight, type = "response", trace = FALSE, control = list(maxit=100) )
## logistic regression with multinom()
# L1 error
set.seed(123)
CVmFold(y = y, X = X, family = "multinomial", divergence = "L1", type = "probs" )
# Squared Error
set.seed(123)
CVmFold(y = y, X = X, family = "multinomial", divergence = "sq.error", type = "probs" )
# misclassification error
y <- y+1L
set.seed(123)
CVmFold(y = y, X = X, family = "multinomial", divergence = "classification",
type = "class", increasing = TRUE )
# asymmetric misclassification error
set.seed(123)
CVmFold(y = y, X = X, family = "multinomial", divergence = "classification",
type = "class", W = Weight, increasing = TRUE )
### Count data
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
set.seed(123)
CVmFold(y = counts, X = cbind(outcome, treatment), m = 3, K = 30, family = poisson(),
divergence = "L1" )
## End(Not run)
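The quantity CVmFold estimates can also be sketched by hand: split the data into m folds, fit glm on m-1 folds, predict on the held-out fold, and average the chosen divergence over folds. The sketch below does this for the L1 divergence on the birthwt data; it is an assumption-laden illustration of the idea, not the package's code (fold assignment and averaging may differ from CVmFold's implementation).

```r
# Hand-rolled sketch of m-fold CV with an L1 divergence, illustrating the
# quantity CVmFold estimates. Not the package's actual implementation.
library(MASS)
data("birthwt")
df <- data.frame(low = birthwt$low, as.matrix(birthwt)[, -1])

m <- 5
set.seed(123)
fold <- sample(rep_len(seq_len(m), nrow(df)))  # random fold assignment

fold_err <- sapply(seq_len(m), function(k) {
  train <- fold != k
  fit <- glm(low ~ ., data = df[train, ], family = binomial("logit"))
  p <- predict(fit, newdata = df[!train, ], type = "response")
  mean(abs(df$low[!train] - p))                # L1 divergence on the held-out fold
})
cv <- mean(fold_err)                           # average error over the m folds
cv
```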