View source: R/ridgeGLMandCo.R
optPenaltyGLM.kCVauto | R Documentation
This function finds the optimal penalty parameters of the targeted ridge regression estimator of the generalized linear model parameter. The optimum is defined as the minimizer of the cross-validated loss associated with the estimator.
optPenaltyGLM.kCVauto(Y, X, U=matrix(ncol=0, nrow=length(Y)), lambdaInit,
lambdaGinit=0, Dg=matrix(0, ncol=ncol(X), nrow=ncol(X)),
model="linear", target=rep(0, ncol(X)),
folds=makeFoldsGLMcv(min(10, length(Y)), Y, model=model),
loss="loglik", lambdaMin=10^(-5),
lambdaGmin=10^(-5), minSuccDiff=10^(-5), maxIter=100,
implementation="org")
Y | A numeric, the response vector.
X | The design matrix. The number of rows should match the number of elements of Y.
U | The design matrix of the unpenalized covariates. The number of rows should match the number of elements of Y.
lambdaInit | A numeric, the initial (starting) value of the regular ridge penalty parameter.
lambdaGinit | A numeric, the initial (starting) value of the generalized ridge penalty parameter.
Dg | A non-negative definite matrix, the unscaled generalized ridge penalty matrix.
model | A character indicating which generalized linear model is to be fitted: "linear" or "logistic".
target | A numeric, the value towards which the estimate is shrunken.
folds | A list; each item is an integer vector indexing the samples that comprise one fold (see the sketch after this table).
loss | A character indicating the cross-validation loss criterion: "loglik" or, when model="linear", "sos".
lambdaMin | A positive numeric, the minimum value of the regular ridge penalty parameter.
lambdaGmin | A positive numeric, the minimum value of the generalized ridge penalty parameter.
minSuccDiff | A numeric, the convergence tolerance of the IRLS algorithm: iteration stops when the difference between successive loglikelihoods falls below this value.
maxIter | A numeric, the maximum number of IRLS iterations.
implementation | A character specifying which implementation is to be used (default "org").
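For illustration, a folds list of the form expected by the folds argument can be constructed by hand. The following is a minimal sketch; the makeFoldsGLMcv function referenced in the usage above automates this, and any stratification it performs is not reproduced here.

# split n=50 sample indices into 5 folds of (near-)equal size
n <- 50
folds <- split(sample(1:n), rep(1:5, length.out=n))
# each list element is an integer vector indexing one left-out fold
str(folds)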
The function returns an all-positive numeric, the cross-validated optimal penalty parameters. The average loglikelihood over the left-out samples is used as the cross-validation criterion. If model="linear", the average sum-of-squares over the left-out samples is also offered as a cross-validation criterion.
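For instance, the alternative criterion could be invoked as follows (a minimal sketch on hypothetical linear-model data; the variable names Yl and Xl are illustrative):

# hypothetical linear-model data
Xl <- matrix(rnorm(50*10), nrow=50)
Yl <- as.numeric(Xl %*% runif(10) + rnorm(50))
# tune the penalty with the sum-of-squares criterion (model="linear" only)
optL <- optPenaltyGLM.kCVauto(Yl, Xl, lambdaInit=1, model="linear", loss="sos")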
The joint selection of the penalty parameters \lambda
and \lambda_g
through the optimization of the cross-validated loss may lead to a locally optimal choice, because the penalties are to some extent communicating vessels. Both shrink towards the same target, only in slightly different ways (depending on the specifics of the generalized penalty matrix \Delta
, supplied via the Dg argument). As such, the shrinkage achieved by one penalty may be partially compensated for by the other. This may hamper the algorithm in its search for the global optimizers.
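By way of illustration, a joint search could be set up as below. This is a sketch, reusing Y, X, and betas as generated in the Examples section further down, with a hypothetical first-order difference construction for the non-negative definite matrix Dg; restarting from several initial values is a cheap check against a merely local optimum.

# hypothetical generalized penalty: Dg = t(D) %*% D, with D the first-order
# difference matrix, so that Dg is non-negative definite by construction
D  <- diff(diag(ncol(X)))
Dg <- crossprod(D)
# joint search over both penalty parameters; rerun with, e.g., lambdaInit=10
# and lambdaGinit=10 to check whether the same optimum is recovered
optLs <- optPenaltyGLM.kCVauto(Y, X, lambdaInit=1, lambdaGinit=1, Dg=Dg,
                               target=betas/2, model="logistic")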
Moreover, the penalized IRLS (Iterative Reweighted Least Squares) algorithm for the evaluation of the generalized ridge logistic regression estimator, implemented in the ridgeGLM
-function, may fail to converge for small penalty parameter values in combination with a nonzero shrinkage target. This phenomenon propagates to the optPenaltyGLM.kCVauto
-function.
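A pragmatic guard, not a documented remedy, is to keep the cross-validated search away from the problematic small-penalty region via the lambdaMin argument (again a sketch, reusing the Examples' data):

# with a nonzero target, bound the search away from very small penalties
optLambda <- optPenaltyGLM.kCVauto(Y, X, lambdaInit=1,
                                   folds=makeFoldsGLMcv(5, Y, model="logistic"),
                                   target=betas/2, model="logistic",
                                   lambdaMin=10^(-2), maxIter=250)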
W.N. van Wieringen.
van Wieringen, W.N. and Binder, H. (2022), "Sequential learning of regression models by penalized estimation", accepted.
Lettink, A., Chinapaw, M.J.M., van Wieringen, W.N. et al. (2022), "Two-dimensional fused targeted ridge regression for health indicator prediction from accelerometer data", submitted.
# set the sample size
n <- 50
# set the true parameter
betas <- (c(0:100) - 50) / 20
# generate covariate data
X <- matrix(rnorm(length(betas)*n), nrow=n)
# sample the response
lp    <- as.numeric(X %*% betas)
probs <- exp(lp) / (1 + exp(lp))
Y     <- rbinom(n, 1, probs)
# tune the penalty parameter
optLambda <- optPenaltyGLM.kCVauto(Y, X, lambdaInit=1,
                                   folds=makeFoldsGLMcv(5, Y, model="logistic"),
                                   target=betas/2, model="logistic",
                                   minSuccDiff=10^(-3))
# estimate the logistic regression parameter
bHat <- ridgeGLM(Y, X, lambda=optLambda, target=betas/2, model="logistic")
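Along the same lines, unpenalized covariates may be accommodated via the U argument (a sketch, assuming an unpenalized intercept is wanted alongside the penalized covariates in X):

# supply an unpenalized intercept column through U
U <- matrix(1, nrow=n, ncol=1)
optLambdaU <- optPenaltyGLM.kCVauto(Y, X, U=U, lambdaInit=1,
                                    folds=makeFoldsGLMcv(5, Y, model="logistic"),
                                    target=betas/2, model="logistic")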