View source: R/pense_regression.R
pense_cv | R Documentation |
Perform (repeated) K-fold cross-validation for pense().
adapense_cv() is a convenience wrapper to compute adaptive PENSE estimates.
pense_cv(
  x,
  y,
  standardize = TRUE,
  lambda,
  cv_k,
  cv_repl = 1,
  cv_metric = c("tau_size", "mape", "rmspe", "auroc"),
  fit_all = TRUE,
  fold_starts = c("full", "enpy", "both"),
  cl = NULL,
  ...
)
adapense_cv(x, y, alpha, alpha_preliminary = 0, exponent = 1, ...)
x |
|
y |
vector of response values of length |
standardize |
whether to standardize the |
lambda |
optional user-supplied sequence of penalization levels. If given and not |
cv_k |
number of folds per cross-validation. |
cv_repl |
number of cross-validation replications. |
cv_metric |
either a string specifying the performance metric to use, or a function to evaluate prediction errors in a single CV replication. If a function, the number of arguments defines the data the function receives. If the function takes a single argument, it is called with a single numeric vector of prediction errors. If the function takes two or more arguments, it is called with the predicted values as the first argument and the true values as the second argument. The function must always return a single numeric value quantifying the prediction performance. The order of the given values corresponds to the order in the input data. |
fit_all |
If |
fold_starts |
how to determine starting values in the cross-validation folds. If |
cl |
a parallel cluster. Can only be used in combination with |
... |
Arguments passed on to
|
alpha |
elastic net penalty mixing parameter with |
alpha_preliminary |
|
exponent |
the exponent for computing the penalty loadings based on the preliminary estimate. |
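As a sketch of the two calling conventions described for cv_metric, the following base R code defines one custom metric of each kind. The helper call_metric() and both metric functions are hypothetical names used only for illustration; the helper merely mimics the dispatch on the number of arguments described above, and it assumes that prediction errors are the true values minus the predictions. It is not taken from the package's implementation.

```r
# One-argument metric: receives only the vector of prediction errors.
mean_abs_error <- function(errors) {
  mean(abs(errors))
}

# Two-argument metric: predicted values first, true values second.
max_abs_deviation <- function(predicted, truth) {
  max(abs(truth - predicted))
}

# Hypothetical helper mimicking the dispatch on the number of arguments.
# Assumption: prediction errors are defined as truth minus prediction.
call_metric <- function(metric, predicted, truth) {
  if (length(formals(metric)) == 1L) {
    metric(truth - predicted)
  } else {
    metric(predicted, truth)
  }
}

truth <- c(1.0, 2.0, 3.0, 4.0)
predicted <- c(1.5, 2.0, 2.0, 4.5)

call_metric(mean_abs_error, predicted, truth)
call_metric(max_abs_deviation, predicted, truth)
```

Either function returns a single numeric value, so it could be supplied as, for example, cv_metric = mean_abs_error in a call to pense_cv().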
The built-in CV metrics are
"tau_size"
τ-size of the prediction error, computed by tau_size() (default).
"mape"
Median absolute prediction error.
"rmspe"
Root mean squared prediction error.
"auroc"
Area under the receiver operating characteristic curve (actually 1 - AUROC). Only sensible for binary responses.
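For intuition, the "mape" and "rmspe" metrics can be written in a few lines of base R. This is a sketch based only on the descriptions above; the function names mape() and rmspe() are assumptions for illustration, and the package may differ in detail (for example in how missing values are handled).

```r
# Median absolute prediction error.
mape <- function(errors) median(abs(errors))

# Root mean squared prediction error.
rmspe <- function(errors) sqrt(mean(errors^2))

# Made-up prediction errors with one gross outlier.
errors <- c(1, -2, 3, -4, 100)

mape(errors)   # reflects the typical error size
rmspe(errors)  # dominated by the single large error
```

On contaminated errors like these, the median-based metric stays close to the typical error while the root mean squared error is inflated by the outlier, which is one reason a robust metric can be preferable when validating robust estimates.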
adapense_cv() is a convenience wrapper which performs 3 steps:
1. compute preliminary estimates via pense_cv(..., alpha = alpha_preliminary),
2. compute the penalty loadings from the estimate beta with the best prediction performance by adapense_loadings = 1 / abs(beta)^exponent, and
3. compute the adaptive PENSE estimates via pense_cv(..., penalty_loadings = adapense_loadings).
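The penalty loadings in the second step can be sketched in base R as follows. The coefficient vector beta is made up for illustration and does not come from an actual fit.

```r
# Hypothetical slope coefficients from a preliminary estimate (made-up values).
beta <- c(2, -0.5, 0, 0.1)

# Penalty loadings as in the second step, for two choices of the exponent.
loadings_exp1 <- 1 / abs(beta)^1
loadings_exp2 <- 1 / abs(beta)^2

loadings_exp1
loadings_exp2
```

Under this formula, larger preliminary coefficients receive smaller loadings, a larger exponent widens the gap between small and large coefficients, and a preliminary coefficient of exactly zero yields an infinite loading.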
a list-like object with the same components as returned by pense(), plus the following:
cvres
data frame of average cross-validated performance.
a list-like object as returned by pense_cv(), plus the following:
preliminary
the CV results for the preliminary estimate.
exponent
exponent used to compute the penalty loadings.
penalty_loadings
penalty loadings used for the adaptive PENSE estimate.
pense() for computing regularized S-estimates without cross-validation.
coef.pense_cvfit() for extracting coefficient estimates.
plot.pense_cvfit() for plotting the CV performance or the regularization path.
Other functions to compute robust estimates with CV: pensem_cv(), regmest_cv()
# Compute the adaptive PENSE regularization path for Freeny's
# revenue data (see ?freeny)
data(freeny)
x <- as.matrix(freeny[ , 2:5])
## Either use the convenience function directly ...
set.seed(123)
ada_convenience <- adapense_cv(x, freeny$y, alpha = 0.5,
                               cv_repl = 2, cv_k = 4)
## ... or compute the steps manually:
# Step 1: Compute preliminary estimates with CV
set.seed(123)
preliminary_estimate <- pense_cv(x, freeny$y, alpha = 0,
                                 cv_repl = 2, cv_k = 4)
plot(preliminary_estimate, se_mult = 1)
# Step 2: Use the coefficients with best prediction performance
# to define the penalty loadings:
prelim_coefs <- coef(preliminary_estimate, lambda = 'min')
pen_loadings <- 1 / abs(prelim_coefs[-1])
# Step 3: Compute the adaptive PENSE estimates and estimate
# their prediction performance.
set.seed(123)
ada_manual <- pense_cv(x, freeny$y, alpha = 0.5,
                       cv_repl = 2, cv_k = 4,
                       penalty_loadings = pen_loadings)
# Visualize the prediction performance and coefficient path of
# the adaptive PENSE estimates (manual vs. automatic)
def.par <- par(no.readonly = TRUE)
layout(matrix(1:4, ncol = 2, byrow = TRUE))
plot(ada_convenience$preliminary)
plot(preliminary_estimate)
plot(ada_convenience)
plot(ada_manual)
par(def.par)