| cv.savvyPR | R Documentation |
Performs k-fold cross-validation for Parity Regression (PR) models to select optimal tuning parameters. The underlying PR methodology distributes the total prediction error evenly across all parameters, ensuring stability in the presence of high multicollinearity and substantial noise (such as time series data with structural changes and evolving trends). This function supports both Budget-based and Target-based parameterizations and evaluates models across a variety of loss metrics.
cv.savvyPR(
x,
y,
method = c("budget", "target"),
vals = NULL,
nval = 100,
lambda_vals = NULL,
nlambda = 100,
folds = 10,
model_type = c("PR3", "PR1", "PR2"),
measure_type = c("mse", "mae", "rmse", "mape"),
foldid = FALSE,
use_feature_selection = FALSE,
standardize = FALSE,
intercept = TRUE,
exclude = NULL
)
x |
A matrix of predictors with rows as observations and columns as variables. Must not contain missing values. |
y |
A numeric vector of the response variable; should have the same number of observations as the rows of x. |
method |
Character string specifying the parameterization method to use: "budget" or "target". Defaults to "budget". |
vals |
Optional; a numeric vector of values for tuning the PR model (acts as the budget or target parameter, depending on method). If NULL, a sequence of nval values is generated. |
nval |
Numeric value specifying the number of tuning values to try in the optimization process if vals is NULL. Defaults to 100. |
lambda_vals |
Optional; a numeric vector of \lambda (ridge) values to use in the cross-validation. If NULL, a sequence of nlambda values is generated. |
nlambda |
Numeric value specifying the number of \lambda values to try if lambda_vals is NULL. Defaults to 100. |
folds |
The number of folds to be used in the cross-validation; default is 10. |
model_type |
Character string specifying the type of model to fit. Defaults to "PR3"; alternatives are "PR1" and "PR2". See Details for how each handles the tuning parameters. |
measure_type |
Character vector specifying the measure to use for model evaluation. Defaults to "mse"; alternatives are "mae", "rmse", and "mape". |
foldid |
Logical indicating whether to return fold assignments. Defaults to FALSE. |
use_feature_selection |
Logical indicating whether to perform feature selection during the model fitting process. Defaults to FALSE. |
standardize |
Logical indicating whether to standardize predictor variables. Defaults to FALSE. |
intercept |
Logical indicating whether to include an intercept in the model. Defaults to TRUE. |
exclude |
Optional; indices of variables to be excluded from the model fitting process. Defaults to NULL (no exclusions). |
Cross-Validation for Parity Regression Model Estimation
This function facilitates cross-validation for parity regression models across a range
of tuning values (val) and regularization values (\lambda), depending
on the model type specified. Each model type handles the parameters differently:
PR1: Performs cross-validation only over the val sequence while
fixing \lambda = 0. This model type is primarily used when the focus is on
understanding how different levels of the risk parity constraint impact
model performance purely through the parity mechanism, without the influence
of ridge \lambda shrinkage.
PR2: Uses a fixed \lambda value determined by performing a ridge
regression (\lambda optimization) with cv.glmnet
on the dataset, then performs cross-validation over the val sequence
using this optimized \lambda. This approach is useful when
one wishes to maintain a stable amount of standard shrinkage while exploring
the impact of varying levels of the proportional contribution constraint.
PR3: First determines an optimal val using the same method as PR1.
Then, keeping this val fixed, it conducts cross-validation over all
candidate \lambda values. This dual-stage optimization can be particularly
effective when the initial parity regularization needs further refinement
via \lambda adjustment.
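As a rough illustration of the ridge step described above (a sketch only, not the package's internal implementation), the fixed \lambda used by PR2 can be obtained with cv.glmnet before the parity tuning value is cross-validated:

```r
# Sketch of the ridge-lambda step described above; savvyPR's actual
# internals may differ. In glmnet, alpha = 0 selects ridge regression.
library(glmnet)

set.seed(1)
x <- matrix(rnorm(50 * 5), 50, 5)
y <- rnorm(50)

ridge_cv <- cv.glmnet(x, y, alpha = 0)
fixed_lambda <- ridge_cv$lambda.min  # held fixed while val is cross-validated
```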
The function supports several types of loss metrics for assessing model performance:
Mean Squared Error: Measures the average of the squares of the errors, that is, the average squared difference between the estimated values and the actual values.
Mean Absolute Error: Measures the average magnitude of the errors in a set of predictions, without considering their direction. It is the average, over the test sample, of the absolute differences between prediction and actual observation, where all individual differences have equal weight.
Root Mean Squared Error: It is the square root of the mean of the squared
errors. RMSE is a good measure of how accurately the model predicts the response,
and it is the most important criterion for fit if the main purpose of the model is prediction.
Mean Absolute Percentage Error: Measures the size of the error in percentage terms, calculated as the average of the unsigned percentage errors. Because it is based on relative errors, it is highly sensitive to deviations on observations with small true values.
The choice of measure impacts how the model's performance is assessed during cross-validation. Users should select the measure that best reflects the requirements of their specific analytical context.
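Using their standard definitions (the package's own loss computation lives in calcLoss and may differ in detail), the four measures can be reproduced by hand:

```r
# Standard definitions of the four loss measures, computed on toy values
actual <- c(2, 4, 6, 8)
pred   <- c(2.5, 3.5, 6.5, 7.0)
err    <- actual - pred

mse  <- mean(err^2)                    # Mean Squared Error: 0.4375
mae  <- mean(abs(err))                 # Mean Absolute Error: 0.625
rmse <- sqrt(mse)                      # Root Mean Squared Error: ~0.661
mape <- 100 * mean(abs(err / actual))  # Mean Absolute Percentage Error: ~14.58%
```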
A list of class "cv.savvyPR" containing the following components based on the specified model_type:
call |
The matched call used to invoke the function. |
coefficients |
The optimal coefficients results of the final fitted model. |
mean_error_cv |
A vector of computed error values across all tested parameters. |
model_type |
The type of PR model used: "PR1", "PR2", or "PR3". |
measure_type |
The loss measure used for evaluation, with a descriptive name. |
method |
The parameterization method used: "budget" or "target". |
PR_fit |
The final fitted model object from the underlying savvyPR fit. |
coefficients_cv |
A matrix of average coefficients across all cross-validation folds for each tuning parameter. |
vals |
The tuning values (acting as c or t) used in the cross-validation process. |
lambda_vals |
The \lambda values used in the cross-validation process. |
optimal_val |
The optimal tuning value found from cross-validation, applicable to PR1 and PR2. |
fixed_val |
The fixed tuning value used in PR3. |
optimal_lambda_val |
The optimal \lambda value found from cross-validation, applicable to PR3. |
fixed_lambda_val |
The fixed \lambda value used in PR2. |
optimal_index |
A list detailing the indices of the optimal parameters within the cross-validation matrix. |
fold_assignments |
(Optional) The fold assignments used during the cross-validation, provided if foldid = TRUE. |
Ziwei Chen, Vali Asimit and Pietro Millossovich
Maintainer: Ziwei Chen <ziwei.chen.3@citystgeorges.ac.uk>
Asimit, V., Chen, Z., Ichim, B., & Millossovich, P. (2026). Parity Regression Estimation. Retrieved from https://openaccess.city.ac.uk/id/eprint/37017/
The optimization technique employed follows the algorithm described by: F. Spinu (2013). An Algorithm for Computing Risk Parity Weights. SSRN Preprint. doi:10.2139/ssrn.2297383
savvyPR, glmnet, cv.glmnet,
calcLoss, getMeasureName, optimizeRiskParityBudget, optimizeRiskParityTarget
# Generate synthetic data
set.seed(123)
n <- 100 # Number of observations
p <- 12 # Number of variables
x <- matrix(rnorm(n * p), n, p)
beta <- matrix(rnorm(p), p, 1)
y <- x %*% beta + rnorm(n, sd = 0.5)
# Example 1: PR1 with "budget" method (focusing on c values with MSE)
result_pr1_budget <- cv.savvyPR(x, y, method = "budget", model_type = "PR1")
print(result_pr1_budget)
# Example 2: PR1 with "target" method
result_pr1_target <- cv.savvyPR(x, y, method = "target", model_type = "PR1")
print(result_pr1_target)
# Example 3: PR3 (default model_type) exploring budget parameter
result_pr3 <- cv.savvyPR(x, y, method = "budget", folds = 5)
print(result_pr3)
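The further examples below exercise additional documented arguments (lambda_vals, measure_type, foldid) on the same synthetic data defined above; the returned components follow the Value section.

```r
# Example 4: PR3 with a custom lambda grid and RMSE as the loss measure
result_pr3_rmse <- cv.savvyPR(x, y, method = "target", model_type = "PR3",
                              lambda_vals = 10^seq(-3, 1, length.out = 50),
                              measure_type = "rmse")
print(result_pr3_rmse)

# Example 5: retain the fold assignments used during cross-validation
result_folds <- cv.savvyPR(x, y, model_type = "PR1", foldid = TRUE)
result_folds$fold_assignments
```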