Performs a grid search to estimate the optimal hyperparameters (gamma and cost) within a specified search space, based on either double asymptotic risk estimation or cross-validation.
Double asymptotic risk estimation is cheaper to compute because it uses a closed-form expression for the risk estimate.
For further details, refer to the article in the reference section.
R = e_0 * C_10 + e_1 * C_01
e_i = CDF((-1)^(i+1) (Ghat_i + omega_opt/gamma) / sqrt(Dhat))
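As a rough numerical illustration of the two formulas above, the base R sketch below evaluates them for made-up inputs. The helper name `risk_from_estimates` is hypothetical, and the values passed for `G_hat`, `D_hat`, `omega_opt`, `gamma` and the cost pair are placeholders; in the package these quantities are estimated from the training data. `CDF` is assumed here to be the standard normal CDF (`pnorm`).

```r
# Hypothetical helper: evaluates the two formulas above for given inputs.
# G_hat is a length-2 vector holding (Ghat_0, Ghat_1).
risk_from_estimates <- function(G_hat, D_hat, omega_opt, gamma, C_10, C_01) {
  i <- 0:1
  # e_i = CDF((-1)^(i+1) * (Ghat_i + omega_opt / gamma) / sqrt(Dhat))
  # Assumption: CDF is the standard normal CDF.
  e <- pnorm((-1)^(i + 1) * (G_hat + omega_opt / gamma) / sqrt(D_hat))
  # R = e_0 * C_10 + e_1 * C_01
  risk <- e[1] * C_10 + e[2] * C_01
  list(e_0 = e[1], e_1 = e[2], risk = risk)
}

# Placeholder inputs, chosen only for demonstration.
est <- risk_from_estimates(G_hat = c(1.2, -0.9), D_hat = 1.5,
                           omega_opt = 0.05, gamma = 1,
                           C_10 = 0.7, C_01 = 0.3)
print(est)
```

Because each evaluation is a handful of arithmetic operations, scanning a whole grid of gamma and cost values this way needs no refitting per fold, which is consistent with the efficiency remark above.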
Separate-sampling cross-validation (see the cross_validation() function) was adapted to work with cost-based risk estimation.
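The idea of estimating class-specific error rates on folds drawn within each class, and then combining them with costs, can be sketched in base R as follows. This is only an illustration under stated assumptions: the nearest-centroid rule is a stand-in classifier (not the package's method), the cost pair 0.7 / 0.3 is hypothetical, the mapping of the two classes to e_0 and e_1 is arbitrary, and the package's own cross_validation() may differ in its details.

```r
# Two-class subset of iris (the same two species used in the examples).
keep <- iris$Species %in% c("versicolor", "virginica")
x <- as.matrix(iris[keep, 1:4])
y <- droplevels(iris$Species[keep])

set.seed(1)
nfolds <- 5
# Assign fold ids within each class separately, so every fold
# contains observations from both classes.
fold <- integer(length(y))
for (cl in levels(y)) {
  idx <- which(y == cl)
  fold[idx] <- sample(rep_len(seq_len(nfolds), length(idx)))
}

# Per-fold, per-class error rates using a stand-in nearest-centroid rule.
errors <- matrix(0, nrow = nfolds, ncol = 2, dimnames = list(NULL, levels(y)))
for (k in seq_len(nfolds)) {
  train <- fold != k
  c1 <- colMeans(x[train & y == levels(y)[1], , drop = FALSE])
  c2 <- colMeans(x[train & y == levels(y)[2], , drop = FALSE])
  test_x <- x[!train, , drop = FALSE]
  test_y <- y[!train]
  d1 <- rowSums(sweep(test_x, 2, c1)^2)  # squared distance to centroid 1
  d2 <- rowSums(sweep(test_x, 2, c2)^2)  # squared distance to centroid 2
  pred <- levels(y)[ifelse(d1 <= d2, 1, 2)]
  for (cl in levels(y)) {
    errors[k, cl] <- mean(pred[test_y == cl] != cl)
  }
}

e <- colMeans(errors)            # class-specific error estimates
risk <- e[1] * 0.7 + e[2] * 0.3  # combined with a hypothetical cost pair
```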
grid_search(
  x,
  y,
  range_gamma,
  range_cost,
  method = "estimator",
  nfolds = 10,
  bias_correction = TRUE
)
x |
Input matrix or data.frame of dimension |
y |
A numeric vector or factor of class labels. It should be either a factor with two levels or a vector with two distinct values.
If |
range_gamma |
Vector of |
range_cost |
nobs x 1 vector (values should be between 0 and 1) or nobs x 2 matrix (each row is cost pair value c(C_10, C_01)) of cost values to check. |
method |
Selects the method used to evaluate risk: "estimator" or "cross". |
nfolds |
Number of folds to use with cross-validation. Default is 10.
In case of imbalanced data, |
bias_correction |
Takes in a boolean value.
If |
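The two accepted shapes of range_cost described in the argument list can be built as shown in this small base R sketch. The specific numbers are arbitrary illustrations, and the shape checks are a suggestion rather than something the package is known to require.

```r
# Form 1: numeric vector of cost values, each between 0 and 1.
cost_vector <- seq(0.1, 0.9, by = 0.2)

# Form 2: two-column matrix; each row is one cost pair c(C_10, C_01).
cost_matrix <- rbind(c(1, 1),
                     c(1, 5),
                     c(5, 1))

# Basic shape checks before passing either object as range_cost.
stopifnot(all(cost_vector > 0 & cost_vector < 1))
stopifnot(is.matrix(cost_matrix), ncol(cost_matrix) == 2)
```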
List of estimated parameters.
cost |
Cost value for which risk estimates are lowest during the search. |
gamma |
Gamma regularization parameter for which risk estimates are lowest during the search. |
risk |
Lowest risk value estimated during grid search. |
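To make the shape of the returned list concrete, here is a minimal base R sketch of an exhaustive search over a gamma and cost grid. The function `toy_risk` is a hypothetical stand-in for the package's risk estimators, so the selected values are meaningless beyond illustrating the mechanics of picking the grid point with the lowest risk.

```r
# Stand-in risk surface (hypothetical); the package uses its own estimators.
toy_risk <- function(gamma, cost) (log10(gamma) - 1)^2 + (cost - 0.3)^2

# Every combination of candidate gamma and cost values.
grid <- expand.grid(gamma = c(0.1, 1, 10, 100, 1000),
                    cost  = seq(0.1, 0.9, by = 0.2))
grid$risk <- mapply(toy_risk, grid$gamma, grid$cost)

# Keep the combination with the lowest risk, in the documented list layout.
best <- grid[which.min(grid$risk), ]
result <- list(cost = best$cost, gamma = best$gamma, risk = best$risk)
str(result)
```

The elements of such a list are accessed with `$`, in the same way the examples use `gs$gamma` and `gs$cost`.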
A. Zollanvari, M. Abdirash, A. Dadlani and B. Abibullaev, "Asymptotically Bias-Corrected Regularized Linear Discriminant Analysis for Cost-Sensitive Binary Classification," in IEEE Signal Processing Letters, vol. 26, no. 9, pp. 1300-1304, Sept. 2019. doi: 10.1109/LSP.2019.2918485 URL: https://ieeexplore.ieee.org/document/8720003
U. Braga-Neto, A. Zollanvari and E. Dougherty, "Cross-Validation Under Separate Sampling: Strong Bias and How to Correct It," in Bioinformatics, vol. 30, 2014. doi: 10.1093/bioinformatics/btu527 URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4296143/pdf/btu527.pdf
Other functions in the package: abcrlda(), cross_validation(), da_risk_estimator(), predict.abcrlda(), risk_calculate()
data(iris)
# Keep only two of the three species to obtain a binary problem.
train_data <- iris[which(iris[, ncol(iris)] == "virginica" |
                         iris[, ncol(iris)] == "versicolor"), 1:4]
train_label <- factor(iris[which(iris[, ncol(iris)] == "virginica" |
                                 iris[, ncol(iris)] == "versicolor"), 5])

# Search with the closed-form risk estimator; costs given as a vector.
cost_range <- seq(0.1, 0.9, by = 0.2)
gamma_range <- c(0.1, 1, 10, 100, 1000)
gs <- grid_search(train_data, train_label,
                  range_gamma = gamma_range,
                  range_cost = cost_range,
                  method = "estimator")
model <- abcrlda(train_data, train_label,
                 gamma = gs$gamma, cost = gs$cost)
predict(model, train_data)

# Search with cross-validation; costs given as a matrix of (C_10, C_01) pairs.
cost_range <- matrix(1:10, ncol = 2)
gamma_range <- c(0.1, 1, 10, 100, 1000)
gs <- grid_search(train_data, train_label,
                  range_gamma = gamma_range,
                  range_cost = cost_range,
                  method = "cross")
model <- abcrlda(train_data, train_label,
                 gamma = gs$gamma, cost = gs$cost)
predict(model, train_data)