# cross_validation: Cross Validation for separate sampling adjusted for cost. In abcrlda: Asymptotically Bias-Corrected Regularized Linear Discriminant Analysis

## Description

This function implements cross-validation for separate sampling, adjusted for misclassification cost.

## Usage

```r
cross_validation(
  x,
  y,
  gamma = 1,
  cost = c(0.5, 0.5),
  nfolds = 10,
  bias_correction = TRUE
)
```

## Arguments

- `x`: Input matrix or data.frame of dimension `nobs x nvars`; each row is a feature vector.
- `y`: A numeric vector or factor of class labels. It should be either a factor with two levels or a vector with two distinct values. If `y` is given as a vector, it is coerced into a factor. The length of `y` must equal the number of samples in `x`.
- `gamma`: Regularization parameter $\gamma$ in the ABC-RLDA discriminant function, given by

  $$W_{ABCRLDA} = \gamma \left(x - \frac{x_0 + x_1}{2}\right)^T H (x_0 - x_1) + \log\left(\frac{C_{01}}{C_{10}}\right) + \omega_{opt}$$

  $$H = (I_p + \gamma \hat{\Sigma})^{-1}$$

  Formulas and derivations for the parameters used in the equations above can be found in the article listed in the Reference section.
- `cost`: Parameter that controls the overall misclassification costs. This is a vector of length 1 or 2. The first value is C_10, the cost of assigning label 1 when the true label is 0. The second value, if provided, is C_01, the cost of assigning label 0 when the true label is 1. The default is `c(0.5, 0.5)`, so both classes have equal misclassification costs. If a single value is provided, it should be normalized to lie strictly between 0 and 1; this value is assigned to C_10, and C_01 is set to 1 - C_10.
- `nfolds`: Number of folds to use in cross-validation. Default is 10. For imbalanced data, `nfolds` should not be greater than the number of observations in the smaller class.
- `bias_correction`: A boolean value. If `TRUE` (the default), asymptotic bias correction is performed. If `FALSE`, no asymptotic bias correction is performed and ABC-RLDA reduces to classical RLDA.
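The rule for expanding `cost` can be sketched in a few lines of base R. This is only an illustration of the behaviour described above, not the package's internal code; `normalize_cost` is a hypothetical helper name that does not exist in abcrlda.

```r
# Hypothetical helper (not part of abcrlda): expands `cost` into named
# C_10 / C_01 values following the rule described in the argument list.
normalize_cost <- function(cost = c(0.5, 0.5)) {
  if (length(cost) == 1) {
    # A single value must lie strictly between 0 and 1;
    # it becomes C_10, and C_01 is its complement.
    stopifnot(cost > 0, cost < 1)
    cost <- c(cost, 1 - cost)
  }
  stopifnot(length(cost) == 2)
  c(C_10 = cost[1], C_01 = cost[2])
}

costs <- normalize_cost(0.3)   # C_10 = 0.3, C_01 = 0.7
print(costs)

# The cost term log(C_01 / C_10) of the discriminant function:
default_costs <- normalize_cost()
log(default_costs[["C_01"]] / default_costs[["C_10"]])  # 0 for the default equal costs
```

With the default equal costs the log term vanishes, so the costs only shift the discriminant when they are unequal.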

## Value

Returns a list with the following elements:

- `risk_cross`: Risk estimate, where R = e_0 * C_10 + e_1 * C_01.
- `e_0`: Error estimate for class 0.
- `e_1`: Error estimate for class 1.
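The risk formula above is a cost-weighted sum of the two per-class error estimates. A minimal sketch in base R follows; `weighted_risk` is a hypothetical helper, not a function of the package, and the error rates are made-up numbers chosen only to show the arithmetic.

```r
# Hypothetical helper (not part of abcrlda): combines per-class error
# estimates into the cost-weighted risk R = e_0 * C_10 + e_1 * C_01.
weighted_risk <- function(e_0, e_1, cost = c(0.5, 0.5)) {
  stopifnot(length(cost) == 2)
  # cost[1] is C_10 and cost[2] is C_01, matching the `cost` argument order
  e_0 * cost[1] + e_1 * cost[2]
}

# Made-up error rates, for illustration only.
weighted_risk(e_0 = 0.10, e_1 = 0.20)                      # 0.15
weighted_risk(e_0 = 0.10, e_1 = 0.20, cost = c(0.8, 0.2))  # 0.12
```

Raising C_10 relative to C_01 makes the class 0 error estimate dominate the risk, which is why the second call lands closer to `e_0`.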

## Reference

Braga-Neto, Ulisses, Zollanvari, Amin, and Dougherty, Edward (2014). Cross-Validation Under Separate Sampling: Strong Bias and How to Correct It. Bioinformatics, 30. doi:10.1093/bioinformatics/btu527. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4296143/pdf/btu527.pdf

Other functions in the package: `abcrlda()`, `da_risk_estimator()`, `grid_search()`, `predict.abcrlda()`, `risk_calculate()`
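
## Examples

```r
library(abcrlda)
data(iris)
# Keep the two-class subset (virginica vs. versicolor) and its four features
train_data <- iris[which(iris[, ncol(iris)] == "virginica" |
                         iris[, ncol(iris)] == "versicolor"), 1:4]
# factor() drops the unused "setosa" level, leaving two levels
train_label <- factor(iris[which(iris[, ncol(iris)] == "virginica" |
                                 iris[, ncol(iris)] == "versicolor"), 5])
cross_validation(train_data, train_label, gamma = 10)
```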
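
As a further illustration, the sketch below calls `cross_validation()` with non-default `cost`, `nfolds`, and `bias_correction` values and reads the three elements of the returned list. It uses only the arguments and return names documented on this page; the specific values are arbitrary choices, and the call is guarded so the snippet does nothing when abcrlda is not installed.

```r
data(iris)
# Two-class subset of iris: four numeric features and a two-level factor
two_class <- iris[iris$Species %in% c("virginica", "versicolor"), ]
train_data <- two_class[, 1:4]
train_label <- factor(two_class$Species)  # drops the unused "setosa" level

# Guarded so the snippet is a no-op when abcrlda is not installed.
if (requireNamespace("abcrlda", quietly = TRUE)) {
  cv <- abcrlda::cross_validation(
    train_data, train_label,
    gamma = 10,
    cost = c(0.75, 0.25),     # C_10 = 0.75, C_01 = 0.25 (arbitrary values)
    nfolds = 5,
    bias_correction = FALSE   # classical RLDA, per the argument description
  )
  print(cv$risk_cross)                  # cost-weighted risk estimate
  print(c(e_0 = cv$e_0, e_1 = cv$e_1))  # per-class error estimates
}
```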