penaltyParameter: Compute the penalty parameter for the model. In DWDLargeR: Fast Algorithms for Large Scale Generalized Distance Weighted Discrimination

Description

Find the best penalty parameter C for the generalized distance weighted discrimination (DWD) model.

Usage

```r
penaltyParameter(X, y, expon, rmzeroFea = 1, scaleFea = 1)
```

Arguments

`X` A d x n matrix of n training samples with d features.

`y` A vector of length n of training labels. Each element of `y` is either -1 or 1.

`expon` A positive number representing the exponent q of the residual r_i in the generalized DWD model. Common choices are `expon = 1, 2, 4`.

`rmzeroFea` Switch for removing zero features from the data matrix. The default is 1 (zero features are removed).

`scaleFea` Switch for scaling features in the data matrix, so that the features have roughly similar magnitudes. The default is 1 (features are scaled).
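Data is often stored with one sample per row, whereas `X` is expected to hold one sample per column. The following is a minimal base R sketch of preparing inputs in that layout. The objects `samples` and `group` are hypothetical toy data introduced only for illustration; they are not part of the package.

```r
# Hypothetical toy data: 6 samples (rows) with 3 features (columns).
set.seed(1)
samples <- matrix(rnorm(18), nrow = 6, ncol = 3)
group <- factor(c("a", "a", "a", "b", "b", "b"))

# The documented layout is d x n: features in rows, samples in columns.
X <- t(samples)

# Map the two factor levels to the labels -1 and 1.
y <- ifelse(group == levels(group)[1], -1, 1)

dim(X)     # d = 3 features, n = 6 samples
length(y)  # one label per column of X
```

After this step, `X` and `y` have the shapes described in the argument list and could be passed on together with a choice of `expon`.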

Details

The best parameter is empirically found to be inversely proportional to the typical distance between different samples raised to the power of (expon + 1). It also depends on the sample size n and the feature dimension d.
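The inverse-power relationship can be illustrated with a short sketch. This is not the formula used inside `penaltyParameter`: the function `sketch_penalty` is hypothetical, it assumes that "typical distance" means the median Euclidean distance between samples of opposite classes, and it leaves out the dependence on n and d entirely.

```r
# Illustrative heuristic only; NOT the formula implemented in DWDLargeR.
# Assumption: "typical distance" = median distance between opposite classes.
sketch_penalty <- function(X, y, expon) {
  pos <- X[, y == 1, drop = FALSE]   # d x n_pos
  neg <- X[, y == -1, drop = FALSE]  # d x n_neg
  # Pairwise Euclidean distances between all samples (columns of X).
  all_dist <- as.matrix(dist(t(cbind(pos, neg))))
  n_pos <- ncol(pos)
  # Keep only the positive-versus-negative block of the distance matrix.
  cross <- all_dist[seq_len(n_pos), n_pos + seq_len(ncol(neg)), drop = FALSE]
  typical <- median(cross)
  1 / typical^(expon + 1)
}

# Doubling every distance shrinks the value by a factor of 2^(expon + 1).
set.seed(1)
X <- matrix(rnorm(40), nrow = 4)     # d = 4 features, n = 10 samples
y <- rep(c(-1, 1), each = 5)
ratio <- sketch_penalty(X, y, expon = 1) / sketch_penalty(2 * X, y, expon = 1)
ratio  # 2^(1 + 1) = 4, up to floating-point rounding
```

The point of the sketch is the scaling behavior: data measured on a larger scale leads to a smaller penalty parameter, and the effect grows with `expon`.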

Value

A number which represents the best penalty parameter for the generalized DWD model.

Author(s)

Xin-Yee Lam, J.S. Marron, Defeng Sun, and Kim-Chuan Toh

References

Lam, X.Y., Marron, J.S., Sun, D.F., and Toh, K.C. (2018) “Fast algorithms for large scale generalized distance weighted discrimination”, Journal of Computational and Graphical Statistics, forthcoming.
https://arxiv.org/abs/1604.05473

Examples

```r
library(DWDLargeR)

# load the data
data("mushrooms")

# calculate the best penalty parameter
C = penaltyParameter(mushrooms$X, mushrooms$y, expon = 1)
```

