hdrda: High-Dimensional Regularized Discriminant Analysis (HDRDA)
In ramey/sparsediscrim: Sparse and Regularized Discriminant Analysis


Given a set of training data, this function builds the HDRDA classifier from Ramey, Stein, and Young (2017). Designed for small-sample, high-dimensional data, the HDRDA classifier combines dimension reduction with covariance-matrix shrinkage so that it remains computationally efficient.

For a given hdrda object, we predict the class of each observation (row) of the matrix given in newdata.

hdrda(x, ...)

## Default S3 method:
hdrda(x, y, lambda = 1, gamma = 0,
  shrinkage_type = c("ridge", "convex"), prior = NULL, tol = 1e-06, ...)

## S3 method for class 'formula'
hdrda(formula, data, ...)

## S3 method for class 'hdrda'
predict(object, newdata, projected = FALSE, ...)

`x`	matrix containing the training data. The rows are the sample observations, and the columns are the features.
`...`	arguments passed from the `formula` to the `default` method
`y`	vector of class labels for each training observation
`lambda`	the HDRDA pooling parameter. Must be between 0 and 1, inclusive.
`gamma`	a numeric value used for the shrinkage parameter.
`shrinkage_type`	the type of covariance-matrix shrinkage to apply. By default, a ridge-like shrinkage is applied. If `convex` is given, then shrinkage similar to Friedman (1989) is applied. See Ramey et al. (2017) for details.
`prior`	vector with prior probabilities for each class. If `NULL` (default), the sample proportion of observations belonging to each class is used. See details.
`tol`	a threshold for determining nonzero eigenvalues.
`formula`	A formula of the form `groups ~ x1 + x2 + ...`. That is, the response is the grouping factor and the right-hand side specifies the feature vectors.
`data`	data frame from which variables specified in `formula` are preferentially to be taken.
`object`	object of type `hdrda` that contains the trained HDRDA classifier
`newdata`	matrix containing the unlabeled observations to classify. Each row corresponds to a new observation.
`projected`	logical indicating whether `newdata` has already been projected to a q-dimensional subspace. This argument can yield large gains in speed when the linear transformation has already been performed.

The HDRDA classifier uses a covariance-matrix estimator that is a convex combination of the covariance-matrix estimators used in the Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) classifiers. For each of the K classes given in y (k = 1, …, K), we first define this convex combination as

\hat{Σ}_k(λ) = (1 - λ) \hat{Σ}_k + λ \hat{Σ},

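where λ \in [0, 1] is the pooling parameter, \hat{Σ}_k is the class-specific (QDA) estimator, and \hat{Σ} is the pooled (LDA) estimator. Setting λ = 0 therefore yields the QDA estimator, and λ = 1 (the default) yields the LDA estimator. We then calculate the covariance-matrix estimator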

\tilde{Σ}_k = α_k \hat{Σ}_k(λ) + γ I_p,

where I_p is the p \times p identity matrix, γ is the shrinkage parameter given in gamma, and the weight α_k is determined by shrinkage_type. The matrix \tilde{Σ}_k is substituted into the HDRDA classifier. See Ramey et al. (2017) for more details.
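The two-step estimator above can be sketched in base R for a single class. This is only an illustration of the formulas, not the package's implementation: the helper name `hdrda_cov_sketch` is hypothetical, the use of maximum-likelihood covariance estimates (divisor n_k) is an assumption, and the choice α_k = 1 for `"ridge"` and α_k = 1 - γ for `"convex"` is an assumption based on our reading of Ramey et al. (2017).

```r
# Sketch of the HDRDA covariance estimator for class k (hypothetical helper).
# Assumptions not confirmed by this page:
#   * covariance estimates use the maximum-likelihood divisor n_k
#   * alpha_k = 1 for "ridge" and alpha_k = 1 - gamma for "convex"
hdrda_cov_sketch <- function(x, y, k, lambda = 1, gamma = 0,
                             shrinkage_type = c("ridge", "convex")) {
  shrinkage_type <- match.arg(shrinkage_type)
  x <- as.matrix(x)
  y <- factor(y)
  p <- ncol(x)

  # Maximum-likelihood covariance: centered cross-product divided by n.
  ml_cov <- function(m) {
    crossprod(scale(m, center = TRUE, scale = FALSE)) / nrow(m)
  }

  # Class-specific (QDA) estimator for class k.
  sigma_k <- ml_cov(x[y == k, , drop = FALSE])

  # Pooled (LDA) estimator: sample-size-weighted average over all classes.
  sigma_pooled <- Reduce(`+`, lapply(levels(y), function(lev) {
    xk <- x[y == lev, , drop = FALSE]
    nrow(xk) * ml_cov(xk)
  })) / nrow(x)

  # Step 1: convex combination controlled by the pooling parameter lambda.
  sigma_k_lambda <- (1 - lambda) * sigma_k + lambda * sigma_pooled

  # Step 2: shrinkage toward the identity matrix.
  alpha_k <- if (shrinkage_type == "ridge") 1 else 1 - gamma
  alpha_k * sigma_k_lambda + gamma * diag(p)
}

# With lambda = 1 and gamma = 0 every class shares the pooled estimator.
pooled_setosa <- hdrda_cov_sketch(iris[, 1:4], iris$Species, "setosa")
pooled_virginica <- hdrda_cov_sketch(iris[, 1:4], iris$Species, "virginica")
isTRUE(all.equal(pooled_setosa, pooled_virginica))
```

Under these assumptions, the default arguments (`lambda = 1`, `gamma = 0`) reduce to the pooled estimator with no shrinkage, while `lambda = 0` recovers the class-specific estimator.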

The matrix of training observations is given in x. The rows of x contain the sample observations, and the columns contain the features for each training observation. The vector of class labels given in y is coerced to a factor. The length of y should match the number of rows in x.

The vector prior contains the a priori class membership probabilities. If prior is NULL (default), the class membership probabilities are estimated as the sample proportion of observations belonging to each class. Otherwise, prior should be a vector with the same length as the number of classes in y. The prior probabilities should be nonnegative and sum to one. The order of the prior probabilities is assumed to match the levels of factor(y).
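The rules for prior can be illustrated with a small base R helper. The function name `resolve_prior_sketch` is hypothetical, and the checks below mirror only what this page states. The package's own validation may differ.

```r
# Sketch of how class priors could be resolved (hypothetical helper,
# not the package's internal code).
resolve_prior_sketch <- function(y, prior = NULL) {
  y <- factor(y)
  if (is.null(prior)) {
    # Default: sample proportion of observations in each class.
    counts <- as.numeric(table(y))
    prior <- counts / sum(counts)
  } else {
    # User-supplied priors: one per class, nonnegative, summing to one.
    stopifnot(
      length(prior) == nlevels(y),
      all(prior >= 0),
      isTRUE(all.equal(sum(prior), 1))
    )
  }
  # The order is assumed to match levels(factor(y)).
  names(prior) <- levels(y)
  prior
}

# Default priors for a balanced three-class problem.
resolve_prior_sketch(iris$Species)

# Explicit priors, given in the order of levels(iris$Species).
resolve_prior_sketch(iris$Species, prior = c(0.5, 0.3, 0.2))
```

Naming the result by `levels(y)` makes the assumed ordering explicit, which helps catch priors supplied in the wrong order.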

hdrda returns an hdrda object that contains the trained HDRDA classifier.

predict returns a list with the predicted class and the discriminant scores for each of the K classes.
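A minimal usage sketch follows, based only on the signatures shown on this page. It assumes the sparsediscrim package is installed and is skipped otherwise. The iris data and the tuning values `lambda = 0.5` and `gamma = 0.1` are arbitrary choices for illustration, and because this page does not name the elements of the returned list, the result is inspected with `str()` rather than accessed by name.

```r
# Illustrative data: rows are observations, columns are features.
set.seed(42)
x <- as.matrix(iris[, 1:4])
y <- iris$Species

# Hold out 30 observations to classify after training.
test_idx <- sample(nrow(x), 30)
x_train <- x[-test_idx, , drop = FALSE]
y_train <- y[-test_idx]
x_test <- x[test_idx, , drop = FALSE]

# Assumption: sparsediscrim is installed; otherwise this step is skipped.
if (requireNamespace("sparsediscrim", quietly = TRUE)) {
  fit <- sparsediscrim::hdrda(x_train, y_train, lambda = 0.5, gamma = 0.1,
                              shrinkage_type = "ridge")
  pred <- predict(fit, x_test)
  # Inspect the structure instead of assuming element names.
  str(pred)
}
```

In practice, lambda and gamma would normally be chosen by cross-validation rather than fixed in advance.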

Ramey, J. A., Stein, C. K., and Young, D. M. (2017), "High-Dimensional Regularized Discriminant Analysis." https://arxiv.org/abs/1602.01182.

Friedman, J. H. (1989), "Regularized Discriminant Analysis," Journal of the American Statistical Association, 84, 405, 165-175. http://www.jstor.org/pss/2289860 (Requires full-text access).

ramey/sparsediscrim documentation built on May 26, 2019, 10:05 p.m.