Given a set of training data, this function builds the HDRDA classifier from Ramey, Stein, and Young (2017). Specially designed for small-sample, high-dimensional data, the HDRDA classifier incorporates dimension reduction and covariance-matrix shrinkage to enable a computationally efficient classifier.
For a given rda_high_dim object, we predict the class of each observation (row) of the matrix given in newdata.
rda_high_dim(x, ...)

## Default S3 method:
rda_high_dim(
  x,
  y,
  lambda = 1,
  gamma = 0,
  shrinkage_type = c("ridge", "convex"),
  prior = NULL,
  tol = 1e-06,
  ...
)

## S3 method for class 'formula'
rda_high_dim(formula, data, ...)

## S3 method for class 'rda_high_dim'
predict(
  object,
  newdata,
  projected = FALSE,
  type = c("class", "prob", "score"),
  ...
)
x | Matrix or data frame containing the training data. The rows are the sample observations, and the columns are the features. Only complete data are retained.
... | additional arguments (not currently used).
y | vector of class labels for each training observation
lambda | the HDRDA pooling parameter. Must be between 0 and 1, inclusive.
gamma | a numeric value used for the shrinkage parameter.
shrinkage_type | the type of covariance-matrix shrinkage to apply. By default, a ridge-like shrinkage is applied. If
prior | vector with prior probabilities for each class. If
tol | a threshold for determining nonzero eigenvalues.
formula | A formula of the form
data | data frame from which variables specified in
object | Object of type
newdata | Matrix or data frame of observations to predict. Each row corresponds to a new observation.
projected | logical indicating whether
type | Prediction type: either
The HDRDA classifier utilizes a covariance-matrix estimator that is a convex combination of the covariance-matrix estimators used in the Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA) classifiers. For each of the K classes given in y (k = 1, …, K), we first define this convex combination as
\hat{\Sigma}_k(\lambda) = (1 - \lambda) \hat{\Sigma}_k + \lambda \hat{\Sigma},
where \lambda \in [0, 1] is the pooling parameter. We then calculate the covariance-matrix estimator
\tilde{\Sigma}_k = \alpha_k \hat{\Sigma}_k(\lambda) + \gamma I_p,
where I_p is the p \times p identity matrix. The matrix \tilde{\Sigma}_k is substituted into the HDRDA classifier. See Ramey et al. (2017) for more details.
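The two-step estimator above can be sketched in base R. This is only an illustration of the formulas, not the package's internal code. Several choices here are assumptions that the text does not confirm: the scaling constant is fixed at alpha_k = 1, the class covariances are maximum-likelihood estimates, the pooled estimate weights each class by its sample size, and the data, the helper cov_mle, and all variable names are made up for the example.

```r
# Illustration of the two-step covariance estimator on simulated data.
set.seed(42)
p <- 5                                     # number of features
x <- matrix(rnorm(40 * p), ncol = p)       # 40 simulated observations
y <- factor(rep(c("a", "b"), each = 20))   # K = 2 classes

lambda <- 0.5   # pooling parameter, must lie in [0, 1]
gamma  <- 0.1   # shrinkage parameter
alpha  <- 1     # ASSUMPTION: alpha_k = 1 for every class

# ASSUMPTION: maximum-likelihood covariance (divides by n, not n - 1).
cov_mle <- function(m) {
  centered <- scale(m, center = TRUE, scale = FALSE)
  crossprod(centered) / nrow(m)
}

# Per-class estimates \hat{Sigma}_k, in the order of levels(y).
sigma_k <- lapply(split(as.data.frame(x), y), function(d) cov_mle(as.matrix(d)))

# ASSUMPTION: pooled estimate \hat{Sigma} weights classes by sample size.
n_k <- as.numeric(table(y))
sigma_pooled <- Reduce(`+`, Map(function(s, n) n * s, sigma_k, n_k)) / length(y)

# \tilde{Sigma}_k = alpha_k * \hat{Sigma}_k(lambda) + gamma * I_p
sigma_tilde <- lapply(sigma_k, function(s) {
  sigma_lambda <- (1 - lambda) * s + lambda * sigma_pooled
  alpha * sigma_lambda + gamma * diag(p)
})
```

With lambda = 1 every class shares the pooled estimate (the LDA-like case), while lambda = 0 keeps each class's own estimate (the QDA-like case). Adding gamma * I_p with gamma > 0 keeps every eigenvalue of the result at or above gamma, so the matrix stays invertible even when p exceeds the class sample size.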
The matrix of training observations is given in x. The rows of x contain the sample observations, and the columns contain the features for each training observation. The vector of class labels given in y is coerced to a factor. The length of y should match the number of rows in x.
The vector prior contains the a priori class membership probability for each class. If prior is NULL (default), the class membership probabilities are estimated as the sample proportion of observations belonging to each class. Otherwise, prior should be a vector with the same length as the number of classes in y. The prior probabilities should be nonnegative and sum to one. The order of the prior probabilities is assumed to match the levels of factor(y).
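The rules for prior can be made concrete with a small base-R sketch. The helper resolve_prior is hypothetical and is not part of the package; it only mirrors the behaviour described above, under the assumption that an invalid prior should raise an error.

```r
y <- factor(c("a", "b", "b", "c", "c", "c"))

# Hypothetical helper (not a package function): resolve the prior vector.
resolve_prior <- function(y, prior = NULL) {
  y <- factor(y)
  if (is.null(prior)) {
    # Default: sample proportion of observations in each class.
    prior <- as.numeric(table(y)) / length(y)
  } else {
    # User-supplied priors: one per class, nonnegative, summing to one.
    stopifnot(
      length(prior) == nlevels(y),
      all(prior >= 0),
      isTRUE(all.equal(sum(prior), 1))
    )
  }
  # The order is taken to match levels(factor(y)).
  setNames(prior, levels(y))
}

resolve_prior(y)                             # estimated from the data
resolve_prior(y, prior = c(0.2, 0.3, 0.5))   # supplied in level order
```

Naming the result by levels(y) makes the ordering assumption visible, which helps catch a prior supplied in the wrong order.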
rda_high_dim object that contains the trained HDRDA classifier
list with predicted class and discriminant scores for each of the K classes
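A minimal end-to-end sketch of training and prediction follows, using only the arguments shown in the usage above. It assumes that rda_high_dim() is provided by the sparsediscrim package, which this page does not state, so the call is guarded and skipped when that package is not installed. The simulated data, the feature names, and the parameter values are arbitrary choices for illustration.

```r
set.seed(1)
n <- 30    # training observations (small sample)
p <- 100   # features (high dimension, p > n)

y <- factor(rep(c("a", "b", "c"), each = n / 3))
x <- matrix(rnorm(n * p), nrow = n)
# Shift the first five features so the classes are separable.
x[y == "b", 1:5] <- x[y == "b", 1:5] + 2
x[y == "c", 1:5] <- x[y == "c", 1:5] - 2
colnames(x) <- paste0("f", seq_len(p))

newdata <- matrix(rnorm(5 * p), nrow = 5)
colnames(newdata) <- colnames(x)

# ASSUMPTION: the classifier lives in the 'sparsediscrim' package.
if (requireNamespace("sparsediscrim", quietly = TRUE)) {
  fit <- sparsediscrim::rda_high_dim(
    x = x, y = y,
    lambda = 0.5, gamma = 0.1,
    shrinkage_type = "ridge"
  )
  pred <- predict(fit, newdata = newdata, type = "class")
  str(pred)   # inspect the returned structure rather than assume it
}
```

In practice lambda and gamma would be chosen by cross-validation rather than fixed by hand.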
Ramey, J. A., Stein, C. K., and Young, D. M. (2017), "High-Dimensional Regularized Discriminant Analysis." https://arxiv.org/abs/1602.01182.
Friedman, J. H. (1989), "Regularized Discriminant Analysis," Journal of the American Statistical Association, 84, 405, 165-175. http://www.jstor.org/stable/2289860 (Requires full-text access).