ordASDA: Ordinal Accelerated Sparse Discriminant Analysis

ordASDA R Documentation

Ordinal Accelerated Sparse Discriminant Analysis

Description

Applies an accelerated proximal gradient algorithm to the optimal scoring formulation of sparse discriminant analysis proposed by Clemmensen et al. (2011). To handle the ordinal labels, the problem is further cast as a binary classification problem, as described in "Learning to Classify Ordinal Data: The Data Replication Method" by Cardoso and da Costa. This function serves as a wrapper for the ASDA function and performs the appropriate data augmentation. Since the problem is cast as a binary classification problem, the result contains only a single discriminant vector. The first *p* entries are the coefficients for the predictors, while the following K-1 entries are the biases of the hyperplane that separates the classes. The resulting object is of class ordASDA and has an accompanying predict function. The paper by Cardoso and da Costa can be found here: (http://www.jmlr.org/papers/volume8/cardoso07a/cardoso07a.pdf).
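
The data augmentation described above can be sketched in base R. This is only an illustration of the data replication idea under stated assumptions, not the package's internal code: `replicate_ordinal` is a hypothetical helper, the labels are assumed to be 1,...,K with every class present, and each of the K-1 thresholds is assumed to get its own extra column holding the value `h`.

```r
# Hypothetical helper (not part of accSDA): sketch of the data replication step.
# Assumes y takes values 1..K and that every class occurs at least once.
replicate_ordinal <- function(X, y, s = 1, h = 1) {
  X <- as.matrix(X)
  K <- max(y)
  blocks <- vector("list", K - 1)
  labels <- vector("list", K - 1)
  for (q in seq_len(K - 1)) {
    # Threshold q separates classes <= q from classes > q.
    lower <- which(y <= q & y > q - s)  # at most s classes at or below the threshold
    upper <- which(y > q & y <= q + s)  # at most s classes above the threshold
    idx <- c(lower, upper)
    # Extra columns: h in column q marks which threshold this replica belongs to.
    extra <- matrix(0, nrow = length(idx), ncol = K - 1)
    extra[, q] <- h
    blocks[[q]] <- cbind(X[idx, , drop = FALSE], extra)
    labels[[q]] <- c(rep(1, length(lower)), rep(2, length(upper)))
  }
  list(X = do.call(rbind, blocks), Y = unlist(labels))
}

set.seed(1)
X <- matrix(rnorm(12), nrow = 6, ncol = 2)  # 6 samples, p = 2
y <- c(1, 1, 2, 2, 3, 3)                    # K = 3 ordinal classes
aug <- replicate_ordinal(X, y, s = 1, h = 1)
dim(aug$X)     # p + K - 1 = 4 columns
table(aug$Y)   # binary labels of the augmented problem
```

In this sketch a binary classifier trained on `aug$X` and `aug$Y` yields a single weight vector whose last K-1 entries play the role of per-threshold biases, which matches the layout of the discriminant vector described above. A larger `s` pulls more classes into each replica and therefore copies more rows.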

Usage

ordASDA(Xt, ...)

## Default S3 method:
ordASDA(
  Xt,
  Yt,
  s = 1,
  Om,
  gam = 0.001,
  lam = 1e-06,
  method = "SDAAP",
  control,
  ...
)

Arguments

Xt

n by p data matrix (can also be a data.frame that can be coerced to a matrix).

...

Additional arguments passed on to ASDA and to the lda function in package MASS.

Yt

vector of length n, equal to the number of samples. The classes should be 1,2,...,K where K is the number of classes. Yt needs to be a numeric vector.

s

We need to find a hyperplane that separates all classes, using a different bias for each class boundary. For each bias we define a binary classification problem, where a maximum of s ordinal classes are contained in each of the two binary classes. A higher value of s means that more data will be copied in the data augmentation step. By default s is 1.

Om

p by p parameter matrix Omega in the generalized elastic net penalty, where p is the number of variables.

gam

Regularization parameter for elastic net penalty, must be greater than zero.

lam

Regularization parameter for l1 penalty, must be greater than zero.

method

String selecting the method, currently either "SDAD" or "SDAAP"; see ?ASDA for more info.

control

List of control arguments further passed to ASDA. See ASDA.

Value

ordASDA returns an object of class "ordASDA", a list with the same components as an ASDA object plus:

h

Scalar value for biases.

K

Number of classes.

Note

Remember to normalize the data.
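
One way to follow this note is sketched below with base R's `scale()`. It assumes that normalizing means centering each predictor and scaling it to unit variance, and that the test data should be transformed with the training statistics; the matrices `Xtrain` and `Xtest` are made up for illustration.

```r
set.seed(1)
# Made-up training and test predictors, for illustration only.
Xtrain <- matrix(rnorm(20, mean = 5, sd = 3), nrow = 10, ncol = 2)
Xtest  <- matrix(rnorm(10, mean = 5, sd = 3), nrow = 5,  ncol = 2)

# Center and scale the training data column by column.
Xtrain_n <- scale(Xtrain)
mu   <- attr(Xtrain_n, "scaled:center")  # column means of the training data
sdev <- attr(Xtrain_n, "scaled:scale")   # column standard deviations

# Apply the training statistics to the test data.
Xtest_n <- scale(Xtest, center = mu, scale = sdev)

colMeans(Xtrain_n)       # numerically zero
apply(Xtrain_n, 2, sd)   # numerically one
```

Reusing `mu` and `sdev` for the test set keeps both sets on the same scale, so the fitted discriminant vector can be applied to new data unchanged.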

See Also

ASDA.

Examples

    set.seed(123)

    # You can play around with these values to generate some 2D data to test on
    numClasses <- 5
    sigma <- matrix(c(1,-0.2,-0.2,1),2,2)
    mu <- c(0,0)
    numObsPerClass <- 5

    # Generate the data, can access with train$X and train$Y
    train <- accSDA::genDat(numClasses,numObsPerClass,mu,sigma)
    test <- accSDA::genDat(numClasses,numObsPerClass*2,mu,sigma)

    # Visualize it, only using the first variable gives very good separation
    plot(train$X[,1],train$X[,2],col = factor(train$Y),asp=1,main="Training Data")

    # Train the ordinal based model
    res <- accSDA::ordASDA(train$X,train$Y,s=2,h=1, gam=1e-6, lam=1e-3)
    vals <- predict(object = res,newdata = test$X) # Takes a while to run ~ 10 seconds
    sum(vals==test$Y)/length(vals) # Get accuracy on test set
    #plot(test$X[,1],test$X[,2],col = factor(test$Y),asp=1,
    #      main="Test Data with correct labels")
    #plot(test$X[,1],test$X[,2],col = factor(vals),asp=1,
    #    main="Test Data with predictions from ordinal classifier")


gumeo/accSDA documentation built on Nov. 16, 2023, 11:47 p.m.