ASDA (R Documentation)

Applies the accelerated proximal gradient algorithm, the proximal gradient algorithm, or the alternating direction method of multipliers (ADMM) to the optimal scoring formulation of sparse discriminant analysis proposed by Clemmensen et al. (2011):

`argmin{ |Y_t\theta - X_t\beta|_2^2 + \gamma\,\beta^T\Omega\beta + \lambda|\beta|_1 }`
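For intuition, the objective can be evaluated directly. A minimal NumPy sketch (illustrative only, not the package's internal code; variable names mirror the formula, with `gam` and `lam` playing the roles of the two regularization parameters):

```python
import numpy as np

def sos_objective(Yt, theta, Xt, beta, Om, gam, lam):
    """Sparse optimal scoring objective:
    |Yt@theta - Xt@beta|_2^2 + gam * beta' Om beta + lam * |beta|_1."""
    residual = Yt @ theta - Xt @ beta
    fit = residual @ residual            # squared l2 norm of the residual
    elastic = gam * beta @ (Om @ beta)   # generalized elastic net term
    sparsity = lam * np.abs(beta).sum()  # l1 penalty
    return fit + elastic + sparsity

# Toy data: n = 6 observations, p = 3 variables, K = 2 classes
rng = np.random.default_rng(0)
Xt = rng.standard_normal((6, 3))
Yt = np.repeat(np.eye(2), 3, axis=0)     # class indicator matrix
theta = np.array([1.0, -1.0])
beta = np.array([0.5, 0.0, -0.2])
obj = sos_objective(Yt, theta, Xt, beta, np.eye(3), gam=1e-3, lam=1e-6)
```

The algorithms below alternate between updating the scores theta and the discriminant vector beta while decreasing this objective.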

```
ASDA(Xt, ...)

## Default S3 method:
ASDA(
  Xt,
  Yt,
  Om = diag(p),
  gam = 0.001,
  lam = 1e-06,
  q = K - 1,
  method = "SDAAP",
  control = list(),
  ...
)
```

`Xt`

n by p data matrix (can also be a data.frame that can be coerced to a matrix).

`...`

Additional arguments.

`Yt`

n by K matrix of indicator variables (Y_ij = 1 if observation i is in class j). This will later be changed to handle factor variables as well. Each observation belongs to a single class, so in a given row exactly one element is 1 and the rest are 0.
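An indicator matrix of this form is straightforward to build from a label vector. A hypothetical Python helper, for illustration only (the package itself takes the matrix, or later a factor, directly):

```python
import numpy as np

def indicator_matrix(labels):
    """Build the n-by-K indicator matrix Yt: Yt[i, j] = 1 iff
    observation i belongs to class j (classes in sorted order)."""
    classes = sorted(set(labels))
    col = {c: j for j, c in enumerate(classes)}
    Yt = np.zeros((len(labels), len(classes)))
    for i, lab in enumerate(labels):
        Yt[i, col[lab]] = 1.0
    return Yt

Yt = indicator_matrix(["a", "b", "a", "c"])
# Each row has exactly one 1, so every row sum is 1.
```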

`Om`

p by p parameter matrix Omega in the generalized elastic net penalty.

`gam`

Regularization parameter for the elastic net penalty.

`lam`

Regularization parameter for the l1 penalty; must be greater than zero. If cross-validation is used (`CV = TRUE` in `control`), then `lam` must be a vector containing the regularization values to be tested.

`q`

Desired number of discriminant vectors.

`method`

Selects which optimization method to use. It is specified as a character vector taking one of three values: `SDAP` (proximal gradient algorithm), `SDAAP` (accelerated proximal gradient algorithm) or `SDAD` (alternating direction method of multipliers algorithm). Note that further parameters are passed to the chosen method via the argument `control`.

`control`

List of control arguments. See Details.
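To see what the proximal-gradient methods (`SDAP`, `SDAAP`) iterate on, here is an illustrative Python sketch of one beta update with a fixed step size; the soft-thresholding operator is the proximal map of the l1 penalty. This is a sketch of the general technique, not the package's implementation:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*|.|_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_grad_step(beta, Xt, y, Om, gam, lam, step):
    """One proximal gradient step on
    |y - Xt@beta|_2^2 + gam*beta'Om beta + lam*|beta|_1,
    with y = Yt@theta held fixed."""
    grad = 2.0 * (Xt.T @ (Xt @ beta - y) + gam * (Om @ beta))
    return soft_threshold(beta - step * grad, step * lam)

rng = np.random.default_rng(1)
Xt = rng.standard_normal((20, 5))
y = rng.standard_normal(20)
beta = np.zeros(5)
for _ in range(300):  # iterate toward (approximate) convergence
    beta = prox_grad_step(beta, Xt, y, np.eye(5),
                          gam=1e-3, lam=0.5, step=0.005)
```

The accelerated variant adds a momentum extrapolation between steps, and backtracking (the `bt`, `L` and `eta` entries below) replaces the fixed step size with an adaptive one.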

The control list contains the following entries to further tune the algorithms.

`PGsteps`

Maximum number of inner proximal gradient/ADMM iterations for finding beta. Default value is 1000.

`PGtol`

Stopping tolerance for the inner method. If the method is `SDAD`, then this must be a vector of two values: the absolute (first element) and relative (second element) tolerance. Default value is 1e-5 for both absolute and relative tolerances.

`maxits`

Number of outer iterations to run. Default value is 250.

`tol`

Stopping tolerance. Default value is 1e-3.

`mu`

Penalty parameter for the augmented Lagrangian term; must be greater than zero and only needs to be specified when using method `SDAD`. Default value is 1.

`CV`

Logical value which is `TRUE` if cross-validation is to be performed. If cross-validation is performed, then `lam` should be specified as a vector containing the regularization values to be tested. Default value is `FALSE`.

`folds`

Integer determining the number of folds in cross-validation. Not needed if `CV` is not specified. Default value is 5.

`feat`

Maximum fraction of nonzero features desired in the validation scheme. Not needed if `CV` is not specified. Default value is 0.15.

`quiet`

Set to `FALSE` if status updates should be printed to the R console. Default value is `TRUE`. Note that setting it to `FALSE` triggers a lot of printing to the console.

`ordinal`

Set to `TRUE` if the labels are ordinal. Only available for methods `SDAAP` and `SDAD`.

`initTheta`

Option to set the initial theta vector; by default it is a vector of all ones for the first theta.

`bt`

Logical indicating whether backtracking should be used; only applies to the proximal gradient based methods. By default, backtracking is not used.

`L`

Initial estimate of the Lipschitz constant used for backtracking. Default value is 0.25.

`eta`

Scaling factor for the Lipschitz constant in backtracking. Default value is 1.25.

`rankRed`

Logical indicating whether Om is factorized such that R^T R = Om; currently only applicable for the accelerated proximal gradient method.
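The two-element `PGtol` used by `SDAD` corresponds to the standard ADMM stopping rule, in which the primal and dual residual norms are compared against mixed absolute/relative thresholds. A hedged Python sketch of that check (following the common ADMM convention for a split beta = z; variable names are illustrative, not the package's):

```python
import numpy as np

def admm_converged(beta, z, z_old, u, rho, abstol, reltol):
    """Standard ADMM stopping test for the constraint beta = z.
    r is the primal residual, s the dual residual; u is the scaled
    dual variable and rho the augmented Lagrangian parameter (mu)."""
    p = beta.size
    r = beta - z                       # primal residual
    s = rho * (z - z_old)              # dual residual
    eps_pri = np.sqrt(p) * abstol + reltol * max(
        np.linalg.norm(beta), np.linalg.norm(z))
    eps_dual = np.sqrt(p) * abstol + reltol * np.linalg.norm(rho * u)
    return np.linalg.norm(r) <= eps_pri and np.linalg.norm(s) <= eps_dual
```

With `PGtol = c(1e-5, 1e-5)`, both thresholds above are built from the same absolute and relative tolerance.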

`ASDA` returns an object of class "`ASDA`", a list with the following named components:

`call`

The matched call.

`B`

p by q matrix of discriminant vectors, i.e. sparse loadings.

`Q`

K by q matrix of scoring vectors, i.e. optimal scores.

`varNames`

Names of the predictors used, i.e. column names of Xt.

`origP`

Number of variables in Xt.

`fit`

Output from function `lda` on the projected data. This is `NULL` if the trivial solution is found, i.e. B is all zeroes. Use lower values of `lam` if that is the case.

`classes`

The classes in Yt.

`lambda`

The lambda/`lam` used; the best value found by cross-validation if `CV` is `TRUE`.
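To see how the returned `B` is used downstream: new data are projected with X @ B and then classified in the low-dimensional discriminant space. The package delegates that step to `lda`; a simple nearest-centroid stand-in in Python, for illustration only:

```python
import numpy as np

def nearest_centroid_predict(Xtrain_proj, ytrain, Xtest_proj):
    """Classify projected test points by the nearest class centroid
    in the discriminant space (a simple stand-in for lda)."""
    classes = sorted(set(ytrain))
    centroids = np.array([Xtrain_proj[np.array(ytrain) == c].mean(axis=0)
                          for c in classes])
    # Squared distances: n_test x K
    d = ((Xtest_proj[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return [classes[j] for j in d.argmin(axis=1)]

# Tiny example in a 2-D discriminant space (q = 2)
Xp = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y = [0, 0, 1, 1]
preds = nearest_centroid_predict(Xp, y, np.array([[0.05, 0.0], [5.05, 5.0]]))
# preds -> [0, 1]
```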

The input matrix Xt should be normalized, i.e. each column corresponding to a variable should have its mean subtracted and be scaled to unit length. The functions `normalize` and `normalizetest` are supplied for this purpose in the package.
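The normalization described above can be sketched as follows; the key point, mirrored by `normalizetest`, is that test data must be transformed with the training set's statistics. A Python sketch with illustrative names:

```python
import numpy as np

def normalize_train(X):
    """Center each column and scale it to unit Euclidean length.
    Returns the normalized matrix plus the statistics for test data."""
    mu = X.mean(axis=0)
    Xc = X - mu
    scale = np.linalg.norm(Xc, axis=0)
    scale[scale == 0] = 1.0            # guard against constant columns
    return Xc / scale, mu, scale

def normalize_test(X, mu, scale):
    """Apply the training-set centering and scaling to new data."""
    return (X - mu) / scale

rng = np.random.default_rng(2)
Xtr, mu, sc = normalize_train(rng.standard_normal((10, 3)) * 5 + 2)
```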

See also: `SDAAP`, `SDAP` and `SDAD`.

```
set.seed(123)
# Prepare training and test set
train <- c(1:40,51:90,101:140)
Xtrain <- iris[train,1:4]
nX <- normalize(Xtrain)
Xtrain <- nX$Xc
Ytrain <- iris[train,5]
Xtest <- iris[-train,1:4]
Xtest <- normalizetest(Xtest,nX)
Ytest <- iris[-train,5]
# Define parameters for Alternating Direction Method of Multipliers (SDAD)
Om <- diag(4)+0.1*matrix(1,4,4) #elNet coef mat
gam <- 0.0001
lam <- 0.0001
method <- "SDAD"
q <- 2
control <- list(PGsteps = 100,
                PGtol = c(1e-5, 1e-5),
                mu = 1,
                maxits = 100,
                tol = 1e-3,
                quiet = FALSE)
# Run the algorithm
res <- ASDA(Xt = Xtrain,
            Yt = Ytrain,
            Om = Om,
            gam = gam,
            lam = lam,
            q = q,
            method = method,
            control = control)
# Can also just use the defaults, which is Accelerated Proximal Gradient (SDAAP):
resDef <- ASDA(Xtrain,Ytrain)
# Some example on simulated data
# Generate Gaussian data on three classes with plenty of redundant variables
# This example shows the basic steps on how to apply this to data, i.e.:
# 1) Setup training data
# 2) Normalize
# 3) Train
# 4) Predict
# 5) Plot projected data
# 6) Accuracy on test set
P <- 300 # Number of variables
N <- 50 # Number of samples per class
# Mean for classes, they are zero everywhere except the first 3 coordinates
m1 <- rep(0,P)
m1[1] <- 3
m2 <- rep(0,P)
m2[2] <- 3
m3 <- rep(0,P)
m3[3] <- 3
# Sample dummy data
Xtrain <- rbind(MASS::mvrnorm(n = N, mu = m1, Sigma = diag(P)),
                MASS::mvrnorm(n = N, mu = m2, Sigma = diag(P)),
                MASS::mvrnorm(n = N, mu = m3, Sigma = diag(P)))
Xtest <- rbind(MASS::mvrnorm(n = N, mu = m1, Sigma = diag(P)),
               MASS::mvrnorm(n = N, mu = m2, Sigma = diag(P)),
               MASS::mvrnorm(n = N, mu = m3, Sigma = diag(P)))
# Generate the labels
Ytrain <- factor(rep(1:3,each=N))
Ytest <- Ytrain
# Normalize the data
Xt <- accSDA::normalize(Xtrain)
Xtrain <- Xt$Xc # Use the centered and scaled data
Xtest <- accSDA::normalizetest(Xtest,Xt)
# Train the classifier and increase the sparsity parameter from the default
# so we penalize more for non-sparse solutions.
res <- accSDA::ASDA(Xtrain,Ytrain,lam=0.01)
# Plot the projected training data; it is projected to
# 2 dimensions because we have 3 classes. The number of discriminant
# vectors is at most the number of classes minus 1.
XtrainProjected <- Xtrain %*% res$B
plot(XtrainProjected[,1], XtrainProjected[,2], col = Ytrain)
# Predict on the test data
preds <- predict(res, newdata = Xtest)
# Plot projected test data with predicted and correct labels
XtestProjected <- Xtest %*% res$B
plot(XtestProjected[,1], XtestProjected[,2], col = Ytest,
     main = "Projected test data with original labels")
plot(XtestProjected[,1], XtestProjected[,2], col = preds$class,
     main = "Projected test data with predicted labels")
# Calculate accuracy
sum(preds$class == Ytest)/(3*N) # We have N samples per class, so total 3*N
```
