s_SDA: Sparse Linear Discriminant Analysis

View source: R/s_SDA.R

s_SDA    R Documentation

Sparse Linear Discriminant Analysis

Description

Train an SDA Classifier using sparseLDA::sda

Usage

s_SDA(
  x,
  y = NULL,
  x.test = NULL,
  y.test = NULL,
  lambda = 1e-06,
  stop = NULL,
  maxIte = 100,
  Q = NULL,
  tol = 1e-06,
  .preprocess = setup.preprocess(scale = TRUE, center = TRUE),
  upsample = TRUE,
  downsample = FALSE,
  resample.seed = NULL,
  x.name = NULL,
  y.name = NULL,
  grid.resample.params = setup.resample("kfold", 5),
  gridsearch.type = c("exhaustive", "randomized"),
  gridsearch.randomized.p = 0.1,
  metric = NULL,
  maximize = NULL,
  print.plot = FALSE,
  plot.fitted = NULL,
  plot.predicted = NULL,
  plot.theme = rtTheme,
  question = NULL,
  verbose = TRUE,
  grid.verbose = verbose,
  trace = 0,
  outdir = NULL,
  n.cores = rtCores,
  save.mod = ifelse(!is.null(outdir), TRUE, FALSE)
)

Arguments

x

Numeric vector or matrix / data frame of features, i.e. independent variables.

y

Numeric vector of outcome, i.e. dependent variable

x.test

Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in x.

y.test

Numeric vector of testing set outcome

lambda

L2-norm weight for elastic net regression

stop

If stop is negative, its absolute value corresponds to the desired number of variables. If stop is positive, it corresponds to an upper bound on the L1-norm of the b coefficients. There is a one-to-one correspondence between stop and t. The default is -p, i.e. minus the number of variables.
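
For illustration, a minimal sketch of the sign convention, using the datc2_train frame constructed in the Examples section below (the specific values are arbitrary):

# Negative stop: request exactly 2 predictors
mod_nvar <- s_SDA(datc2_train, stop = -2)
# Positive stop: bound the L1-norm of the coefficients instead
mod_l1 <- s_SDA(datc2_train, stop = 2)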

maxIte

Integer: Maximum number of iterations

Q

Integer: Number of components

tol

Numeric: Tolerance for change in RSS, which is the stopping criterion

.preprocess

List of preprocessing parameters. Scaling and centering are enabled by default, as they are crucial for the algorithm to learn.

upsample

Logical: If TRUE, upsample cases to balance outcome classes (for Classification only). Note: upsampling will randomly sample with replacement if the majority class is more than double the length of the class being upsampled, thereby introducing randomness.
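
Because upsampling can introduce randomness, the resampling seed can be fixed for reproducibility. A minimal sketch using the datc2_train frame from the Examples section below (the seed value is arbitrary):

# Upsample the minority class; fix the seed so the resampling is reproducible
mod_c2 <- s_SDA(datc2_train, upsample = TRUE, resample.seed = 2024)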

downsample

Logical: If TRUE, downsample majority class to match size of minority class

resample.seed

Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed)

x.name

Character: Name for feature set

y.name

Character: Name for outcome

grid.resample.params

List: Output of setup.resample defining grid search parameters.

gridsearch.type

Character: Type of grid search to perform: "exhaustive" or "randomized".

gridsearch.randomized.p

Float (0, 1): If gridsearch.type = "randomized", randomly test this proportion of combinations.
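
As a sketch of configuring the grid search, assuming that, as with other rtemis learners, supplying a vector of values for a tunable hyperparameter such as lambda triggers tuning (the specific values are arbitrary; datc2_train is constructed in the Examples section below):

# Tune lambda with 10-fold resampling, testing a random 20% of combinations
mod_c2 <- s_SDA(
  datc2_train,
  lambda = c(1e-06, 1e-03, 1e-01),
  grid.resample.params = setup.resample("kfold", 10),
  gridsearch.type = "randomized",
  gridsearch.randomized.p = 0.2
)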

metric

Character: Metric to minimize, or maximize if maximize = TRUE during grid search. Default = NULL, which results in "Balanced Accuracy" for Classification, "MSE" for Regression, and "Coherence" for Survival Analysis.

maximize

Logical: If TRUE, metric will be maximized if grid search is run.
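
For example, to tune against a different classification metric during grid search (a sketch; "Accuracy" is assumed to be among the supported classification metrics, and the lambda vector is included only so that tuning is run):

# Maximize Accuracy instead of the default Balanced Accuracy during tuning
mod_c2 <- s_SDA(datc2_train, lambda = c(1e-06, 1e-03),
                metric = "Accuracy", maximize = TRUE)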

print.plot

Logical: if TRUE, produce plot using mplot3. Takes precedence over plot.fitted and plot.predicted.

plot.fitted

Logical: if TRUE, plot True (y) vs Fitted

plot.predicted

Logical: if TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test

plot.theme

Character: "zero", "dark", "box", "darkbox"

question

Character: the question you are attempting to answer with this model, in plain language.

verbose

Logical: If TRUE, print summary to screen.

grid.verbose

Logical: Passed to gridSearchLearn

trace

Integer: passed to sparseLDA::sda

outdir

Path to output directory. If defined, the Predicted vs. True plot is saved, if available, along with the full model output if save.mod is TRUE.

n.cores

Integer: Number of cores to use.

save.mod

Logical: If TRUE, save all output to an RDS file in outdir. save.mod is TRUE by default if an outdir is defined. If set to TRUE and no outdir is defined, outdir defaults to paste0("./s.", mod.name).
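
A minimal sketch of saving output (the directory path is an arbitrary placeholder; datc2_train and datc2_test come from the Examples section below):

# Defining outdir turns on save.mod, so the full model is written to an RDS file
mod_c2 <- s_SDA(datc2_train, datc2_test, outdir = "./sda_output")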

Value

rtMod object
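
Predictions for new data can be generated from the returned object; a sketch assuming the standard predict method for rtMod objects (datc2_train and datc2_test are constructed in the Examples section below):

# Train on the training set, then predict on held-out data
mod_c2 <- s_SDA(datc2_train)
predicted <- predict(mod_c2, datc2_test)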

Author(s)

E.D. Gennatas

See Also

train_cv for external cross-validation

Other Supervised Learning: s_AdaBoost(), s_AddTree(), s_BART(), s_BRUTO(), s_BayesGLM(), s_C50(), s_CART(), s_CTree(), s_EVTree(), s_GAM(), s_GAM.default(), s_GAM.formula(), s_GBM(), s_GLM(), s_GLMNET(), s_GLMTree(), s_GLS(), s_H2ODL(), s_H2OGBM(), s_H2ORF(), s_HAL(), s_KNN(), s_LDA(), s_LM(), s_LMTree(), s_LightCART(), s_LightGBM(), s_MARS(), s_MLRF(), s_NBayes(), s_NLA(), s_NLS(), s_NW(), s_PPR(), s_PolyMARS(), s_QDA(), s_QRNN(), s_RF(), s_RFSRC(), s_Ranger(), s_SGD(), s_SPLS(), s_SVM(), s_TFN(), s_XGBoost(), s_XRF()

Examples

## Not run: 
# Two-class subset of iris; refactor Species to drop the unused level
datc2 <- iris[51:150, ]
datc2$Species <- factor(datc2$Species)
# Split into training and test sets using the first resample
resc2 <- resample(datc2)
datc2_train <- datc2[resc2$Subsample_1, ]
datc2_test <- datc2[-resc2$Subsample_1, ]
# Without scaling or centering, fails to learn
mod_c2 <- s_SDA(datc2_train, datc2_test, .preprocess = NULL)
# Learns fine with default settings (scaling & centering)
mod_c2 <- s_SDA(datc2_train, datc2_test)

## End(Not run)
