classification: Multiclass classification

View source: R/classification.R

classification    R Documentation

Multiclass classification

Description

Run a multiclass classification algorithm on a given dataset and reference class.

Usage

classification(
  data,
  class,
  algorithms,
  rfe = FALSE,
  ova = FALSE,
  standardize = FALSE,
  sampling = c("none", "up", "down", "smote"),
  seed_samp = NULL,
  sizes = NULL,
  trees = 100,
  tune = FALSE,
  seed_alg = NULL,
  convert = FALSE
)

Arguments

data

data frame with rows as samples, columns as features

class

true/reference class vector used for supervised learning

algorithms

character string of the algorithm to use for supervised learning. See the Algorithms section for possible options.

rfe

logical; if TRUE, run Recursive Feature Elimination as a feature selection method for "lda", "rf", and "svm" algorithms.

ova

logical; if TRUE, use the One-Vs-All approach for the knn algorithm.

standardize

logical; if TRUE, each feature in the training sets is standardized to have mean zero and unit variance. The test sets are standardized using the centering and scaling vectors computed from the corresponding training sets.
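
As a rough illustration of this scheme, a minimal base-R sketch (not the package's internal code; train and test are hypothetical numeric feature matrices):

ctr <- colMeans(train)                   # per-feature centers from the training set
sds <- apply(train, 2, sd)               # per-feature standard deviations from the training set
train_std <- scale(train, center = ctr, scale = sds)
test_std <- scale(test, center = ctr, scale = sds)   # reuse the training centers and scales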

sampling

the default is "none", in which no subsampling is performed. Other options include "up" (Up-sampling the minority class), "down" (Down-sampling the majority class), and "smote" (synthetic points for the minority class and down-sampling the majority class). Subsampling is only applicable to the training set.

seed_samp

random seed used for reproducibility in subsampling training sets for model generation

sizes

the range of feature subset sizes to test in the RFE algorithm

trees

number of trees to use in "rf", or number of boosting iterations in "adaboost"

tune

logical; if TRUE, algorithms with hyperparameters are tuned

seed_alg

random seed used for reproducibility when running algorithms with an intrinsic random element (e.g. random forests)

convert

logical; if TRUE, all categorical variables in data are converted to dummy variables, since certain algorithms (e.g. LDA) only work with numeric inputs.
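
The conversion is analogous to building indicator (dummy) columns with model.matrix(); a minimal sketch, assuming df is a hypothetical data frame with factor columns (not necessarily how the package implements it):

df_dummies <- as.data.frame(model.matrix(~ . - 1, data = df))   # expand factors into 0/1 indicator columns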

Details

Some of the classification algorithms implemented use pre-defined values for their settings and options, while others tune hyperparameters. "multinom" and "nnet" use a maximum of 2000 weights, in case the data is high-dimensional and classification is time-consuming. "nnet" also tunes the number of nodes (1-5) in the hidden layer. "pam" considers 100 thresholds when training and uses a uniform prior. "adaboost" calls maboost::maboost() instead of adabag::boosting() for faster performance; as a result, the "entrop" option is used, which applies the KL-divergence method and mimics adaboost. In contrast, "adaboost_m1" calls adabag::boosting(), which supports hyperparameter tuning.
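
For instance, a hypothetical call that enables tuning for "adaboost_m1" and sets the number of boosting iterations (argument values are illustrative only; hgsc and class are as in the Examples section):

classification(hgsc, class, "adaboost_m1", tune = TRUE, trees = 50)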

When alg = "knn", the return value is NULL because class::knn() does not output an intermediate model object. The modelling and prediction is performed in one step. However, the class attribute "knn" is still assigned to the result in order to call the respective prediction() method. An additional class "ova" is added if ova = TRUE.

Value

The model object from running the classification algorithm

Algorithms

The classification algorithms currently supported are:

  • Prediction Analysis for Microarrays ("pam")

  • Support Vector Machines ("svm")

  • Random Forests ("rf")

  • Linear Discriminant Analysis ("lda")

  • Shrinkage Linear Discriminant Analysis ("slda")

  • Shrinkage Diagonal Discriminant Analysis ("sdda")

  • Multinomial Logistic Regression using

    • Generalized Linear Model with no penalization ("mlr_glm")

    • GLM with LASSO penalty ("mlr_lasso")

    • GLM with ridge penalty ("mlr_ridge")

    • GLM with elastic net penalty ("mlr_enet")

    • Neural Networks ("mlr_nnet")

  • Neural Networks ("nnet")

  • Naive Bayes ("nbayes")

  • Adaptive Boosting ("adaboost")

  • AdaBoost.M1 ("adaboost_m1")

  • Extreme Gradient Boosting ("xgboost")

  • K-Nearest Neighbours ("knn")

Author(s)

Derek Chiu

Examples

data(hgsc)
class <- attr(hgsc, "class.true")
classification(hgsc, class, "xgboost")
