calc_discrimAnalysis_args: Calculate Discriminant Analysis - Arguments

calc_discrimAnalysis_args    R Documentation

Calculate Discriminant Analysis - Arguments

Description

The following parameters can be passed via the ... argument of function getap, and thereby also within function gdmm, to override the values in the analysis procedure file and thus to modify how discriminant analysis models are calculated; see examples.

Usage

getap(...)
gdmm(dataset, ap=getap(...))

Arguments

do.da

Logical. If used in getap: whether classification via discriminant analysis (lda, qda, fda, MclustDA) should be performed on the given dataset.

da.type

Character vector. The type of discriminant analysis (DA) to perform; possible values (one or more) are: 'lda', 'qda', 'fda', 'mclustda':

  • lda: Linear DA using lda.

  • qda: Quadratic DA using qda.

  • fda: Flexible DA using fda.

  • mclustda: DA based on Gaussian finite mixture modeling using MclustDA.

da.classOn

Character vector. One or more class variables to define the grouping used for classification.

da.testCV

Logical. Whether the errors of the test data should be crossvalidated. If set to TRUE, CV and testing are repeated on alternating datasets. See below.

da.percTest

Numeric length one. The percentage of the dataset that should be set aside for testing the models; these data are never seen during training and crossvalidation.

da.cvBootCutoff

The minimum number of observations (W) that should be in the smallest subgroup (as defined by the classification grouping variable) *AFTER* the split into da.valid crossvalidation segments (below). If W is equal to or higher than da.cvBootCutoff, the crossvalidation is done by splitting the training data into da.valid (see below) segments; otherwise the crossvalidation is done via bootstrap resampling, with the number of bootstrap iterations given by the number of observations in this smallest subgroup in *all* of the training data multiplied by da.cvBootFactor. To never perform the CV of the training data via bootstrap, set the parameter cl_gen_neverBootstrapForCV in the settings.r file to TRUE.

An example: With da.cvBootCutoff set to 15 and an 8-fold crossvalidation (da.valid <- 8), the required minimum number of observations in the smallest subgroup *after* the split into 8 segments is 15, so the smallest subgroup needs (8 x 15 =) 120 observations in all the training data for the desired 8-fold CV to be performed. In that case, 8 times 15 observations form the test data that are projected into models made from the remaining (120 - 15 =) 105 observations. If there are fewer than 120 observations, for example only 100 observations in the smallest subgroup, bootstrap resampling with da.cvBootFactor * 100 iterations is performed. In this example, if a 5-fold crossvalidation were also acceptable, there would be enough data: 100 / 5 = 20, which is above the da.cvBootCutoff value of 15, so the 5-fold crossvalidation would be performed.
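The decision rule described for da.cvBootCutoff can be sketched in a few lines of base R. This only illustrates the arithmetic and is not the code used inside aquap2: the function name choose_cv_scheme and its argument names are hypothetical, and rounding W down to a whole number is an assumption of this sketch.

```r
# Hypothetical helper, not part of aquap2: mirrors the rule described above.
# n_smallest: observations of the smallest class in all of the training data
choose_cv_scheme <- function(n_smallest, valid, cv_boot_cutoff, cv_boot_factor) {
  # W: observations of the smallest class per segment after the split
  # (rounding down is an assumption of this sketch)
  w <- floor(n_smallest / valid)
  if (w >= cv_boot_cutoff) {
    # enough data: crossvalidate by splitting into 'valid' segments
    list(method = "segments", n_segments = valid, w = w)
  } else {
    # too few data: fall back to bootstrap resampling
    list(method = "bootstrap", iterations = round(n_smallest * cv_boot_factor))
  }
}

# The three cases from the example, with da.cvBootCutoff = 15
# and an illustrative da.cvBootFactor of 1
res_120_8 <- choose_cv_scheme(120, valid = 8, cv_boot_cutoff = 15, cv_boot_factor = 1)
res_100_8 <- choose_cv_scheme(100, valid = 8, cv_boot_cutoff = 15, cv_boot_factor = 1)
res_100_5 <- choose_cv_scheme(100, valid = 5, cv_boot_cutoff = 15, cv_boot_factor = 1)
```

With these inputs, the first and third call select the segment-based CV (W = 15 and W = 20), while the second falls back to bootstrap resampling with 100 iterations (W = 12).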

da.cvBootFactor

The factor by which the number of observations within the smallest subgroup (as defined by the classification grouping variable) is multiplied, giving the number of iterations of a possible bootstrap crossvalidation of the training data; see da.cvBootCutoff.

da.valid

The number of segments the training data should be divided into in case of a "traditional" crossvalidation of the training data; see above.

da.pcaRed

Logical. Whether variable reduction via PCA should be applied; if TRUE, the subsequent classifications are performed on the PCA scores; see da.pcaNComp below.

da.pcaNComp

Character or integer vector. Provide the character "max" to use the maximum number of components (i.e. the number of observations minus 1), or an integer vector specifying the components (that is, their scores) to be used for DA.
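How the two forms of da.pcaNComp translate into a selection of PCA scores can be illustrated with base R. The sketch below uses stats::prcomp on random data; the helper resolve_ncomp is hypothetical, and aquap2 may compute and select its PCA scores differently.

```r
# Hypothetical helper, not part of aquap2: turn a da.pcaNComp-style value
# into the indices of the components whose scores are kept
resolve_ncomp <- function(n_comp, n_obs) {
  if (identical(n_comp, "max")) {
    seq_len(n_obs - 1)     # "max": number of observations minus 1
  } else {
    as.integer(n_comp)     # explicit component indices
  }
}

set.seed(1)
X <- matrix(rnorm(20 * 50), nrow = 20)   # 20 observations, 50 variables
pca <- prcomp(X)                         # scores are in pca$x

# Scores that would be handed on to the discriminant analysis
scores_max <- pca$x[, resolve_ncomp("max", nrow(X)), drop = FALSE]
scores_sel <- pca$x[, resolve_ncomp(c(1, 2, 5), nrow(X)), drop = FALSE]
dim(scores_max)   # 20 rows, 19 columns
dim(scores_sel)   # 20 rows, 3 columns
```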

Details

For a list of all parameters that can be used in the ... argument in getap and in the plot functions please see anproc_file.
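A call overriding the discriminant analysis settings might look like the following sketch. The argument names are those documented above; all values, including the class variable name "C_Group", are purely illustrative, and the object dataset is assumed to have been loaded beforehand in a working aquap2 experiment. The calls to getap and gdmm are guarded so that the snippet does nothing when aquap2 or the dataset is not available.

```r
# Illustrative values only; adapt them to your own experiment
da_args <- list(
  do.da = TRUE,
  da.type = c("lda", "qda"),
  da.classOn = "C_Group",    # hypothetical class variable name
  da.testCV = TRUE,
  da.percTest = 30,
  da.cvBootCutoff = 15,
  da.cvBootFactor = 1,
  da.valid = 8,
  da.pcaRed = TRUE,
  da.pcaNComp = 1:6
)

# Only runs if aquap2 is installed and an object 'dataset' exists
if (requireNamespace("aquap2", quietly = TRUE) && exists("dataset")) {
  ap <- do.call(aquap2::getap, da_args)     # equivalent to getap(do.da = TRUE, ...)
  cube <- aquap2::gdmm(dataset, ap = ap)
}
```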

See Also

gdmm, siWlg for reducing the number of wavelengths in a dataset

Other Calc. arguments: calc_NNET_args, calc_SVM_args, calc_aqg_args, calc_pca_args, calc_pls_args, calc_randomForest_args, calc_sim_args, split_dataset

Other Classification functions: calc_NNET_args, calc_SVM_args, calc_randomForest_args, plot_classifX_indepPred()

Other DA documentation: plot_da,aquap_cube-method, plot_discrimAnalysis_args


bpollner/aquap2 documentation built on June 29, 2024, 5:21 p.m.