The analysis procedure file is used to first split the dataset
according to the values provided in the 'split dataset' section, and
then, in the 'statistics' section (starting with do.pca), to tell
the system which statistics to apply and which models to calculate on those
datasets. It also contains specific and general plotting options that are used
by the plot function.
Arguments used to control the split process, the behaviour of statistics /
calculations / specific plotting options and the general plotting options
start with a certain prefix:
"spl" for all arguments related to the split process.
(For a separate listing, see split_dataset.)
"pca" for all arguments related to PCA models (except do.pca).
(For a separate listing, see calc_pca_args and plot_pca_args.)
"sim" for all arguments related to SIMCA models (except do.sim).
(For a separate listing, see calc_sim_args and plot_sim_args.)
"pls" for all arguments related to PLSR models (except do.pls).
(For a separate listing, see calc_pls_args and plot_pls_args.)
"aqg" for all arguments related to Aquagrams (except do.aqg).
(For a separate listing, see calc_aqg_args and plot_aqg_args.)
"da" for all arguments related to Discriminant Analysis classification
(except do.da). (For a separate listing, see calc_discrimAnalysis_args
and plot_discrimAnalysis_args.)
"rnf" for all arguments related to RandomForest classification
(except do.rnf). (For a separate listing, see calc_randomForest_args
and plot_randomForest_args.)
"svm" for all arguments related to Support Vector Machines
classification (except do.svm). (For a separate listing, see
calc_SVM_args and plot_SVM_args.)
"nnet" for all arguments related to Neural Networks classification
(except do.nnet). (For a separate listing, see calc_NNET_args
and plot_NNET_args.)
"pg" for the general plotting options that are used in each of the
plotting functions. (For a separate listing, see plot_pg_args.)
By providing any of the arguments of the analysis procedure file to the
function getap, also when using it inside the function gdmm, you can
override the values in the file with the provided values. See examples
at gdmm.
spl.var 
NULL or character vector. If NULL, no splitting of the dataset will be performed. Provide a character vector with the column names of class variables to split the dataset along these variables. 
spl.wl 
NULL or character vector. If NULL, all wavelengths available in the dataset will be used. Provide a character vector in the format "wlFromtowlTo" (e.g. c("1000to2000", "1300to1600", ...)) to use all previously defined splits in these wavelengths. 
dpt.pre 
Character vector, which of the available modules of data
pretreatments to apply AFTER a (possible) split by variable

spl.do.csAvg 
Logical. If all the consecutive scans of a single sample should be reduced, i.e. averaged into a single spectrum. 
spl.csAvg.raw 
Logical. If, should the consecutive scans of a single sample be reduced, another dataset containing every single consecutive scan should be kept as well. 
spl.do.noise 
Logical. If artificial noise should be added to the dataset. 
spl.noise.raw 
If, should the noise test be performed, the raw data will be used in addition to the noise data. 
spl.do.exOut 
Logical. If exclusion of outliers should be performed. 
spl.exOut.raw 
Logical. If, should exclusion of outliers be performed, the raw original data should be used as well. If set to TRUE, outliers will be flagged in the dataset in any case. 
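As a rough illustration of the split-related arguments described so far, the entries might look as follows in the analysis procedure file. This is only a sketch: it assumes that the file consists of plain R assignments (the actual layout of the file may differ), and the column names "C_Group" and "C_Temp" are hypothetical class variables.

```r
# Sketch of split-related entries; assumes plain R assignments in the file.
# "C_Group" and "C_Temp" are hypothetical class-variable column names.
spl.var <- c("C_Group", "C_Temp")        # split along these class variables; NULL = no splitting
spl.wl <- c("1000to2000", "1300to1600")  # ranges in the format "wlFromtowlTo"; NULL = all wavelengths
spl.do.csAvg <- TRUE    # average the consecutive scans of each sample into a single spectrum
spl.csAvg.raw <- TRUE   # additionally keep a dataset with every single consecutive scan
spl.do.noise <- FALSE   # do not add artificial noise
spl.noise.raw <- FALSE  # only relevant if spl.do.noise is TRUE
spl.do.exOut <- TRUE    # perform exclusion of outliers
spl.exOut.raw <- TRUE   # also use the raw original data; outliers are then flagged in any case
```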
spl.exOut.var 
Character vector. The variables that should be used
for the grouping defining the scope for outlier detection. The name of the
resulting column consists of the class variable prefix (as defined in the
settings.r file in 
dpt.post 
Character vector, which of the available modules of data
pretreatments to apply AFTER (possibly) splitting the dataset. Leave
at NULL for no additional data treatment. Possible values are
'sgol', 'snv', 'msc', 'emsc', 'osc', 'deTr', 'gsd'. Add additional parameters to some of the
single strings via the separator '@'. For examples and further information
see 
do.pca 
Logical. If used in 
do.pca 
Logical. If used in a plotting function, if PCA score / loading plots should be plotted. 
pca.colorBy 
NULL or character vector. Which class variables should be used for coloring the PCA score plot. Set to NULL for using all available class variables for coloring. 
pca.elci 
'def' or numeric length one. The confidence interval for the ellipse to be drawn around groups in score plots. Leave at 'def' to read in the default from the settings.r file; provide a numeric length one (e.g. 0.95); or set to NULL for not drawing ellipses at all. 
pca.elcolorBy 
Character vector or NULL. The variables to use for
plotting additional confidence interval ellipses. Set to NULL for *not*
drawing additional CI ellipses. Provide one variable (gets recycled) or a
vector with equal length as 
pca.what 
Character length one. What element of the PCA analysis to plot. Possible values are 'both', 'scores', 'loadings'. 
pca.sc 
Numeric length 2. Two PCs to be plotted against each other in the score plots. 
pca.sc.pairs 
Numeric vector of length >=2, indicating what PCs to plot in the score pairs plot. Set to NULL for *not* plotting the pairs plot. 
pca.lo 
Numeric vector of length >=2, indicating what PCs to plot in the loading plot. 
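The PCA arguments above could be set along the following lines. This is a sketch under the assumption that the analysis procedure file consists of plain R assignments; the chosen values are illustrative only and are not the package defaults.

```r
# Sketch of PCA-related entries; values are illustrative, not defaults.
do.pca <- TRUE
pca.colorBy <- NULL         # NULL = use all available class variables for coloring
pca.elci <- 0.95            # or "def" to read the default from settings.r, or NULL for no ellipses
pca.what <- "both"          # one of "both", "scores", "loadings"
pca.sc <- c(1, 2)           # plot PC1 against PC2 in the score plot
pca.sc.pairs <- c(1, 2, 3)  # PCs for the score pairs plot; NULL = no pairs plot
pca.lo <- c(1, 2)           # PCs for the loading plot
```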
do.sim 
Logical. If used in 
sim.vars 
NULL or character vector. Which variables should be used to group the data. Set to NULL for using all available class variables, or provide a character vector with the column names of class variables to group the data along those for calculating SIMCA models. 
sim.K 
Numeric length one. The number of components used for calculating the SIMCA models. In mode 'robust' leave at '0' for automatic detection of optimal number of components. [It is a capital 'K' in the argument.] 
do.sim 
Logical. If used in a plotting function, if analysis of SIMCA models should be plotted. 
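A minimal sketch of the SIMCA entries, again assuming plain R assignments in the analysis procedure file; the values are illustrative.

```r
# Sketch of SIMCA-related entries; values are illustrative.
do.sim <- TRUE
sim.vars <- NULL  # NULL = group by all available class variables
sim.K <- 0        # capital 'K'; in mode 'robust', 0 = automatic detection of the number of components
```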
do.pls 
Logical. If used in 
pls.regOn 
NULL or character vector. Which variables should be used to regress on. Set to NULL for using all numerical variables to regress on, or provide a character vector with the column names of numerical variables to use those for regression in the PLSR. 
pls.ncomp 
NULL or integer length one. The number of components used in PLSR. Set to NULL for automatic detection, or provide an integer to use this number of components in the PLSR. 
pls.valid 
Character. Which cross-validation to use. Possible values are:
If a vector with the same length as the vector in 
pls.exOut 
Logical. If a PLSR-specific, boxplot-based outlier detection algorithm should be used on the data of a first PLSR model to determine the outliers that will then be excluded in the final PLSR model. Possible values are:
If a vector with the same length as the vector in 
do.pls 
Logical. If used in a plotting function, if analysis from PLSR models should be plotted. 
pls.colorBy 
NULL or character. What class variable should be used for coloring in the RMSEC and RMSECV plots. Set to NULL for no coloring, or provide a character length one with a single column name of a class variable that should be used for coloring. 
pls.what 
What types of plsr analysis to plot. Possible values are 'both', 'errors', 'regression'. 
pls.rdp 
Logical (TRUE or FALSE). If errors in the error plots should be given in RDP or not. 
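For orientation, the PLSR entries might be filled in as below. This sketch assumes plain R assignments in the analysis procedure file; "Y_Conc" and "C_Group" are hypothetical column names, and pls.valid and pls.exOut are left out because their possible values are not listed here.

```r
# Sketch of PLSR-related entries; column names are hypothetical.
do.pls <- TRUE
pls.regOn <- c("Y_Conc")  # numerical variable(s) to regress on; NULL = all numerical variables
pls.ncomp <- NULL         # NULL = automatic detection of the number of components
pls.colorBy <- "C_Group"  # single class variable for coloring the RMSEC / RMSECV plots; NULL = none
pls.what <- "both"        # one of "both", "errors", "regression"
pls.rdp <- FALSE          # TRUE = give the errors in the error plots in RDP
```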
do.aqg 
Logical. If used in 
aqg.vars 
NULL or character vector. Which class variables should be used for grouping the data for the Aquagram. Provide a character vector with the column names of one or more class variables for grouping data and generate an Aquagram for every one of them. 
aqg.nrCorr 
Character or Logical. If the number of observations in each spectral pattern should be corrected (if necessary by random sampling) so that all the spectral patterns are calculated from the same number of observations. If left at the default "def", the default value from the settings will be used. Provide "TRUE" or "FALSE" to switch number correction manually on or off. 
aqg.spectra 
Logical or Character. If left at "FALSE" (the default) no additional spectra are calculated / prepared for plotting. Other possible values are one or more of:

aqg.minus 
Character length one, character vector or NULL. Which of the
levels present in each of the classvariables provided in 
aqg.mod 
Character. What mode, what kind of Aquagram should be calculated?
Possible values are: 'classic', 'classicdiff', 'sfc', 'sfcdiff', 'aucs', 'aucsdiff', 'aucs.tn', 'aucs.tndiff', 'aucs.tn.dce', 'aucs.tn.dcediff', 'aucs.dce', 'aucs.dcediff', and 'def' for reading in
the default from settings.r. Please see 
aqg.TCalib 
Character, numeric or NULL. The default (leave at 'def') can be
set in the settings. If 'NULL' the complete temperature range of the
calibration data is used for calibration. Provide a numeric length two
[c(x1, x2)] for manually determining the calibration range. Provide a
character '[email protected]', with 'x' being the plus and minus delta in temperature
from the temperature of the experiment for having a calibration range from
Texp-x to Texp+x. The 'Factory' default is '[email protected]'.
Applies to all modes except the 'classic' and 'sfc' modes.
If, in any of the modes showing percentages, the numbers on the
Aquagram are below 0 or above 100, then the calibration range has to be
extended. To record your own temperature calibration spectra, please see

aqg.Texp 
Numeric length one. The temperature at which the
spectra were taken. The default (leave at 'def') can be set in the settings.
Please see also 
aqg.bootCI 
Logical. If confidence intervals for the selected wavelengths should be calculated within each group (using bootstrap). Leave at 'def' for getting the default from the settings. 
aqg.R 
Character or numeric. Given aqg.bootCI = TRUE, how many bootstrap replicates should be performed? Leave at 'def' to choose the default from the settings, where the factory default is "[email protected]" for 3 x nrow(samples). By manually providing a character in the form of '[email protected]', where x is any number, you can set the factor by which the number of rows is multiplied; the result of this multiplication is then used as the number of bootstrap replicates. By providing a length one numeric you can directly set the number of bootstrap replicates. 
aqg.smoothN 
Only used in the 'classic' and 'sfc' modes. Numeric length 1. Must be odd. Smoothing points for the Savitzky-Golay smoothing that is applied before making the calculations. Change to NULL or anything non-numeric to switch off smoothing. 
aqg.selWls 
Only used in the 'classic' and 'sfc' modes. Numerical vector. If provided and in the modes "classic", "classicdiff", "sfc" and "sfcdiff", these numbers will be used to determine the coordinates of the Aquagram. Leave at 'def' to use the defaults from the settings file. 
aqg.msc 
Only used in the 'classic' and 'sfc' modes. Logical. If MSC should be performed. 
aqg.reference 
Only used in the 'classic' and 'sfc' modes. An optional numerical vector (loadings, etc..) used for MSC. 
do.aqg 
Logical. If used in a plotting function, if Aquagrams should be plotted. 
aqg.fsa 
'Fix scale for Aquagram'. Logical, numeric or character. If left at the default logical FALSE, every single Aquagram will be plotted in its own, independent scale. If a numeric vector of length two is provided, all the Aquagrams to be plotted (normal AND bootstrapped ones) will be in the provided range, and no independently scaled Aquagrams will be plotted. If character, the following values are possible:

aqg.fss 
'Fix scale for subtraction spectra'. Logical, numeric or character. If left at the default logical FALSE, every single subtraction spectra plot will be plotted in its own, independent scale. If a numeric vector of length two is provided, all the subtraction spectra to be plotted (if 'plotSpectra' contains 'subtr', and 'minus' contains a valid value) will be in the provided range, and no independently scaled subtraction spectra will be plotted. If character, the following values are possible:

aqg.ccol 
Custom color. NULL, numeric or character vector. Custom colors for drawing the lines in the Aquagram. Length must exactly match the number of groups to be plotted in the Aquagram. If not, the default coloring from the dataset is used. This can be used when plotting Aquagrams with different numbers of groups: only the group that matches the number of provided custom colors is colored differently. Especially useful when you have more than 8 lines to be plotted: custom-color similar groups in similar colors. 
aqg.clt 
Character or Integer vector. Custom line type for plotting the lines in the Aquagram. If left at the default 'def', the vector provided in the settings.r file is taken (and recycled). If an integer vector is provided, this is used (and recycled) as linetypes in the Aquagram. 
aqg.pplot 
Logical or character 'def'. If, should spectra be plotted, an additional plot with picked peaks should be added. If left at the default value 'def', the default from the settings.r file is used. 
aqg.plines 
Logical, numeric or character 'def'. If set to 
aqg.discr 
Logical or character 'def'. If set to TRUE, negative (resp. positive) peaks can be only found in peak heights below (resp. above) zero. 
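To make the interplay of some Aquagram arguments more concrete, here is a hedged sketch of possible entries. It assumes plain R assignments in the analysis procedure file; "C_Group" is a hypothetical class variable, and the temperature, delta and replicate values are invented for illustration. The calibration range is given as a numeric length two and built as Texp-x to Texp+x by hand.

```r
# Sketch of Aquagram-related entries; all concrete values are illustrative.
do.aqg <- TRUE
aqg.vars <- c("C_Group")  # hypothetical class variable; one Aquagram per variable
aqg.nrCorr <- "def"       # read the number-correction default from the settings
aqg.mod <- "aucs"         # one of the modes listed above, or "def"
aqg.Texp <- 25            # temperature at which the spectra were taken (invented value)
delta <- 5                # invented plus / minus delta around aqg.Texp
aqg.TCalib <- c(aqg.Texp - delta, aqg.Texp + delta)  # manual calibration range (numeric length two)
aqg.bootCI <- TRUE        # calculate bootstrap confidence intervals within each group
aqg.R <- 1000             # length one numeric = number of bootstrap replicates, set directly
aqg.fsa <- FALSE          # every Aquagram in its own, independent scale
aqg.fss <- FALSE          # every subtraction spectra plot in its own scale
aqg.clt <- "def"          # line types from settings.r (recycled)
```

If the percentages shown in such an Aquagram fall below 0 or above 100, widening the range in aqg.TCalib (here by increasing delta) would be the first thing to try, as described for aqg.TCalib above.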
do.da 
Logical. If used in 
da.type 
Character vector. The type of discriminant analysis (DA) to
perform; possible values (one or more) are:

da.classOn 
Character vector. One or more class variables to define the grouping used for classification. 
da.testCV 
Logical, if the errors of the test data should be cross-validated. If set to TRUE, CV and testing are repeated in alternating datasets. See below. 
da.percTest 
Numeric length one. The percentage of the dataset that should be set aside for testing the models; these data are never seen during training and crossvalidation. 
da.cvBootCutoff 
The minimum number of observations (W) that should be
in the smallest subgroup (as defined by the classification grouping variable)
*AFTER* the split into 
da.cvBootFactor 
The factor by which the number of observations within the smallest
subgroup (as defined by the classification grouping variable) is
multiplied, resulting in the number of iterations of a possible bootstrap
cross-validation of the training data – see 
da.valid 
The number of segments the training data should be divided into in case of a "traditional" crossvalidation of the training data; see above. 
da.pcaRed 
Logical, if variable reduction via PCA should be applied; if
TRUE, the subsequent classifications are performed on the PCA scores, see

da.pcaNComp 
Character or integer vector. Provide the character "max" to use the maximum number of components (i.e. the number of observations minus 1), or an integer vector specifying the components resp. their scores to be used for DA. 
reserved 
– No plotting parameter yet defined – 
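The classification arguments follow the same pattern for all four classifiers (prefixes "da", "rnf", "svm" and "nnet"). As a sketch for the "da" prefix, assuming plain R assignments in the analysis procedure file and a hypothetical class variable "C_Group"; the numbers are illustrative, and it is an assumption here that da.percTest is given in percent rather than as a fraction.

```r
# Sketch of discriminant-analysis entries; values are illustrative.
do.da <- TRUE
da.classOn <- c("C_Group")  # hypothetical class variable defining the groups to classify
da.testCV <- TRUE           # cross-validate the errors of the test data
da.percTest <- 30           # share of the dataset set aside for testing (assumed to be in percent)
da.valid <- 8               # number of segments for a "traditional" cross-validation
da.pcaRed <- TRUE           # classify on PCA scores instead of the original variables
da.pcaNComp <- c(1:10)      # components whose scores are used; or "max"
```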
do.rnf 
Logical. If used in 
rnf.classOn 
Character vector. One or more class variables to define the grouping used for classification. 
rnf.testCV 
Logical, if the errors of the test data should be cross-validated. If set to TRUE, CV and testing are repeated in alternating datasets. See below. 
rnf.percTest 
Numeric length one. The percentage of the dataset that should be set aside for testing the models; these data are never seen during training and crossvalidation. 
rnf.cvBootCutoff 
The minimum number of observations (W) that should be
in the smallest subgroup (as defined by the classification grouping variable)
*AFTER* the split into 
rnf.cvBootFactor 
The factor by which the number of observations within the smallest
subgroup (as defined by the classification grouping variable) is
multiplied, resulting in the number of iterations of a possible bootstrap
cross-validation of the training data – see 
rnf.valid 
The number of segments the training data should be divided into in case of a "traditional" crossvalidation of the training data; see above. 
rnf.pcaRed 
Logical, if variable reduction via PCA should be applied; if
TRUE, the subsequent classifications are performed on the PCA scores, see

rnf.pcaNComp 
Character or integer vector. Provide the character "max" to use the maximum number of components (i.e. the number of observations minus 1), or an integer vector specifying the components resp. their scores to be used for random forest classification. 
reserved 
– No plotting parameter yet defined – 
do.svm 
Logical. If used in 
svm.classOn 
Character vector. One or more class variables to define the grouping used for classification. 
svm.testCV 
Logical, if the errors of the test data should be cross-validated. If set to TRUE, CV and testing are repeated in alternating datasets. See below. 
svm.percTest 
Numeric length one. The percentage of the dataset that should be set aside for testing the models; these data are never seen during training and crossvalidation. 
svm.cvBootCutoff 
The minimum number of observations (W) that should be
in the smallest subgroup (as defined by the classification grouping variable)
*AFTER* the split into 
svm.cvBootFactor 
The factor by which the number of observations within the smallest
subgroup (as defined by the classification grouping variable) is
multiplied, resulting in the number of iterations of a possible bootstrap
cross-validation of the training data – see 
svm.valid 
The number of segments the training data should be divided into in case of a "traditional" crossvalidation of the training data; see above. 
svm.pcaRed 
Logical, if variable reduction via PCA should be applied; if
TRUE, the subsequent classifications are performed on the PCA scores, see

svm.pcaNComp 
Character or integer vector. Provide the character "max" to use the maximum number of components (i.e. the number of observations minus 1), or an integer vector specifying the components resp. their scores to be used for SVM classification. 
reserved 
– No plotting parameter yet defined – 
do.nnet 
Logical. If used in 
nnet.classOn 
Character vector. One or more class variables to define the grouping used for classification. 
nnet.testCV 
Logical, if the errors of the test data should be cross-validated. If set to TRUE, CV and testing are repeated in alternating datasets. See below. 
nnet.percTest 
Numeric length one. The percentage of the dataset that should be set aside for testing the models; these data are never seen during training and crossvalidation. 
nnet.cvBootCutoff 
The minimum number of observations (W) that should be
in the smallest subgroup (as defined by the classification grouping variable)
*AFTER* the split into 
nnet.cvBootFactor 
The factor by which the number of observations within the smallest
subgroup (as defined by the classification grouping variable) is
multiplied, resulting in the number of iterations of a possible bootstrap
cross-validation of the training data – see 
nnet.valid 
The number of segments the training data should be divided into in case of a "traditional" crossvalidation of the training data; see above. 
nnet.pcaRed 
Logical, if variable reduction via PCA should be applied; if
TRUE, the subsequent classifications are performed on the PCA scores, see

nnet.pcaNComp 
Character or integer vector. Provide the character "max" to use the maximum number of components (i.e. the number of observations minus 1), or an integer vector specifying the components resp. their scores to be used for nnet classification. 
reserved 
– No plotting parameter yet defined – 
pg.where 
Character length one. If left at the default 'def', the value
from the settings.r file is read in (parameter 
pg.main 
Character length one. The additional text on the title of each single plot. 
pg.sub 
Character length one. The additional text on the subtitle of each single plot. 
pg.fns 
Character length one. The additional text in the filename of the pdf. 
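The general plotting options could be set as in the following sketch, which assumes plain R assignments in the analysis procedure file; the text values are invented.

```r
# Sketch of general plotting entries; text values are invented.
pg.where <- "def"          # read the plot destination from settings.r
pg.main <- "Experiment A"  # additional text in the title of each plot
pg.sub <- ""               # no additional subtitle text
pg.fns <- "_expA"          # additional text in the filename of the pdf
```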
The default name for the analysis procedure file can be set in
settings.r. Any other .r file can be loaded by providing a valid .r filename
to the appropriate argument, e.g. in the function getap.
By providing any of the arguments of the analysis procedure file to the
function getap, also when using it inside the function gdmm, or to any of
the plot functions, you can override the values in the file with the
provided values. See examples at gdmm and plot.
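The override behaviour can be pictured with a small stand-alone sketch in base R. It does not call the package: ap_file is a hypothetical stand-in for the values read from an analysis procedure file, and override_ap is a hypothetical helper that only mimics the idea that values passed as arguments take precedence over values from the file.

```r
# Hypothetical stand-in for values read from an analysis procedure file
ap_file <- list(spl.var = NULL, do.pca = TRUE, pca.what = "both")

# Hypothetical helper: arguments given in ... replace the values from the file
override_ap <- function(ap, ...) {
  # keep.null = TRUE so that an explicit NULL is stored instead of dropping the entry
  modifyList(ap, list(...), keep.null = TRUE)
}

# "C_Group" is a hypothetical class-variable column name
ap <- override_ap(ap_file, spl.var = "C_Group", pca.what = "scores")
str(ap)
```

Entries that are not passed explicitly (here do.pca) keep the value from the file.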
As the AUC modes of the Aquagram compare the actual data to
your previously recorded temperature calibration data (see
genTempCalibExp and tempCalib_procedures), the application of some
data treatment functions (see e.g. do_gapDer) can lead to unexpected
and distorted results in the Aquagram.
Other fileDocs: metadata_file, settings_file