View source: R/subdivideDataset.R
This function accepts spectra in a spectra.list or spectra.matrix object and selects a subset of that dataset.
Importantly, the function can be set to select either a calibration or
validation subset. These are fundamentally different. When you select a
calibration dataset, the intention is to choose a representative subset of
all spectral data on which to perform wet lab analysis. However, when
selecting a subset of samples (which already have wet lab analysis) in order
to validate a model, it is important that both the validation (test) set and
the calibration (training) set are representative; otherwise, the
calibration model will be fit to well-sampled spectral space but validated
on outlying points. The calibration selection uses the Kennard-Stone
algorithm, whereas the validation selection uses the Duplex algorithm, a
modification proposed by the original authors. Finally, this function can
also perform calibration or validation selection using one of five distinct
methods (see the method parameter for details).
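The Kennard-Stone idea described above can be sketched in base R. This is a minimal illustration, assuming Euclidean distance on a plain numeric matrix with one spectrum per row. The helper name kennard_stone and the toy data are hypothetical, and this is not the package's own implementation.

```r
# Minimal Kennard-Stone sketch (illustrative; not the package's code).
# X: numeric matrix, one spectrum per row. k: number of samples to select.
kennard_stone <- function(X, k) {
  stopifnot(is.matrix(X), k >= 2, k <= nrow(X))
  D <- as.matrix(dist(X))              # pairwise Euclidean distances
  # Start with the two most distant samples.
  selected <- as.vector(which(D == max(D), arr.ind = TRUE)[1, ])
  while (length(selected) < k) {
    remaining <- setdiff(seq_len(nrow(X)), selected)
    # For each remaining sample, distance to its nearest selected sample.
    nearest <- apply(D[remaining, selected, drop = FALSE], 1, min)
    # Add the sample that is farthest from everything already selected.
    selected <- c(selected, remaining[which.max(nearest)])
  }
  selected
}

# Hypothetical toy data: 10 "spectra" with 3 variables each.
set.seed(1)
X <- matrix(rnorm(30), nrow = 10)
idx <- kennard_stone(X, k = 4)
is_selected <- seq_len(nrow(X)) %in% idx   # logical vector, TRUE = selected
```

Duplex, as it is commonly described, applies the same max-min rule but alternates between the two sets, so that the calibration and validation sets both spread across the spectral space.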
subdivideDataset(spectra, component = NULL, type = "validation",
  p = 0.2, method = "KS", seed.set = NULL, output = "logical")
spectra | An object of class spectra.list or spectra.matrix.
component | The "SPXY" and "MDKS" methods incorporate Y-value data in subset selection. If using one of these two methods, a vector of Y data should be provided here.
type | One of "calibration" or "validation", depending on the type of subset required.
p | The proportion of the dataset to select as the "calibration" or "validation" group.
method | The desired method. Selected from:
seed.set | A single numeric value. If method is "random", you can set the seed so that the same selection is produced each time.
output | One of "logical" or "names". If "logical", the function returns a logical vector in which TRUE values mark the selected samples. If "names", the names of the selected spectra are returned.
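The effect of fixing a seed for random selection can be illustrated in base R. This is a sketch only: the helper pick_random, the sample count, and the rounding of n * p are assumptions made for illustration, not the package's behaviour.

```r
# Illustrative only: reproducible random selection of a proportion p.
n <- 50          # hypothetical number of spectra
p <- 0.2         # proportion to select, as in the p argument

pick_random <- function(n, p, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # fix the RNG state when a seed is given
  chosen <- sample.int(n, size = round(n * p))
  seq_len(n) %in% chosen               # logical vector, TRUE = selected
}

a <- pick_random(n, p, seed = 123)
b <- pick_random(n, p, seed = 123)
identical(a, b)   # same seed, same selection
```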
A vector. Depending on output, either a logical vector or a list of names indicating the selected spectra.
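The two output forms carry the same information. The base R sketch below, which uses hypothetical sample names and a hand-written selection vector rather than real function output, shows how one can be converted into the other.

```r
# Hypothetical sample names and a logical selection vector.
sample_names <- c("s1", "s2", "s3", "s4", "s5")
selected_lgl <- c(TRUE, FALSE, FALSE, TRUE, FALSE)

# Names of the selected samples, recovered from the logical vector.
selected_names <- sample_names[selected_lgl]

# And back again: which samples appear in the names vector.
back_to_lgl <- sample_names %in% selected_names
```

A logical vector is convenient for subsetting the spectra directly, while names are easier to match against a separate table of wet lab results.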
Daniel M Griffith
Kennard, R. W. and Stone, L. A. (1969) Computer aided design of experiments. Technometrics, 11, 137-148.
Galvao, R., Araujo, M., Jose, G., Pontes, M., Silva, E. and Saldanha, T. (2005) A method for calibration and validation subset partitioning. Talanta, 67, 736-740.
Saptoro, A., Tadé, M. O. and Vuthaluru, H. (2012) A modified Kennard-Stone algorithm for optimal division of data for developing artificial neural network models. Chemical Product and Process Modeling, 7(1), Article 13. DOI: 10.1515/1934-2659.1645
Snee, R. D. (1977) Validation of regression models: methods and examples. Technometrics, 19, 415-428.
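## Not run:
# Load the example dataset
data(shootout)
# Select a validation subset (default p = 0.2) as a logical vector
val_set <- subdivideDataset(spectra = shootout_scans, type = "validation", method = "KS")
# Count selected (TRUE) versus unselected (FALSE) samples
table(val_set)
## End(Not run)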