optimizePLS: Function to optimize the preprocessing steps for a PLS model.

View source: R/optimizePLS.R

Description

This function iterates over all combinations of preprocessing steps and spectral region subsets, fits a PLS model for each, and ranks the resulting models by their Root Mean Squared Error of Prediction (RMSEP), or of Cross-Validation (RMSECV) when cross-validation is used. The ranking informs the choice of parameters when fitting PLS models with calibrate(). The function also determines the optimal rank (number of latent vectors) for each model.
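The search described above can be pictured with a small base-R sketch. It is an illustration under stated assumptions, not the package's implementation: the data are simulated, the two preprocessing functions and two regions are made up, preprocessing is applied after region subsetting (the actual order is not documented here), and principal component regression (prcomp plus lm) stands in for PLS so the example needs no extra packages.

```r
set.seed(42)

# Simulated data: 60 spectra at 50 wavenumbers (all values are made up).
wn    <- seq(9400, 4250, length.out = 50)
X     <- matrix(rnorm(60 * 50), nrow = 60)
y     <- X[, 10] - 0.5 * X[, 30] + rnorm(60, sd = 0.1)
train <- rep(c(TRUE, FALSE), times = c(40, 20))

# Hypothetical preprocessing steps and regions (not the package's own options).
preprocs <- list(raw      = identity,
                 centered = function(m) sweep(m, 1, rowMeans(m)))
regions  <- list(c(9400, 6100), c(6100, 4250))

# Root mean squared error of prediction on the held-out samples.
rmsep <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Stand-in model: principal component regression, NOT PLS.
fit_and_score <- function(Xs, ncomp) {
  pca   <- prcomp(Xs[train, , drop = FALSE], center = TRUE)
  tr    <- data.frame(y = y[train], pca$x[, seq_len(ncomp), drop = FALSE])
  te    <- data.frame(predict(pca, Xs[!train, , drop = FALSE])[, seq_len(ncomp), drop = FALSE])
  model <- lm(y ~ ., data = tr)
  rmsep(y[!train], predict(model, newdata = te))
}

# Every combination of preprocessing step, region, and number of components.
grid <- expand.grid(preproc = names(preprocs), region = seq_along(regions),
                    ncomp = 1:5, stringsAsFactors = FALSE)

grid$RMSEP <- mapply(function(p, r, k) {
  keep <- wn >= min(regions[[r]]) & wn <= max(regions[[r]])
  fit_and_score(preprocs[[p]](X[, keep, drop = FALSE]), k)
}, grid$preproc, grid$region, grid$ncomp)

# Rank the candidates: lowest RMSEP first.
ranked <- grid[order(grid$RMSEP), ]
head(ranked)
```

The first row of `ranked` is the best-scoring combination in this toy grid; including `ncomp` in the grid is one simple way to pick a rank per model.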

Usage

optimizePLS(component, spectra, training_set = NULL, parallel = FALSE,
  region_list = NULL, preprocessing_list = NULL, max_comps = 10)

Arguments

component

A vector of y-values, one for each spectrum.

spectra

An object of class spectra.matrix containing spectra. Rows should be in the same order as the component y-values.

training_set

A logical vector of length(component) specifying TRUE for training/calibration data and FALSE for test/validation set data.

parallel

Logical. If TRUE, validation procedures are run in parallel using the number of available cores minus 1. Defaults to FALSE (no parallelization).

region_list

A list, where each element is a vector of length 2, specifying the range (max/min) of a spectral region to select. Should be in the same units as your spectra (e.g., wavenumbers).

preprocessing_list

A list, where each element is either (1) a single character string specifying a preprocessing step or (2) a vector of length 2 specifying a series of preprocessing steps to be applied together. See documentation for preprocess() for all available options.

max_comps

An integer giving the maximum number of latent vectors to consider in the PLS regression.
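As a rough sketch of how these arguments might be assembled, the snippet below builds a training_set, a region_list, and a preprocessing_list with the shapes described above. The data are simulated, and "step_a" and "step_b" are placeholder names rather than real preprocess() options; consult the preprocess() documentation for the actual values.

```r
set.seed(1)

# Toy response: one y-value per spectrum (values are simulated).
n <- 20
component <- rnorm(n)

# training_set: logical vector of length(component); TRUE marks calibration samples.
training_set <- rep(FALSE, n)
training_set[sample(n, size = 14)] <- TRUE

# region_list: each element is a length-2 vector giving a wavenumber range.
region_list <- list(c(9400, 7500), c(7500, 6100))

# preprocessing_list: single steps, or pairs of steps applied together.
# "step_a" and "step_b" are placeholders, not real preprocess() options.
preprocessing_list <- list("step_a", c("step_a", "step_b"))

# With a spectra.matrix object named `spectra` in hand, the call would look like:
# optimizePLS(component, spectra, training_set = training_set,
#             region_list = region_list,
#             preprocessing_list = preprocessing_list, max_comps = 10)
```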

Details

By default the spectral regions are defined in wavenumbers as list(c(9400,7500), c(7500,6100), c(6100,5450), c(5450,4600), c(4600,4250)).
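To see what these default regions mean in practice, the following sketch counts how many spectral columns fall inside each one. The wavenumber axis is hypothetical (a real one comes from the instrument), and treating both bounds as inclusive is an assumption made for the illustration.

```r
# Default regions quoted above (wavenumbers, written as c(max, min)).
region_list <- list(c(9400, 7500), c(7500, 6100), c(6100, 5450),
                    c(5450, 4600), c(4600, 4250))

# Hypothetical wavenumber axis: 10000 down to 4000 in steps of 50.
wavenumbers <- seq(10000, 4000, by = -50)

# Logical mask of the columns inside one region (inclusive bounds assumed).
in_region <- function(wn, region) wn >= min(region) & wn <= max(region)

# Number of spectral columns selected by each default region.
cols_per_region <- sapply(region_list, function(r) sum(in_region(wavenumbers, r)))
cols_per_region
```

With inclusive bounds, a wavenumber that sits exactly on a shared boundary (such as 7500) is counted in both neighbouring regions.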

Value

Returns an object of class PLSopt: a list containing the optimization results, with the following elements:

optimization_results - a data.frame containing the RMSEP for each combination of preprocessing and subsetting tested.
param_subsets - a list of the regions tried.
param_preproc - a list of the preprocessing steps tried.

Author(s)

Daniel M Griffith

Examples

# See main leaf.spec-package example.

griffithdan/plantspec documentation built on Dec. 9, 2018, 1:26 a.m.