dpt_modules: Data pre-treatment modules

dpt_modules R Documentation

Data pre-treatment modules


Apply one or more data pre-treatment modules to the dataset. The pre-treatment can take place at two points. As dpt.pre, it happens right after a (possible) wavelength split and before a (possible) splitting of the dataset according to the split variables provided below dpt.pre (csAvg, noise and exOut). As dpt.post, it happens after the splitting of the dataset, i.e. as a final treatment of the dataset in each 'row' of the 'cube' (each 'cube element'). Both can be used together.



sgol: Transform the dataset using the Savitzky-Golay filter by calling do_sgolay (which in turn relies on sgolay). Provide only the character string "sgol" to use the default values for p, n and m, which are 2, 21 and 0, respectively. To supply your own values, use a string in the format "sgol@p-n-m", with p, n and m being integers. Note that 'n' has to be odd, and that the integers have to be separated by a minus sign ('-').


p: The filter order. Defaults to 2.


n: The filter length, must be odd. Defaults to 21.


m: Return the m-th derivative of the filter coefficients. Defaults to 0.


snv: Transform the dataset using the standard normal variate ('snv') transformation by internally calling the function do_snv. No additional arguments can be provided here.
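To illustrate what the 'snv' module computes: the standard normal variate transformation centres each spectrum on its own mean and scales it by its own standard deviation. The following is a minimal base-R sketch on made-up numbers. The helper snv_sketch is hypothetical and not part of aquap2; do_snv works on aquap_data objects and may differ in its details.

```r
# Toy spectra: 3 samples (rows) x 5 wavelengths (columns); made-up numbers.
X <- rbind(
  c(1.0, 1.2, 1.5, 1.3, 1.1),
  c(2.0, 2.4, 3.0, 2.6, 2.2),  # row 1 multiplied by 2
  c(0.5, 0.9, 1.4, 1.0, 0.6)
)

# Hypothetical helper (an assumption, not the aquap2 implementation):
# centre each row on its own mean and divide by its own standard deviation.
snv_sketch <- function(X) {
  (X - rowMeans(X)) / apply(X, 1, sd)
}

X_snv <- snv_sketch(X)
round(rowMeans(X_snv), 10)  # row means after the transformation
apply(X_snv, 1, sd)         # row standard deviations after the transformation
```

Because the second toy spectrum is just a scaled copy of the first, both end up with the same values after the transformation, which is the kind of multiplicative effect this treatment is meant to remove.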


msc: Transform the dataset using multiplicative scatter correction ('msc') by internally calling the function do_msc. If no reference is provided, the average of all spectra of the dataset is used as the reference for the baseline correction. To use your own reference, provide it in the format "msc@yourObj", with "yourObj" naming an existing object in your workspace that contains an object of class aquap_data with only one row (as produced, e.g., by do_avg).


Format: "msc@yourObj", with 'yourObj' naming an existing object in your workspace containing an object of class aquap_data with only one row. See also getcd for extracting a single dataset from the 'cube' object, and do_avg for averaging a single dataset into a single spectrum.
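The role of the reference spectrum can be illustrated with a textbook form of multiplicative scatter correction: each spectrum is regressed on the reference, and the fitted offset and slope are then removed. This is a base-R sketch on made-up numbers. The helper msc_sketch is hypothetical, takes a plain numeric vector as reference instead of an aquap_data object, and is not claimed to match the internals of do_msc.

```r
# Toy spectra: rows are samples, columns are wavelengths; made-up numbers.
r1 <- c(1.0, 1.2, 1.5, 1.3, 1.1)
r2 <- 2 * r1 + 0.3              # same shape, with added offset and scaling
r3 <- c(0.5, 0.9, 1.4, 1.0, 0.6)
X <- rbind(r1, r2, r3)

# Hypothetical helper (an assumption, not the aquap2 implementation):
# fit x = a + b * ref for every spectrum x and return (x - a) / b.
# Without a reference, the average of all spectra is used.
msc_sketch <- function(X, ref = colMeans(X)) {
  t(apply(X, 1, function(x) {
    cf <- coef(lm(x ~ ref))     # cf[[1]] = offset a, cf[[2]] = slope b
    (x - cf[[1]]) / cf[[2]]
  }))
}

X_avg <- msc_sketch(X)            # reference: average of all spectra
X_ref <- msc_sketch(X, ref = r1)  # reference: one user-supplied spectrum
```

In this toy data the second spectrum differs from the first only by an offset and a scaling factor, so the two coincide after correction. When a single spectrum is supplied as reference, that spectrum itself is returned essentially unchanged.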


emsc: Transform the dataset using 'emsc'; internally the function do_emsc is called. You have to provide the name of an object containing a data frame or matrix with one or two loadings or with one regression vector, in the format "emsc@yourObj".


Format: "emsc@yourObj", with 'yourObj' naming an existing object in your workspace containing a data frame or matrix with one or two loadings or with one regression vector. See also getcm for extracting single models from the 'cube' object.


osc: Not yet implemented.


deTr: Transform the dataset by calling the de-trend function do_detrend. Provide only the character string "deTr" to use the default values for src and trg, which are 'NULL' and 'src': the whole wavelength range of the current dataset is used for calculating the de-trend values, which in turn are applied to the whole range of wavelengths. Use a string in one of the following formats to modify the values for source and target in the de-trend function:


Format: "deTr@S1-S2", set the source wavelength-range for calculating the de-trend values from S1 to S2, with S1 and S2 being existing wavelengths in the current dataset, and leave the target at its default, i.e. use the same target as the source.


Format: "deTr@S1-S2-all", set the source wavelength from S1 to S2, with S1 and S2 being existing wavelengths in the current dataset, and apply the resulting de-trend to all of the wavelengths present in the current dataset.


Format: "deTr@S1-S2-T1-T2", set the source wavelengths from S1 to S2 and the target wavelengths from T1 to T2, with S1, S2, T1, and T2 being existing wavelengths in the current dataset.

Note that the single values have to be separated by a 'minus' ('-'). Please see examples and do_detrend for additional information.
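The 'deTr' formats listed above can be told apart by the number of values after the '@'. The following base-R sketch shows one way to read such a string. The helper parse_detr and the list it returns are hypothetical illustrations, not part of aquap2, and the way the target is represented here is an assumption rather than what do_detrend expects.

```r
# Hypothetical helper (an assumption, not part of aquap2): interpret a
# 'deTr' module string following the formats described above.
parse_detr <- function(x) {
  parts <- strsplit(x, "@", fixed = TRUE)[[1]]
  stopifnot(parts[1] == "deTr")
  if (length(parts) == 1) {
    return(list(src = NULL, trg = "src"))  # bare "deTr": documented defaults
  }
  v <- strsplit(parts[2], "-", fixed = TRUE)[[1]]
  trg <- if (length(v) == 2) {
    "src"                                  # "deTr@S1-S2": target same as source
  } else if (length(v) == 3 && v[3] == "all") {
    "all"                                  # "deTr@S1-S2-all": all wavelengths
  } else if (length(v) == 4) {
    as.numeric(v[3:4])                     # "deTr@S1-S2-T1-T2": explicit target
  } else {
    stop("unrecognised 'deTr' format: ", x)
  }
  list(src = as.numeric(v[1:2]), trg = trg)
}

parse_detr("deTr")
parse_detr("deTr@1300-1600")
parse_detr("deTr@1300-1600-all")
parse_detr("deTr@1300-1600-1400-1500")
```

Splitting on the minus sign works here because wavelengths are positive numbers; the sketch does not check whether the given wavelengths actually exist in a dataset.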


gsd: Transform the dataset using gap-segment derivatives by internally calling the function do_gapDer (which in turn relies on gapDer). Provide only the character string "gsd" to use the default values for m, w, s and deltaW, which are all 1. To supply your own values, use a string in the format "gsd@m-w-s-d", with m, w, s and d being integers; 'w' has to be odd. The integers have to be separated by a minus sign ('-'). Please note that the gap-derivative function will truncate your data at the first and last wavelengths, depending on the provided values.


Via the ... argument in the function getap, the values of both dpt.pre and dpt.post can be overridden. Via the separator '@', additional values can be appended to some of the single modules. Possible values for dpt-modules are 'sgol', 'snv', 'msc', 'emsc', 'osc', 'deTr', 'gsd'. Single modules can be combined and repeated in any order, i.e. there is no upper limit on the number of modules that can be applied in the 'dpt.pre' and 'dpt.post' process; see examples.


Please see the description and split_dataset to understand when in the data-processing procedure the respective data treatment modules are applied!

See Also

split_dataset, anproc_file, getcd for extracting a dataset from a 'cube' object, do_avg for averaging datasets into a single spectrum, getcm for extracting a specific model from a 'cube' object.

Other dpt modules documentation: do_detrend(), do_emsc(), do_gapDer(), do_msc(), do_sgolay(), do_snv()

Other Data pre-treatment functions: [,aquap_data-method, do_addNoise(), do_avg(), do_blowup(), do_detrend(), do_emsc(), do_gapDer(), do_msc(), do_resampleNIR(), do_sgolay(), selectWls(), ssc()


Examples

## Not run: 
fd <- gfd() # load a dataset
cube <- gdmm(fd) # you should split in at least 2 or 3 groups
cube # look at the structure - each row of the cube is treated separately using 
# the modules specified in 'dpt.pre' and 'dpt.post' -- see description
## the argument 'dpt.pre' resp. 'dpt.post' in the analysis procedure could look 
## like this:
dpt.pre <- c("sgol@2-51-0")
dpt.post <- c("sgol@2-51-0", "snv")
dpt.pre <- c("msc", "snv", "msc") # (of course not useful, but it shows that 
## modules can be combined and repeated in any arbitrary order.)
dpt.post <- "msc@myDS" # with 'myDS' being the name of a standard dataset 
dpt.post <- c("sgol@2-51-0", "emsc@myDF") # with 'myDF' being the name of a 
## data frame containing one or two loading vectors or one regression vector
dpt.post <- "gsd@1-11-13-1"
dpt.post <- "deTr" # use whole wavelength range as source and target
dpt.post <- "deTr@1300-1600" # same target as source
dpt.pre <- c("sgol", "deTr@1300-1600-all") # apply de-trend calculated from 
# 1300nm to 1600nm to all wavelengths
cube <- gdmm(fd, getap(dpt.pre=c("sgol", "snv"))) # modify via gdmm function

## End(Not run)

bpollner/aquap2 documentation built on March 29, 2024, 7:33 a.m.