aicplsr: AIC and Cp for PLSR1 Models
In mlesnoff/rnirs: Dimension reduction, Regression and Discrimination for Chemometrics

aicplsr

R Documentation

AIC and Cp for PLSR1 Models

Description

Calculation of the AIC and Mallows's Cp criteria for univariate PLSR models. This function may receive modifications in the future (work in progress).

For a model with a components (latent variables, LV), function aicplsr calculates AIC and Cp as:

AIC(a) = n * log(SSR(a)) + 2 * (df(a) + 1)

Cp(a) = SSR(a) / n + 2 * df(a) * s2 / n

where SSR(a) is the sum of squared residuals of the model being evaluated, df(a) is the estimated PLSR model complexity (i.e. the number of degrees of freedom of the model), s2 is an estimate of the irreducible error variance (computed from a low-bias model) and n is the number of training observations.
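As a minimal sketch of the two formulas above (without any small-sample correction), the following base R code evaluates them on simulated data. It uses an ordinary least-squares fit as a stand-in for a PLSR model, so that df is simply the number of estimated coefficients. The names `aic_raw` and `cp_raw`, the simulated data and the way s2 is estimated are illustrative assumptions; they do not describe how aicplsr computes these quantities internally.

```r
set.seed(1)
n <- 50
x <- matrix(rnorm(n * 3), nrow = n)
y <- drop(x %*% c(1, -0.5, 0.25)) + rnorm(n, sd = 0.3)

fm <- lm(y ~ x)                 # stand-in model (assumption: OLS, not PLSR)
ssr <- sum(residuals(fm)^2)     # SSR: sum of squared residuals
df <- length(coef(fm))          # model complexity: intercept + 3 slopes
s2 <- ssr / (n - df)            # illustrative estimate of the error variance

aic_raw <- n * log(ssr) + 2 * (df + 1)
cp_raw <- ssr / n + 2 * df * s2 / n
c(aic = aic_raw, cp = cp_raw)
```

In both criteria the first term measures the fit and the second term penalizes complexity, so a larger df must be paid for by a lower SSR.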

By default (argument correct = TRUE), the small-sample-size correction (so-called AICc) is applied to AIC and Cp to reduce the bias.
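This page does not spell out the correction. For reference, the textbook AICc (Hurvich and Tsai, 1989) adds 2k(k + 1)/(n - k - 1) to the penalty 2k, where k is the number of estimated parameters, which is the same as multiplying the penalty by n/(n - k - 1). The sketch below checks that identity numerically with k = df + 1, matching the AIC penalty given above. Whether aicplsr uses exactly this factor, and how it carries it over to Cp, is an assumption that is not confirmed here.

```r
n <- 50
df <- c(1, 4, 7, 10)            # illustrative model complexities
k <- df + 1                     # parameter count used in the AIC penalty

pen_add <- 2 * k + 2 * k * (k + 1) / (n - k - 1)   # additive form
pen_mult <- 2 * k * n / (n - k - 1)                # multiplicative form
all.equal(pen_add, pen_mult)
```

The factor n/(n - k - 1) grows quickly as k approaches n, so the corrected penalty matters most when the number of observations is small relative to the model complexity.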

The function returns two Cp estimates, each corresponding to a different estimate of s2 (publication to come).

## Model complexity df

Depending on argument methdf, model complexity df can be estimated

- from functions dfplsr_div or dfplsr_cov,

- or from the following ad hoc and crude rule of thumb: 1 + theta * a, where theta is a given scalar greater than or equal to 1 (theta = 1 is the naive df estimate, which gives significant underestimations of Cp). Parameter theta is data dependent, and some knowledge of its average value should be available before using this rule of thumb.
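The crude rule of thumb can be illustrated in a few lines of base R. This sketch only evaluates the expression 1 + theta * a for a few values of a; the variable names are hypothetical and nothing here calls the package.

```r
a <- 0:5                        # number of components
theta <- 3                      # default value shown in the Usage section
df_crude <- 1 + theta * a       # crude rule of thumb
df_naive <- 1 + 1 * a           # naive estimate (theta = 1)
data.frame(a, df_naive, df_crude)
```

With theta = 3, each added component is charged three degrees of freedom instead of one, which makes the penalty term of the criteria grow faster with a.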

Usage


aicplsr(
    X, Y, ncomp, algo = NULL,
    methdf = c("div", "cov", "crude"),
    theta = 3,
    correct = TRUE,
    B = 50,
    print = TRUE, 
    ...
    )

Arguments

`X`	A `n x p` matrix or data frame of training observations.
`Y`	A vector of length `n` of training responses.
`ncomp`	The maximal number of PLS scores (= components = latent variables) to consider.
`algo`	A PLS algorithm. Defaults to `NULL` (`pls_kernel` is used).
`methdf`	The method used for estimating `df`. Possible values are `"div"` (default), `"cov"` or `"crude"`.
`theta`	A scalar used in the crude estimation of `df`.
`correct`	Logical. If `TRUE` (default), the AICc correction is applied to the criteria.
`B`	For `methdf = "div"`: the number of observations in the data receiving perturbation (maximum is `n`; see `dfplsr_div`). For `methdf = "cov"`: the number of bootstrap replications (see `dfplsr_cov`).
`print`	Logical. If `TRUE`, fitting information is printed.
`...`	Optional arguments passed to functions `dfplsr_div` or `dfplsr_cov`.

Value

A list of items. In particular, a data.frame with the estimated criteria.
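As a hedged illustration of how such a table of criteria might be read, the sketch below picks the number of components that minimizes each criterion. The data frame is a toy stand-in with made-up values; the column `ncomp` is a hypothetical name, while `aic` and `cp1` follow the column names used in the Examples section.

```r
# Toy criteria table; values are made up for illustration only.
crit <- data.frame(
    ncomp = 0:4,                          # hypothetical column name
    aic = c(210, 180, 165, 168, 172),
    cp1 = c(4.1, 2.3, 1.7, 1.6, 1.8)
    )

# Number of components minimizing each criterion
opt_aic <- crit$ncomp[which.min(crit$aic)]
opt_cp1 <- crit$ncomp[which.min(crit$cp1)]
c(aic = opt_aic, cp1 = opt_cp1)
```

The criteria need not agree on the same number of components, so it is worth inspecting the whole curve rather than the minimum alone.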

References

Burnham, K.P., Anderson, D.R., 2002. Model selection and multimodel inference: a practical information-theoretic approach, 2nd ed. Springer, New York, NY, USA.

Burnham, K.P., Anderson, D.R., 2004. Multimodel Inference: Understanding AIC and BIC in Model Selection. Sociological Methods & Research 33, 261–304. https://doi.org/10.1177/0049124104268644

Efron, B., 2004. The Estimation of Prediction Error. Journal of the American Statistical Association 99, 619–632. https://doi.org/10.1198/016214504000000692

Eubank, R.L., 1999. Nonparametric Regression and Spline Smoothing, 2nd ed, Statistics: Textbooks and Monographs. Marcel Dekker, Inc., New York, USA.

Hastie, T., Tibshirani, R.J., 1990. Generalized Additive Models, Monographs on Statistics and Applied Probability. Chapman and Hall/CRC, New York, USA.

Hastie, T., Tibshirani, R., Friedman, J., 2009. The elements of statistical learning: data mining, inference, and prediction, 2nd ed. Springer, New York.

Hastie, T., Tibshirani, R., Wainwright, M., 2015. Statistical Learning with Sparsity: The Lasso and Generalizations. CRC Press.

Hurvich, C.M., Tsai, C.-L., 1989. Regression and Time Series Model Selection in Small Samples. Biometrika 76, 297. https://doi.org/10.2307/2336663

Mallows, C.L., 1973. Some Comments on Cp. Technometrics 15, 661–675. https://doi.org/10.1080/00401706.1973.10489103

Ye, J., 1998. On Measuring and Correcting the Effects of Data Mining and Model Selection. Journal of the American Statistical Association 93, 120–131. https://doi.org/10.1080/01621459.1998.10474094

Zuccaro, C., 1992. Mallows' Cp Statistic and Model Selection in Multiple Linear Regression. International Journal of Market Research 34, 1–10. https://doi.org/10.1177/147078539203400204

Examples


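library(rnirs)

# Load the example dataset shipped with the package
data(datcass)

# Training predictors and response
Xr <- datcass$Xr
yr <- datcass$yr

ncomp <- 20    # maximal number of components to consider
B <- 30        # number of bootstrap replications (methdf = "cov")
res <- aicplsr(
    Xr, yr, ncomp = ncomp, 
    methdf = "cov", B = B
    )
names(res)
res$crit
res$opt

# Plot df and the criteria, dropping the first row of the table
z <- res$crit
par(mfrow = c(1, 4))
plot(z$df[-1])
plot(z$aic[-1], type = "b", main = "AIC")
plot(z$cp1[-1], type = "b", main = "Cp1")
plot(z$cp2[-1], type = "b", main = "Cp2")
par(mfrow = c(1, 1))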

mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.
