forward: Iterative bias reduction smoothing
In ibr: Iterative Bias Reduction

forward

R Documentation

Iterative bias reduction smoothing

Description

Performs forward variable selection for iterative bias reduction using kernel, thin plate spline or low rank spline smoothers. Missing values are not allowed.
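As background for the selection procedure, the following is a minimal base R sketch of iterative bias reduction itself. It is not the ibr implementation. The simulated data, the bandwidth `h` and the helper `ibr_fit` are assumptions made for illustration: a deliberately oversmoothing kernel smoother is repeatedly corrected by smoothing its own residuals.

```r
# Minimal sketch of iterative bias reduction (not the ibr implementation).
set.seed(1)
n <- 100
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Row-normalised Gaussian kernel smoothing matrix.
# The bandwidth 0.3 is an arbitrary, deliberately large (oversmoothing) value.
h <- 0.3
W <- dnorm(outer(x, x, "-") / h)
S <- W / rowSums(W)

# Hypothetical helper: K = 1 is the base smoother; each further step
# estimates the bias by smoothing the residuals and adds it to the fit.
ibr_fit <- function(S, y, K) {
  fit <- S %*% y
  for (k in seq_len(K - 1)) {
    fit <- fit + S %*% (y - fit)
  }
  drop(fit)
}

fit1  <- ibr_fit(S, y, 1)    # base smoother only
fit20 <- ibr_fit(S, y, 20)   # after 19 bias correction steps

# Residual sums of squares before and after the corrections.
c(rss_1 = sum((y - fit1)^2), rss_20 = sum((y - fit20)^2))
```

The number of correction steps plays the role of the smoothing parameter, which is why `forward` needs a `criterion` and a search grid (`Kmin`, `Kmax`) to choose it.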

Usage

forward(formula,data,subset,criterion="gcv",df=1.5,Kmin=1,Kmax=1e+06,
   smoother="k",kernel="g",rank=NULL,control.par=list(),cv.options=list(),
   varcrit=criterion)

Arguments

`formula`	An object of class `"formula"` (or one that can be coerced to that class): a symbolic description of the model to be fitted.
`data`	An optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `forward` is called.
`subset`	An optional vector specifying a subset of observations to be used in the fitting process.
`criterion`	Character string. If the number of iterations (`iter`) is missing or `NULL` the number of iterations is chosen using `criterion`. The criteria available are GCV (default, `"gcv"`), AIC (`"aic"`), corrected AIC (`"aicc"`), BIC (`"bic"`), gMDL (`"gmdl"`), map (`"map"`) or rmse (`"rmse"`). The last two are designed for cross-validation.
`df`	A numeric vector of either length 1 or length equal to the number of explanatory variables. If `smoother="k"`, it gives the desired degrees of freedom (trace) of the smoothing matrix for each variable or for the initial smoother (see `dftotal` in `control.par`); `df` is repeated when its length is 1. If `smoother="tps"`, the minimum df of thin plate splines is multiplied by `df`. This argument is ignored if `bandwidth` is supplied (non-null).
`Kmin`	The minimum number of bias correction iterations of the search grid considered by the model selection procedure for selecting the optimal number of iterations.
`Kmax`	The maximum number of bias correction iterations of the search grid considered by the model selection procedure for selecting the optimal number of iterations.
`smoother`	Character string selecting the smoother: thin plate splines (`"tps"`) or kernel (`"k"`).
`kernel`	Character string selecting the kernel: Gaussian (`"g"`), Epanechnikov (`"e"`), uniform (`"u"`) or quartic (`"q"`). The default (Gaussian kernel) is strongly advised.
`rank`	Numeric value that controls the rank of low rank splines (denoted as `k` in the mgcv package; see also `choose.k` for further details, or `mgcv::gam` for another smoothing approach with a reduced rank smoother).
`control.par`	A named list that controls optional parameters. The components are `bandwidth` (default `NULL`), `iter` (default `NULL`), `really.big` (default `FALSE`), `dftobwitmax` (default 1000), `exhaustive` (default `FALSE`), `m` (default `NULL`), `dftotal` (default `FALSE`), `accuracy` (default 0.01), `ddlmaxi` (default 2n/3) and `fraction` (default `c(100, 200, 500, 1000, 5000, 10^4, 5e+04, 1e+05, 5e+05, 1e+06)`). `bandwidth`: a vector of either length 1 or length equal to the number of explanatory variables. If `smoother="k"`, it gives the bandwidth used for each variable; it is repeated when its length is 1. If `smoother="tps"`, it gives the amount of penalty (coefficient lambda). The default (missing) means, for `smoother="k"`, that the bandwidth for each variable is chosen so that each univariate kernel smoother (one per explanatory variable) has `df` degrees of freedom and, for `smoother="tps"`, that lambda is chosen so that the df of the smoothing matrix is `df` times the minimum df. `iter`: the number of iterations. If `NULL` or missing, an optimal number of iterations is chosen from the search grid (integers from `Kmin` to `Kmax`) to minimize the `criterion`. `really.big`: a boolean; if `TRUE`, it overrides the limitation to 500 observations. Expect long computation times if `TRUE`. `dftobwitmax`: when the bandwidth is chosen by specifying the degrees of freedom (see `df`), a search is done by `stats::uniroot`. This argument specifies the maximum number of iterations passed to `stats::uniroot`. `exhaustive`: a boolean; if `TRUE`, an exhaustive search for the optimal number of iterations on the grid `Kmin:Kmax` is performed. If `FALSE`, the minimum of the criterion is searched using `stats::optimize` between `Kmin` and `Kmax`. `m`: the order of thin plate splines. This integer m must satisfy 2m/d>1, where d is the number of explanatory variables. If missing or `NULL` (the default), m is chosen as the first integer such that 2m/d>1. `dftotal`: a boolean which indicates whether the argument `df` is the target df for each univariate kernel, calculated for each explanatory variable (`FALSE`, the default), or for the overall (product) kernel, that is, the base smoother (`TRUE`). `accuracy`: tolerance when searching for bandwidths which lead to a chosen overall initial df. `dfmaxi`: the maximum degrees of freedom allowed for the iterative bias reduction smoother. `fraction`: the subdivision of the interval [`Kmin`, `Kmax`] used if a non-exhaustive search is performed (see also `iterchoiceA` or `iterchoiceS1`).
`cv.options`	A named list which controls how cross-validation is done, with components `bwchange`, `ntest`, `ntrain`, `Kfold`, `type`, `seed`, `method` and `npermut`. `bwchange` is a boolean (default `FALSE`) which indicates whether the bandwidths have to be recomputed each time. `ntest` is the number of observations in the test set and `ntrain` is the number of observations in the training set. Only one of these is needed; the other can be `NULL` or missing. `Kfold` is a boolean or an integer. If `Kfold` is `TRUE`, the number of folds is deduced from `ntest` (or `ntrain`). `type` is a character string, one of `random`, `timeseries`, `consecutive` or `interleaved`, and gives the type of segments. `seed` controls the seed of the random generator. `method` is either `"inmemory"` or `"outmemory"`; `"inmemory"` moves some calculations outside the loop, saving computation time at the cost of more memory. `npermut` is the number of random draws. If `cv.options` is `list()`, then component `ntest` is set to `floor(nrow(x)/10)`, `type` is random, `npermut` is 20, `method` is `"inmemory"`, and the other components are `NULL`.
`varcrit`	Character string. Criterion used for variable selection. The criteria available are GCV (`"gcv"`), AIC (`"aic"`), corrected AIC (`"aicc"`), BIC (`"bic"`) and gMDL (`"gmdl"`).
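Two of the rules described for `control.par` can be illustrated in base R. This is a sketch under stated assumptions, not the ibr internals: `kernel_df` and `tps_order` are hypothetical helpers, the data are simulated, and the search interval given to `stats::uniroot` is chosen by hand for data on [0, 1]. The first part finds the bandwidth for which a univariate Gaussian kernel smoother has a target trace `df`; the second computes the smallest integer m with 2m/d>1.

```r
# Sketch only: base R, not the ibr internals.
set.seed(2)
x <- runif(80)

# Hypothetical helper: degrees of freedom (trace of the smoothing matrix)
# of a univariate Gaussian kernel smoother with bandwidth h.
kernel_df <- function(h, x) {
  W <- dnorm(outer(x, x, "-") / h)
  sum(diag(W / rowSums(W)))
}

# Find the bandwidth whose smoother has the target df (1.5 is the default of `df`).
# maxiter mirrors the documented default of `dftobwitmax`; the interval is an assumption.
df_target <- 1.5
sol <- uniroot(function(h) kernel_df(h, x) - df_target,
               interval = c(1e-3, 10), maxiter = 1000, tol = 1e-8)
bandwidth <- sol$root
kernel_df(bandwidth, x)   # close to df_target

# Hypothetical helper: smallest integer m such that 2m/d > 1,
# where d is the number of explanatory variables.
tps_order <- function(d) floor(d / 2) + 1
sapply(1:4, tps_order)
```

The trace decreases as the bandwidth grows, so a root search on a bracketing interval is enough to map a requested `df` to a bandwidth.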

Value

Returns an object of class `forwardibr`, which is a matrix with p columns. In the first row, entry j contains the value of the chosen criterion for the univariate smoother using the jth explanatory variable. The variable which attains the minimum of the first row is included in the model. All entries in the column of this variable are `Inf` except in the first row. In the second row, entry j contains the value of the criterion for the bivariate smoother using the jth explanatory variable and the variable already included. The variable which attains the minimum of the second row is included in the model. All entries in the column of this variable are `Inf` except in the first two rows. This forward selection process continues until the chosen criterion increases.
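The layout of the returned matrix can be reproduced with a small base R sketch. It is not the ibr code: ordinary least squares scored by GCV stands in for the iterative bias reduction smoother, the data are simulated, `gcv_lm` is a hypothetical helper, and keeping only the rows of steps that improved the criterion is an assumption of this sketch.

```r
# Sketch of the forward search and of the matrix layout described above.
# Stand-in smoother: ordinary least squares scored by GCV (NOT the ibr smoother).
set.seed(3)
n <- 60
X <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))
y <- 2 * X$a - X$c + rnorm(n, sd = 0.5)

# Hypothetical helper: GCV of a linear fit on the columns indexed by `vars`.
gcv_lm <- function(vars) {
  fit <- lm(y ~ ., data = X[, vars, drop = FALSE])
  df <- length(coef(fit))               # trace of the hat matrix
  n * sum(residuals(fit)^2) / (n - df)^2
}

p <- ncol(X)
res <- matrix(Inf, nrow = p, ncol = p, dimnames = list(NULL, names(X)))
selected <- integer(0)
best <- Inf
for (step in seq_len(p)) {
  candidates <- setdiff(seq_len(p), selected)
  crit <- sapply(candidates, function(j) gcv_lm(c(selected, j)))
  if (min(crit) >= best) break          # criterion no longer decreases: stop
  res[step, candidates] <- crit         # already selected columns stay Inf
  best <- min(crit)
  selected <- c(selected, candidates[which.min(crit)])
}
res <- res[seq_along(selected), , drop = FALSE]

res
apply(res, 1, which.min)   # order in which the variables entered
```

Reading the result row by row with `which.min` recovers the order in which variables entered the model, which is what the call `apply(res.ibr,1,which.min)` in the Examples section does.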

Author(s)

Pierre-Andre Cornillon, Nicolas Hengartner and Eric Matzner-Lober.

References

Cornillon, P.-A.; Hengartner, N.; Jegou, N. and Matzner-Lober, E. (2012) Iterative bias reduction: a comparative study. Statistics and Computing, 23, 777-791.

Cornillon, P.-A.; Hengartner, N. and Matzner-Lober, E. (2013) Recursive bias estimation for multivariate regression smoothers. ESAIM: Probability and Statistics, 18, 483-502.

Cornillon, P.-A.; Hengartner, N. and Matzner-Lober, E. (2017) Iterative Bias Reduction Multivariate Smoothing in R: The ibr Package. Journal of Statistical Software, 77, 1–26.

Examples

## Not run: 
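data(ozone, package = "ibr")
## forward() takes a formula and a data frame (see Usage); as in the
## original call, the response is assumed to be the first column of ozone.
fml <- reformulate(names(ozone)[-1], response = names(ozone)[1])
res.ibr <- forward(fml, data = ozone, df = 1.2)
apply(res.ibr, 1, which.min)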

## End(Not run)

ibr documentation built on Sept. 12, 2025, 10:19 a.m.
