qselection: Selecting variables for several subset sizes
In sestelo/fwdselect: Selecting Variables in Regression Models


Selects the best variables for more than one subset size. Returns a table with the covariates chosen for each model and the corresponding information criterion values. An asterisk marks the subset size that minimizes the information criterion.

qselection(x, y, qvector, criterion = "deviance", method = "lm", family = "gaussian", nfolds = 5, cluster = TRUE, ncores = NULL)

`x`	A data frame containing all the covariates.
`y`	A vector with the response values.
`qvector`	A vector with more than one variable-subset size to be selected.
`criterion`	The information criterion to be used. Default is the deviance. Other available criteria are the coefficient of determination (`"R2"`), the residual variance (`"variance"`), the Akaike information criterion (`"aic"`), AIC with a correction for finite sample sizes (`"aicc"`) and the Bayesian information criterion (`"bic"`). The deviance, coefficient of determination and variance are calculated by cross-validation.
`method`	A character string specifying which regression method is used, i.e., linear models (`"lm"`), generalized linear models (`"glm"`) or generalized additive models (`"gam"`).
`family`	A description of the error distribution and link function to be used in the model: `"gaussian"`, `"binomial"` or `"poisson"`.
`nfolds`	Number of folds for the cross-validation procedure, used when the criterion is `deviance`, `R2` or `variance`.
`cluster`	A logical value. If `TRUE` (default), the procedure is parallelized. Note that in some cases with few repetitions (e.g., a low number of initial variables), serial computation is faster. R needs time to distribute tasks across the processors and to bind the results together afterwards. If that overhead exceeds the time needed for single-threaded computation, parallelizing is not worthwhile.
`ncores`	An integer value specifying the number of cores to be used in the parallelized procedure. If `NULL` (default), the number of cores used is one less than the number of cores on the machine.
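To make the role of `nfolds` concrete, the following base R sketch computes a cross-validated residual variance for a linear model. It is an illustration only: `cv_variance()` is a hypothetical helper, the built-in `mtcars` data stands in for real input, and the mean squared prediction error is assumed as the definition of the variance criterion, which may differ from what the package computes internally.

```r
# Illustrative sketch only: k-fold cross-validated residual variance for a
# linear model. cv_variance() is a hypothetical helper, not a FWDselect function.
cv_variance <- function(x, y, nfolds = 5, seed = 1) {
  set.seed(seed)
  n <- length(y)
  # Randomly assign each observation to one of nfolds folds
  fold <- sample(rep(seq_len(nfolds), length.out = n))
  sq_err <- numeric(n)
  for (k in seq_len(nfolds)) {
    test <- fold == k
    # Fit on all folds except fold k
    train_df <- data.frame(y = y[!test], x[!test, , drop = FALSE])
    fit <- lm(y ~ ., data = train_df)
    # Predict the held-out fold and store its squared errors
    pred <- predict(fit, newdata = x[test, , drop = FALSE])
    sq_err[test] <- (y[test] - pred)^2
  }
  # Mean squared out-of-fold prediction error (assumed definition)
  mean(sq_err)
}

x <- mtcars[, c("wt", "hp")]
y <- mtcars$mpg
cv_var <- cv_variance(x, y, nfolds = 5)
```

Because every observation is predicted by a model that never saw it, this kind of estimate does not automatically improve as covariates are added, which is what makes it usable for comparing subset sizes.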

The returned object contains:
`q`	A vector of subset sizes.
`criterion`	A vector of information criterion values, one for each subset size.
`selection`	Selected variables for each size.
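As a rough sketch of how these three components might be used together, the snippet below picks the subset size with the smallest criterion value. The list `res` is a hypothetical stand-in with invented values, and the comma-separated form of `selection` is an assumption; a real `qselection` result may store its components differently.

```r
# Hypothetical stand-in with the three documented components;
# the values are invented for illustration and are not real output.
res <- list(
  q = 1:3,
  criterion = c(0.92, 0.55, 0.61),
  selection = c("a", "a, b", "a, b, c")
)

# Position of the smallest criterion value
best <- which.min(res$criterion)
best_q <- res$q[best]            # subset size at the minimum
best_vars <- res$selection[best] # covariates chosen for that size
```

This mirrors the asterisk in the printed table, which marks the size that minimizes the criterion.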

Marta Sestelo, Nora M. Villanueva and Javier Roca-Pardinas.

`selection`, `plot.qselection`.

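library(FWDselect)
data(diabetes)
# Covariates are in columns 2 to 11; the response is in column 1
x <- diabetes[, 2:11]
y <- diabetes[, 1]
# Best subsets of sizes 1 to 9, compared by cross-validated residual variance
obj2 <- qselection(x, y, qvector = 1:9, method = "lm", criterion = "variance", cluster = FALSE)
obj2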
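A possible variant of the example above runs the same search in parallel and then plots the result. It uses only the arguments documented on this page; `ncores = 2` is an arbitrary choice, and the `plot()` call assumes that the method listed under See Also, `plot.qselection`, is dispatched for the returned object. The guard simply skips the code when the package is not installed.

```r
# Sketch only: parallel variant of the documented example.
if (requireNamespace("FWDselect", quietly = TRUE)) {
  library(FWDselect)
  data(diabetes)
  x <- diabetes[, 2:11]
  y <- diabetes[, 1]
  # Same search as before, distributed over two cores (arbitrary choice)
  obj3 <- qselection(x, y, qvector = 1:9, method = "lm",
                     criterion = "variance", cluster = TRUE, ncores = 2)
  # Plot the criterion across subset sizes (assumes plot.qselection applies)
  plot(obj3)
}
```

On a small data set such as this one, the parallel run may well be slower than the serial one, for the overhead reasons given in the description of `cluster`.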