Forward Backward Early Dropping selection regression | R Documentation |
Forward Backward Early Dropping selection regression.
fbed.reg(y, x, alpha = 0.05, type = "logistic", K = 0, backward = FALSE,
parallel = FALSE, tol = 1e-07, maxiters = 100)
y |
The response variable, a numeric vector. |
x |
A matrix with continuous variables. |
alpha |
The significance threshold value for assessing p-values. Default value is 0.05. |
type |
The available types are: "logistic" (binary logistic regression), "qlogistic" (quasi logistic regression, for binary value or proportions including 0 and 1), "poisson" (Poisson regression), "qpoisson" (quasi Poisson regression), "weibull" (Weibull regression) and "spml" (SPML regression). |
K |
How many times should the process be repeated? The default value is 0. |
backward |
After the Forward Early Dropping phase, the algorithm can proceed with the usual Backward Selection phase. The default value is FALSE. This step is advisable in principle, since some of the selected variables may be false positives, i.e. wrongly selected. However, it is rather experimental at the moment and there could be mistakes in the indices of the selected variables. Do not use it for now. |
parallel |
If you want the algorithm to run in parallel, set this to TRUE. |
tol |
The tolerance value to terminate the Newton-Raphson algorithm. |
maxiters |
The maximum number of iterations Newton-Raphson will perform. |
The algorithm is a variation of the usual forward selection. At every step, the most significant variable enters the set of selected variables. In addition, only the significant variables stay and are examined further; the non-significant ones are dropped. This continues until no variable can enter the set. The user has the option to repeat this procedure one or more times (the argument K). In the end, a backward selection can be performed to remove falsely selected variables. Note that the number of repetitions actually used may be smaller than the K specified: for example, you may have set K=10, but FBED may stop after 4.
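The forward phase described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by fbed.reg: the helper name fbed_sketch is hypothetical, it handles Poisson regression only, it uses glm.fit with a likelihood ratio test, and it omits the backward phase.

```r
# Hypothetical helper (not part of any package): forward selection with
# early dropping for Poisson regression, using likelihood ratio tests.
fbed_sketch <- function(y, x, alpha = 0.05, K = 0) {
  n <- nrow(x)
  p <- ncol(x)

  # One forward run: repeatedly add the most significant candidate and
  # drop every candidate that is not significant at this step.
  forward_run <- function(selected, candidates) {
    while (length(candidates) > 0) {
      X0 <- cbind(rep(1, n), x[, selected, drop = FALSE])
      dev0 <- glm.fit(X0, y, family = poisson())$deviance
      pvals <- vapply(candidates, function(j) {
        dev1 <- glm.fit(cbind(X0, x[, j]), y, family = poisson())$deviance
        stat <- max(dev0 - dev1, 0)
        pchisq(stat, df = 1, lower.tail = FALSE)
      }, numeric(1))
      keep <- pvals < alpha
      if (!any(keep)) break
      best <- candidates[which.min(pvals)]
      selected <- c(selected, best)
      # Early dropping: only significant variables remain candidates.
      candidates <- setdiff(candidates[keep], best)
    }
    selected
  }

  selected <- forward_run(integer(0), seq_len(p))
  # Up to K extra runs, each starting again from all unselected variables.
  for (k in seq_len(K)) {
    before <- length(selected)
    selected <- forward_run(selected, setdiff(seq_len(p), selected))
    if (length(selected) == before) break  # nothing new entered, stop early
  }
  selected
}

# Simulated data in which only the first two columns matter.
set.seed(1)
x <- matrix(rnorm(200 * 10), ncol = 10)
y <- rpois(200, exp(0.5 + 0.8 * x[, 1] - 0.6 * x[, 2]))
sel <- fbed_sketch(y, x, alpha = 0.05, K = 1)
sel
```

The early exit from the loop over k mirrors the remark that fewer repetitions than the requested K may actually be used.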
The "qlogistic" and "qpoisson" types use the Wald test and no backward phase is performed, while all the other regression types use the log-likelihood ratio test and the backward phase is available.
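The difference between the two tests can be illustrated with base R's glm. This is a sketch on simulated data; the variable names are made up for the example and the exact computations inside fbed.reg may differ. Quasi-likelihood models have no proper likelihood, which is why a Wald-type test is the natural choice for them.

```r
set.seed(2)
n <- 150
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rpois(n, exp(0.3 + 0.5 * x1))

# Log-likelihood ratio test for adding x2 to a Poisson model with x1:
# the drop in deviance is compared to a chi-square with 1 degree of freedom.
fit0 <- glm(y ~ x1, family = poisson)
fit1 <- glm(y ~ x1 + x2, family = poisson)
lrt_stat <- fit0$deviance - fit1$deviance
lrt_p <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)

# Wald-type test for x2 in a quasi-Poisson model:
# estimate divided by its standard error, as reported by summary().
fitq <- glm(y ~ x1 + x2, family = quasipoisson)
wald <- summary(fitq)$coefficients["x2", ]

lrt_p
wald
```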
If K is a single number, a list including:
res |
A matrix with the selected variables and their test statistic. |
info |
A matrix with the number of variables and the number of tests performed (or models fitted) at each round (value of K). This refers to the forward phase only. |
runtime |
The runtime required. |
Michail Tsagris and Stefanos Fafalios.
R implementation and documentation: Michail Tsagris mtsagris@uoc.gr and Stefanos Fafalios stefanosfafalios@gmail.com.
Borboudakis G. and Tsamardinos I. (2019). Forward-backward selection with early dropping. Journal of Machine Learning Research, 20(8): 1-39.
logiquant.regs, bic.regs, gee.reg
#simulate a dataset with continuous data
x <- matrix( runif(100 * 50, 1, 100), ncol = 50 )
y <- rnbinom(100, 10, 0.5)
a <- fbed.reg(y, x, type = "poisson")
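The components of the returned list can then be inspected one by one. The sketch below assumes that fbed.reg is provided by the Rfast2 package and that the package is installed; the call is skipped otherwise. The component names are the ones listed in the Value section.

```r
# Simulate a dataset with continuous predictors and a count response
set.seed(3)
x <- matrix(runif(100 * 50, 1, 100), ncol = 50)
y <- rnbinom(100, 10, 0.5)

# Assumption: fbed.reg lives in the Rfast2 package
if (requireNamespace("Rfast2", quietly = TRUE)) {
  a <- Rfast2::fbed.reg(y, x, alpha = 0.05, type = "poisson", K = 0)
  print(a$res)      # selected variables and their test statistic
  print(a$info)     # number of variables and tests per round
  print(a$runtime)  # runtime required
}
```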