fbed.reg: Forward Backward Early Dropping selection regression

Forward Backward Early Dropping selection regression

Description

Forward Backward Early Dropping selection regression.

Usage

fbed.reg(y, x, alpha = 0.05, type = "logistic", K = 0, backward = FALSE, 
         parallel = FALSE, tol = 1e-07, maxiters = 100) 

Arguments

y

The response variable, a numeric vector.

x

A matrix with continuous variables.

alpha

The significance threshold value for assessing p-values. Default value is 0.05.

type

The available types are: "logistic" (binary logistic regression), "qlogistic" (quasi logistic regression, for binary values or proportions including 0 and 1), "poisson" (Poisson regression), "qpoisson" (quasi Poisson regression), "weibull" (Weibull regression) and "spml" (SPML regression).

K

How many times the forward phase should be repeated after the initial run. The default value is 0.

backward

If TRUE, the Forward Early Dropping phase is followed by the usual Backward Selection phase. The default value is FALSE. The backward step is advisable in principle, because some of the selected variables may be false positives that were wrongly selected. However, it is rather experimental at the moment and there could be mistakes in the indices of the selected variables, so do not use it for now.

parallel

Set this to TRUE to run the algorithm in parallel. The default value is FALSE.

tol

The tolerance value to terminate the Newton-Raphson algorithm.

maxiters

The maximum number of iterations Newton-Raphson will perform.

Details

The algorithm is a variation of the usual forward selection. At every step, the most significant variable enters the set of selected variables. In addition, only the significant variables stay and are examined further; the non-significant ones are dropped. This continues until no variable can enter the set. The user has the option to repeat this step one or more times (the argument K). In the end, a backward selection can be performed (the argument backward) to remove falsely selected variables. Note that the algorithm may stop before reaching the requested number of repetitions: you may have specified K = 10, for example, but the maximum value FBED actually used may be 4.
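The forward phase described above can be sketched in base R. This is only an illustration under stated assumptions, not the Rfast2 implementation: the helper name fed_sketch is hypothetical, the models are fitted with glm() and compared with likelihood-ratio tests on one degree of freedom, and no backward phase is included.

```r
# Hypothetical helper (not part of Rfast2): forward phase with early dropping,
# fitted with base R glm() and likelihood-ratio tests.
fed_sketch <- function(y, x, alpha = 0.05, K = 0, family = poisson()) {
  selected <- integer(0)
  for (run in 0:K) {
    # each run restarts from every variable that is not yet selected
    remaining <- setdiff(seq_len(ncol(x)), selected)
    n_before <- length(selected)
    while (length(remaining) > 0) {
      # deviance of the current model (intercept plus selected variables)
      base_dev <- if (length(selected) == 0) {
        deviance(glm(y ~ 1, family = family))
      } else {
        deviance(glm(y ~ x[, selected, drop = FALSE], family = family))
      }
      # likelihood-ratio p-value for adding each remaining variable
      pvals <- vapply(remaining, function(j) {
        fit <- glm(y ~ x[, c(selected, j), drop = FALSE], family = family)
        pchisq(base_dev - deviance(fit), df = 1, lower.tail = FALSE)
      }, numeric(1))
      keep <- pvals < alpha
      if (!any(keep)) break
      # the most significant variable enters the selected set
      best <- remaining[which.min(pvals)]
      selected <- c(selected, best)
      # early dropping: non-significant variables leave the candidate pool
      remaining <- setdiff(remaining[keep], best)
    }
    # a run that adds nothing ends the search, so fewer than K repeats may be used
    if (length(selected) == n_before) break
  }
  selected
}

# simulated data in which only the third column affects the response
set.seed(1)
x <- matrix(rnorm(200 * 10), ncol = 10)
y <- rpois(200, exp(0.5 + 0.8 * x[, 3]))
sel <- fed_sketch(y, x, alpha = 0.05, K = 1)
```

The early exit at the end of each run mirrors the note about K: once a repetition selects no new variable, further repetitions cannot change the result.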

The "qlogistic" and "qpoisson" types use the Wald test and no backward phase is performed, while for all the other regression types the log-likelihood ratio test is used and the backward phase is available.
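The difference between the two tests can be illustrated with base R. This is a sketch that assumes a single candidate variable and a chi-square reference distribution with one degree of freedom for both statistics; the exact computations inside fbed.reg may differ. The quasi families have no proper likelihood, which is why a Wald statistic is used for them.

```r
set.seed(2)
x1 <- rnorm(150)
y <- rpois(150, exp(1 + 0.3 * x1))

# log-likelihood ratio test: compare the models with and without x1
fit0 <- glm(y ~ 1, family = poisson())
fit1 <- glm(y ~ x1, family = poisson())
lr_stat <- deviance(fit0) - deviance(fit1)
lr_p <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Wald test: squared ratio of the estimate to its standard error
# (the quasi-Poisson standard error includes the estimated dispersion)
qfit <- glm(y ~ x1, family = quasipoisson())
coefs <- summary(qfit)$coefficients
wald_stat <- (coefs["x1", "Estimate"] / coefs["x1", "Std. Error"])^2
wald_p <- pchisq(wald_stat, df = 1, lower.tail = FALSE)
```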

Value

If K is a single number, a list including:

res

A matrix with the selected variables and their test statistic.

info

A matrix with the number of variables and the number of tests performed (or models fitted) at each round (value of K). This refers to the forward phase only.

runtime

The runtime required.

Author(s)

Michail Tsagris and Stefanos Fafalios.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr and Stefanos Fafalios stefanosfafalios@gmail.com.

References

Borboudakis G. and Tsamardinos I. (2019). Forward-backward selection with early dropping. Journal of Machine Learning Research, 20(8): 1-39.

See Also

logiquant.regs, bic.regs, gee.reg

Examples

# simulate a dataset with 50 continuous predictors
x <- matrix( runif(100 * 50, 1, 100), ncol = 50 )
# count response generated independently of x
y <- rnbinom(100, 10, 0.5)
# run FBED with Poisson regression
a <- fbed.reg(y, x, type = "poisson")

Rfast2 documentation built on Aug. 8, 2023, 1:11 a.m.