Forward stepwise selection procedure for penalized logistic regression
Description
This function fits a series of L2-penalized logistic regression models, selecting variables by forward stepwise selection.
Usage
Arguments
x 
matrix of features 
y 
binary response 
weights 
optional vector of weights for observations 
fix.subset 
vector of indices for the variables that are forced to be in the model 
level 
list of length 
lambda 
regularization parameter for the L2 norm of the
coefficients. The minimizing criterion in 
cp 
complexity parameter to be used when computing the
score. 
max.terms 
maximum number of terms to be added in the forward selection
procedure. Default is 
type 
If 
trace 
If 
Details
This function implements L2-penalized logistic regression together with a stepwise variable selection procedure, as described in "Penalized Logistic Regression for Detecting Gene Interactions" by Park and Hastie (2008).
If type="forward", max.terms terms are sequentially added to the model, and the model that minimizes score is selected as the optimal fit. If type="both", a backward deletion is done in addition, which provides a series of models with a different combination of the selected terms. The optimal model minimizing score is chosen from the second list.
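The forward pass described above can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: the helper name forward_by_score is hypothetical, an unpenalized glm() fit stands in for the L2-penalized fit, df is taken to be the number of fitted coefficients, and the backward deletion used with type="both" is not shown.

```r
# Hypothetical helper: forward selection scored by deviance + cp * df.
# Assumption: unpenalized glm() is used as a stand-in for the penalized fit,
# and df is the number of coefficients (including the intercept).
forward_by_score <- function(x, y, cp = 2, max.terms = 3) {
  x <- as.data.frame(x)
  selected <- character(0)
  remaining <- names(x)
  path <- list()
  for (step in seq_len(min(max.terms, length(remaining)))) {
    # Score every candidate model that adds one more variable.
    scores <- sapply(remaining, function(v) {
      dat <- data.frame(y = y, x[, c(selected, v), drop = FALSE])
      fit <- glm(y ~ ., data = dat, family = binomial)
      deviance(fit) + cp * length(coef(fit))
    })
    # Keep the variable whose addition gives the lowest score.
    best <- names(which.min(scores))
    selected <- c(selected, best)
    remaining <- setdiff(remaining, best)
    path[[step]] <- list(terms = selected, score = min(scores))
  }
  # Among the models on the forward path, return the one minimizing score.
  all_scores <- sapply(path, function(p) p$score)
  path[[which.min(all_scores)]]
}

set.seed(1)
n <- 100
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- rbinom(n, 1, plogis(2 * x$x1))
res <- forward_by_score(x, y, cp = log(n))
res$terms
```

Every step adds exactly one term, so the path holds at most max.terms nested models, and the final choice compares their scores rather than simply keeping the largest model.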
Value
A stepplr object is returned. The anova, predict, print, and summary functions can be applied to it.
fit 

action 
list that stores the selection order of the terms in the optimal model 
action.name 
list of the names of the sequentially added terms  in the same
order as in 
deviance 
deviance of the fitted model 
df 
residual degrees of freedom of the fitted model 
score 
deviance + cp*df, where df is the model degrees of freedom 
group 
vector of the counts for the dummy variables, to be used in

y 
response variable used 
weight 
weights used 
fix.subset 
fix.subset used 
level 
level used 
lambda 
lambda used 
cp 
complexity parameter used when computing the score 
type 
type used 
xnames 
column names of 
Author(s)
Mee Young Park and Trevor Hastie
References
Mee Young Park and Trevor Hastie (2008) Penalized Logistic Regression for Detecting Gene Interactions
See Also
cv.step.plr, plr, predict.stepplr
Examples
n <- 100
p <- 3
z <- matrix(sample(seq(3), n * p, replace = TRUE), nrow = n)
x <- data.frame(x1 = factor(z[, 1]), x2 = factor(z[, 2]), x3 = factor(z[, 3]))
y <- sample(c(0, 1), n, replace = TRUE)
fit <- step.plr(x, y)
# 'level' is automatically generated. Check 'fit$level'.
p <- 5
x <- matrix(sample(seq(3), n * p, replace = TRUE), nrow = n)
x <- cbind(rnorm(n), x)
y <- sample(c(0, 1), n, replace = TRUE)
level <- vector("list", length = 6)
for (i in 2:6) level[[i]] <- seq(3)
fit1 <- step.plr(x, y, level = level, cp = "aic")
fit2 <- step.plr(x, y, level = level, cp = 4)
fit3 <- step.plr(x, y, level = level, type = "forward")
fit4 <- step.plr(x, y, level = level, max.terms = 10)
# This is an example in which 'level' was input manually.
# level[[1]] should be either 'NULL' or 'NA' since the first factor is continuous.
