
Forward/backward variable selection for classification using any specified
classification function, selecting by an estimated classification performance measure from `ucpm`.

```
stepclass(x, ...)
## Default S3 method:
stepclass(x, grouping, method, improvement = 0.05, maxvar = Inf,
    start.vars = NULL, direction = c("both", "forward", "backward"),
    criterion = "CR", fold = 10, cv.groups = NULL, output = TRUE,
    min1var = TRUE, ...)
## S3 method for class 'formula'
stepclass(formula, data, method, ...)
```

`x` |
matrix or data frame containing the explanatory variables
(required, if `formula` is not given) |

`formula` |
A formula of the form |

`data` |
data matrix (rows=cases, columns=variables) |

`grouping` |
class indicator vector (a factor) |

`method` |
character, name of classification function
(e.g. “`lda`”) |

`improvement` |
minimum improvement in the performance measure required to include or exclude a variable (<= 1) |

`maxvar` |
maximum number of variables in model |

`start.vars` |
set variables to start with (indices or names).
Default is no variables if ‘ |

`direction` |
“`forward`”, “`backward`” or “`both`” (default) |

`criterion` |
performance measure taken from `ucpm` |

`fold` |
parameter for cross-validation; omitted if ‘`cv.groups`’ is specified |

`cv.groups` |
vector of group indicators for cross-validation. By default assigned automatically. |

`output` |
indicator (logical) for text output during computation (slows down computation!) |

`min1var` |
logical, whether to include at least one variable in the model, even if the prior itself already is a reasonable model. |

`...` |
further parameters passed to classification function (‘`method`’) |
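As a hedged illustration of the `cv.groups` argument, the sketch below builds one fold label per observation and passes it to `stepclass`. The variable name `my_folds`, the seed, and the choice of ten folds labelled 1 to 10 are assumptions made for this example; the call itself only runs if klaR and MASS are installed.

```r
set.seed(42)

# One fold label per observation; labels 1..10 are an arbitrary choice
n <- nrow(iris)
my_folds <- sample(rep(1:10, length.out = n))
table(my_folds)

# Guarded call: only runs if klaR and MASS are installed
if (requireNamespace("klaR", quietly = TRUE) &&
    requireNamespace("MASS", quietly = TRUE)) {
  library(MASS)
  sc <- klaR::stepclass(iris[, 1:4], iris$Species, method = "lda",
                        direction = "forward", cv.groups = my_folds,
                        output = FALSE)
  print(sc)
}
```

Fixing the fold assignment this way makes repeated runs comparable, since every candidate model is scored on the same partition of the data.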

The classification “method” (e.g. ‘`lda`’) must have its own ‘`predict`’ method
(like ‘`predict.lda`’ for ‘`lda`’) that returns either a matrix of posterior
probabilities or a list with an element ‘`posterior`’ containing that matrix.
It must be able to deal with matrices, as in `method(x, grouping, ...)`.
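To make the requirement on the `predict` method concrete, here is a small sketch using `lda` from MASS on the built-in `iris` data. It only demonstrates the interface described above (a matrix-based fit and a `posterior` element in the prediction); the variable names are chosen for this example.

```r
library(MASS)

# Fit on a plain matrix, as in method(x, grouping, ...)
x <- as.matrix(iris[, 1:4])
grouping <- iris$Species
fit <- lda(x, grouping)

# predict.lda returns a list; its `posterior` element is the matrix of
# class membership probabilities (one row per case, one column per class)
post <- predict(fit, x)$posterior
dim(post)
head(round(post, 3))
```

Any classification function whose `predict` method follows the same pattern should be usable as `method`.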

Then a stepwise variable selection is performed.
The initial model is defined by the provided starting variables;
in every step, new models are generated by including every single
variable that is not in the model, and by excluding every single
variable that is in the model. The performance measures of these
models are estimated (by cross-validation), and if the maximum value of the chosen
criterion is better than ‘`improvement`’ plus the value so far, the
corresponding variable is included or excluded. The procedure stops if
the new best value is not good enough, or if the specified maximum
number of variables is reached.
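The loop described above can be sketched in a few lines of base R. This is a simplified illustration, not the klaR implementation: it only moves forward, uses the plain correctness rate as criterion, ignores `maxvar`, and relies on a hypothetical helper `cv_rate` written for this example.

```r
library(MASS)

# Hypothetical helper (not part of klaR): k-fold cross-validated
# correctness rate of lda on the columns `vars` of `x`
cv_rate <- function(x, grouping, vars, folds) {
  correct <- 0
  for (k in unique(folds)) {
    test <- folds == k
    fit <- lda(x[!test, vars, drop = FALSE], grouping[!test])
    pred <- predict(fit, x[test, vars, drop = FALSE])$class
    correct <- correct + sum(pred == grouping[test])
  }
  correct / length(grouping)
}

set.seed(1)
x <- as.matrix(iris[, 1:4])
grouping <- iris$Species
folds <- sample(rep(1:10, length.out = nrow(x)))
improvement <- 0.05

selected <- "Sepal.Width"                       # starting variables
best <- cv_rate(x, grouping, selected, folds)

repeat {
  candidates <- setdiff(colnames(x), selected)
  if (length(candidates) == 0) break
  # Score every model obtained by adding one more variable
  rates <- sapply(candidates,
                  function(v) cv_rate(x, grouping, c(selected, v), folds))
  # Stop if the best candidate does not beat the current value
  # by more than `improvement`
  if (max(rates) <= best + improvement) break
  selected <- c(selected, names(which.max(rates)))
  best <- max(rates)
}

selected
best
```

A backward step would work the same way, scoring every model obtained by dropping one variable from `selected`.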

If ‘`direction`’ is “`forward`”, the model is only extended (by including
further variables); if ‘`direction`’ is “`backward`”, the model is only
reduced (by excluding variables from the model).

An object of class ‘`stepclass`’ containing the following components:

`call` |
the (matched) function call. |

`method` |
name of classification function used (e.g. “`lda`”) |

`start.variables` |
vector of starting variables. |

`process` |
data frame showing selection process (included/excluded variables and performance measure). |

`model` |
the final model: data frame with 2 columns; indices and names of variables. |

`perfomance.measure` |
value of the criterion used by `ucpm` |

`formula` |
formula of the form ‘ |
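A short sketch of how the returned components might be used, assuming klaR and MASS are installed (the code is guarded so it does nothing otherwise). The names `has_klar` and `final_fit` are introduced for this example only.

```r
has_klar <- requireNamespace("klaR", quietly = TRUE) &&
  requireNamespace("MASS", quietly = TRUE)

if (has_klar) {
  library(MASS)  # attach MASS so lda() is available for the refit below
  sc_obj <- klaR::stepclass(Species ~ ., data = iris, method = "lda",
                            output = FALSE)
  print(sc_obj$process)  # selection steps and performance values
  print(sc_obj$model)    # indices and names of the selected variables
  # Refit on the full data using only the selected variables
  final_fit <- lda(sc_obj$formula, data = iris)
  print(final_fit)
}
```

Because cross-validation groups are assigned at random by default, the selected variables can differ between runs unless a seed or `cv.groups` is fixed.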

Christian Röver, Irina Czogiel

`step`, `stepAIC`, and `greedy.wilks` for stepwise variable selection according to Wilks' lambda

```
library(klaR)   # provides stepclass(); loads MASS for lda() and qda()
library(MASS)
data(iris)
iris.d <- iris[,1:4] # the data
iris.c <- iris[,5] # the classes
sc_obj <- stepclass(iris.d, iris.c, "lda", start.vars = "Sepal.Width")
sc_obj
plot(sc_obj)
## or using formulas (here with qda and the "AS" criterion):
sc_obj <- stepclass(Species ~ ., data = iris, method = "qda",
    start.vars = "Sepal.Width", criterion = "AS")
sc_obj
## now you can say stuff like
## qda(sc_obj$formula, data = iris)
```

```
Loading required package: MASS
`stepwise classification', using 10-fold cross-validated correctness rate of method lda'.
150 observations of 4 variables in 3 classes; direction: both
stop criterion: improvement less than 5%.
correctness rate: 0.56; starting variables (1): Sepal.Width
correctness rate: 0.96; in: "Petal.Width"; variables (2): Sepal.Width, Petal.Width
hr.elapsed min.elapsed sec.elapsed
0.000 0.000 0.266
method : lda
final model : iris.c ~ Sepal.Width + Petal.Width
<environment: 0x17d5c50>
correctness rate = 0.96
`stepwise classification', using 10-fold cross-validated abiltity to seperate of method qda'.
150 observations of 4 variables in 3 classes; direction: both
stop criterion: improvement less than 5%.
abiltity to seperate: 0.33766; starting variables (1): Sepal.Width
abiltity to seperate: 0.94814; in: "Petal.Width"; variables (2): Sepal.Width, Petal.Width
hr.elapsed min.elapsed sec.elapsed
0.000 0.000 0.159
method : qda
final model : Species ~ Sepal.Width + Petal.Width
<environment: 0x2afa5e0>
abiltity to seperate = 0.9481
```
