stabilityKappa: Stability Measure Kappa
In stabm: Stability Measures for Feature Selection

View source: R/stability_functions_corrected.R

stabilityKappa

R Documentation

Stability Measure Kappa

Description

The stability of feature selection is defined as the robustness of the sets of selected features with respect to small variations in the data on which the feature selection is conducted. To quantify stability, several datasets from the same data generating process can be used. Alternatively, a single dataset can be split into parts by resampling. Either way, all datasets used for feature selection must contain exactly the same features. The feature selection method of interest is applied to each dataset and the set of chosen features is recorded for each. The stability of the feature selection is then assessed from these sets of chosen features using a stability measure.
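The resampling workflow described above can be sketched in base R. Everything in this sketch is an assumption made for illustration: the simulated data, the subsample fraction of 80 percent, and the hypothetical helper `select_top_k`, which stands in for whatever feature selection method is of interest.

```r
set.seed(1)                            # reproducible subsamples
n <- 100
p <- 10
x <- matrix(rnorm(n * p), nrow = n)    # simulated predictors (assumption)
y <- x[, 1] + x[, 2] + rnorm(n)        # response driven by features 1 and 2

# Hypothetical filter: keep the k features most correlated with the response.
select_top_k <- function(x, y, k = 3) {
  scores <- abs(cor(x, y))[, 1]
  sort(order(scores, decreasing = TRUE)[seq_len(k)])
}

# Apply the same selection to m subsamples of the rows. Every subsample
# keeps all p columns, so all datasets contain exactly the same features.
m <- 5
feats <- lapply(seq_len(m), function(i) {
  rows <- sample(n, size = floor(0.8 * n))
  select_top_k(x[rows, , drop = FALSE], y[rows])
})
```

The resulting list `feats` has the shape expected by the `features` argument, so it could be passed on as `stabilityKappa(features = feats, p = p)`.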

Usage

stabilityKappa(features, p, impute.na = NULL)

Arguments

`features`	`list (length >= 2)` Chosen features per dataset. Each element of the list contains the features for one dataset. The features must be given by their names (`character`) or indices (`integerish`).
`p`	`numeric(1)` Total number of features in the datasets.
`impute.na`	`numeric(1)` In some scenarios, the stability cannot be assessed based on all feature sets. E.g. if some of the feature sets are empty, the respective pairwise comparisons yield NA as result. With which value should these missing values be imputed? `NULL` means no imputation.
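The idea behind `impute.na` can be illustrated with a small base R sketch. The pairwise scores below are made-up values, not output of the package, and the assumption that imputation happens on the pairwise results before averaging is inferred from the argument description rather than taken from the package source.

```r
# Hypothetical pairwise scores; the second comparison could not be evaluated.
pair_scores <- c(0.8, NA, 0.6)

# Without imputation the average is undefined.
unimputed <- mean(pair_scores)

# With imputation, each missing pairwise result is replaced by the chosen
# value (here 0) before the scores are averaged.
impute_value <- 0
imputed <- ifelse(is.na(pair_scores), impute_value, pair_scores)
stability <- mean(imputed)
```

The choice of the imputation value changes the result, so it should reflect how a comparison that cannot be evaluated is meant to be interpreted.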

Details

The stability measure is defined as the average kappa coefficient between all pairs of feature sets. It can be rewritten as (see Notation)

\frac{2}{m (m - 1)} \sum_{i=1}^{m-1} \sum_{j = i+1}^m \frac{|V_i \cap V_j| - \frac{|V_i| \cdot |V_j|}{p}} {\frac{|V_i| + |V_j|}{2} - \frac{|V_i| \cdot |V_j|}{p}}.
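As a reading aid, the formula can be evaluated directly in base R. This is a minimal sketch of the expression above, not the implementation used by the package; the helper name `kappa_sketch` is hypothetical, and it does no input checking or NA handling.

```r
# Illustrative evaluation of the formula (hypothetical helper,
# not the function exported by stabm).
kappa_sketch <- function(features, p) {
  m <- length(features)
  pairs <- utils::combn(m, 2)          # all index pairs (i, j) with i < j
  scores <- apply(pairs, 2, function(ij) {
    vi <- unique(features[[ij[1]]])
    vj <- unique(features[[ij[2]]])
    chance <- length(vi) * length(vj) / p   # |V_i| * |V_j| / p
    (length(intersect(vi, vj)) - chance) /
      ((length(vi) + length(vj)) / 2 - chance)
  })
  mean(scores)                         # same as 2 / (m (m - 1)) * sum
}

feats <- list(1:3, 1:4, 1:5)
kappa_value <- kappa_sketch(feats, p = 10)
```

For these three sets and p = 10, the pairwise terms are 18/23, 3/5 and 4/5, so the sketch evaluates to 251/345, roughly 0.7275.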

Value

numeric(1) Stability value.

Notation

For the definition of all stability measures in this package, the following notation is used: Let V_1, \ldots, V_m denote the sets of chosen features for the m datasets, i.e. features has length m and V_i is a set which contains the i-th entry of features. Furthermore, let h_j denote the number of sets that contain feature X_j so that h_j is the absolute frequency with which feature X_j is chosen. Analogously, let h_{ij} denote the number of sets that include both X_i and X_j. Also, let q = \sum_{j=1}^p h_j = \sum_{i=1}^m |V_i| and V = \bigcup_{i=1}^m V_i.
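The quantities in this notation can be computed for a small example in base R. The sketch assumes features are given as integer indices between 1 and p; for features given by name, a lookup against the full vector of feature names would be needed instead of `tabulate`.

```r
feats <- list(1:3, 1:4, 1:5)   # V_1, V_2, V_3, so m = 3
p <- 10

# h_j: number of sets that contain feature j, for j = 1, ..., p
h <- tabulate(unlist(lapply(feats, unique)), nbins = p)

# q: total number of selections, equal to the sum of the set sizes
q <- sum(h)

# V: union of all chosen feature sets
V <- sort(unique(unlist(feats)))
```

Here features 1 to 3 are chosen in all three sets, feature 4 in two and feature 5 in one, so q = 12 and V contains the features 1 to 5.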

References

Carletta, Jean (1996). “Assessing Agreement on Classification Tasks: The Kappa Statistic.” Computational Linguistics, 22(2), 249–254.

Bommert A (2020). Integration of Feature Selection Stability in Model Fitting. Ph.D. thesis, TU Dortmund University, Germany. doi:10.17877/DE290R-21906.

Examples

library(stabm)

# Three sets of chosen features, given by index, out of p = 10 features
feats = list(1:3, 1:4, 1:5)
stabilityKappa(features = feats, p = 10)

stabm documentation built on April 4, 2023, 5:12 p.m.
