View source: R/similarity_phi.R
phi | R Documentation |
Measure to compare two or more sets w.r.t. their similarity.
phi(sets, p, na_value = NaN, ...)
sets |
( |
p |
( |
na_value |
( |
... |
( |
The Phi Coefficient is defined as the Pearson correlation between the binary
representation of two sets A
and B
.
The binary representation for A
is a logical vector of
length p
with the i-th element being 1 if the corresponding
element is in A
, and 0 otherwise.
If more than two sets are provided, the mean of all pairwise scores is calculated.
This measure is undefined if one set contains none or all possible elements.
Performance value as numeric(1)
.
Type: "similarity"
Range: [-1, 1]
Minimize: FALSE
Nogueira S, Brown G (2016). “Measuring the Stability of Feature Selection.” In Machine Learning and Knowledge Discovery in Databases, 442–457. Springer International Publishing. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/978-3-319-46227-1_28")}.
Bommert A, Rahnenführer J, Lang M (2017). “A Multicriteria Approach to Find Predictive and Sparse Models with Stable Feature Selection for High-Dimensional Data.” Computational and Mathematical Methods in Medicine, 2017, 1–18. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1155/2017/7907163")}.
Bommert A, Lang M (2021). “stabm: Stability Measures for Feature Selection.” Journal of Open Source Software, 6(59), 3010. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.21105/joss.03010")}.
Package stabm which implements many more stability measures with included correction for chance.
Other Similarity Measures:
jaccard()
set.seed(1)
sets = list(
sample(letters[1:3], 1),
sample(letters[1:3], 2)
)
phi(sets, p = 3)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.