fsSimilarity: Calculating similarity of two feature sets

View source: R/utils.R

fsSimilarity  R Documentation

Calculating similarity of two feature sets

Description

fsSimilarity implements several methods for calculating the similarity of two feature sets.

Usage

fsSimilarity(
  feature.set1,
  feature.set2,
  cutoff = FALSE,
  threshold = 1,
  method = c("Kuncheva", "Jaccard", "Hamming")
)

Arguments

feature.set1

a matrix that contains feature weights.

feature.set2

a matrix that contains feature weights.

cutoff

logical. If TRUE, the input feature sets are cut off using the cutoff function with the specified threshold. Defaults to FALSE.

threshold

the threshold for feature selection using the cutoff function. Defaults to 1 (no cut-off). Used only when cutoff is TRUE.

method

a similarity metric. Implemented metrics:

  • "Jaccard" - the share of matching features relative to the maximal possible number of matching features (Jaccard similarity)

  • "Kuncheva" - a Kuncheva-like index that corrects for the number of features expected to match by chance; see Kuncheva (2007)

  • "Hamming" - a similarity based on the Hamming distance, normalised to [0, 1], where 1 indicates identical matrices

Value

Returns a value in the interval [-1, 1] for "Kuncheva" and in the interval [0, 1] for the other metrics; 1 indicates identical feature sets.
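
To make the metrics and their ranges concrete, the following is a minimal sketch in base R, not the package's implementation. The function names jaccard_sim, hamming_sim and kuncheva_index are hypothetical, and the inputs are assumed to be binary (0/1) matrices of equal dimensions in which 1 marks a selected feature. The Kuncheva function follows the original consistency index from Kuncheva (2007), which assumes both sets contain the same number of features; the "Kuncheva-like" variant used by fsSimilarity may differ, for example in how it treats sets of unequal size.

```r
# Illustrative sketch only: these helpers are hypothetical, not part of fsMTS.
# a, b: binary matrices of equal dimensions (non-zero = feature selected).

jaccard_sim <- function(a, b) {
  a <- a != 0
  b <- b != 0
  n_union <- sum(a | b)
  if (n_union == 0) return(1)  # assumption: two empty sets count as identical
  sum(a & b) / n_union         # |A and B| / |A or B|
}

hamming_sim <- function(a, b) {
  # Share of positions on which the two matrices agree (1 - normalised distance)
  1 - mean((a != 0) != (b != 0))
}

kuncheva_index <- function(a, b) {
  # Original Kuncheva (2007) index; requires equal-sized, non-trivial sets
  a <- a != 0
  b <- b != 0
  n <- length(a)  # total number of candidate features
  k <- sum(a)     # number of selected features in each set
  stopifnot(sum(b) == k, k > 0, k < n)
  r <- sum(a & b) # number of features selected in both sets
  (r * n - k^2) / (k * (n - k))
}

# Two 2 x 3 feature sets with three selected features each, two in common
a <- matrix(c(1, 1, 1, 0, 0, 0), nrow = 2)
b <- matrix(c(1, 1, 0, 1, 0, 0), nrow = 2)

jaccard_sim(a, b)     # 2 / 4 = 0.5
hamming_sim(a, b)     # 1 - 2 / 6, about 0.667
kuncheva_index(a, b)  # (2 * 6 - 3^2) / (3 * 3), about 0.333
```

Under these assumptions all three functions return 1 for identical inputs, and only the chance-corrected index can fall below 0, which is consistent with the intervals stated above.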

References

Kuncheva, L. (2007). A stability index for feature selection. In: 25th IASTED International Multi-Conference: Artificial Intelligence and Applications, pp. 390–395.

Examples


# Load the package and the bundled traffic data
library(fsMTS)
data(traffic.mini)

# Scaling is sometimes useful for feature selection
# Exclude the first column - it contains timestamps
data <- scale(traffic.mini$data[, -1])

# Build two feature sets with different selection methods
mCCF <- fsMTS(data, max.lag = 3, method = "CCF")
mLARS <- fsMTS(data, max.lag = 3, method = "LARS")

# Compare the feature sets after applying the cut-off with threshold 0.2
fsSimilarity(mCCF, mLARS, cutoff = TRUE, threshold = 0.2, method = "Kuncheva")
fsSimilarity(mCCF, mLARS, cutoff = TRUE, threshold = 0.2, method = "Jaccard")
fsSimilarity(mCCF, mLARS, cutoff = TRUE, threshold = 0.2, method = "Hamming")


fsMTS documentation built on April 26, 2022, 9:05 a.m.