fsSimilarity: Calculating similarity of two feature sets
In fsMTS: Feature Selection for Multivariate Time Series

fsSimilarity

R Documentation


Description

fsSimilarity implements several methods for calculating the similarity of two feature sets.

Usage

fsSimilarity(
  feature.set1,
  feature.set2,
  cutoff = FALSE,
  threshold = 1,
  method = c("Kuncheva", "Jaccard", "Hamming")
)

Arguments

`feature.set1`	a matrix that contains feature weights.
`feature.set2`	a matrix that contains feature weights.
`cutoff`	logical. If TRUE, the input feature sets are cut off using the `cutoff` function with the specified `threshold`. Defaults to FALSE.
`threshold`	the threshold for feature selection using the `cutoff` function. Defaults to 1 (no cut-off).
`method`	a similarity metric. Implemented metrics:
"Jaccard" - the share of matching features relative to the maximal possible number of matching features (Jaccard similarity).
"Kuncheva" - a Kuncheva-like correction for the expected number of features matched by chance; see Kuncheva (2007).
"Hamming" - Hamming distance, normalised to [0,1], where 1 corresponds to identical matrices.
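
As a rough illustration of what the three metrics measure, the sketch below computes textbook versions of them on two small 0/1 matrices. The helper names (`jaccard_sim`, `hamming_sim`, `kuncheva_index`) and the toy matrices are hypothetical and not part of fsMTS. The Kuncheva index is shown in its original equal-size form, so values returned by fsSimilarity itself may differ in detail.

```r
# Hypothetical helpers for illustration only; not the fsMTS implementation.
# Each takes two matrices of the same shape; non-zero entries mark selected features.

jaccard_sim <- function(a, b) {
  a <- a != 0
  b <- b != 0
  union_size <- sum(a | b)
  # Assumption: two empty feature sets are treated as identical
  if (union_size == 0) return(1)
  sum(a & b) / union_size
}

hamming_sim <- function(a, b) {
  a <- a != 0
  b <- b != 0
  # One minus the share of positions where the two masks disagree
  1 - mean(a != b)
}

kuncheva_index <- function(a, b) {
  a <- a != 0
  b <- b != 0
  n <- length(a)  # total number of candidate features
  k <- sum(a)     # number of selected features (assumed equal in both sets)
  stopifnot(sum(b) == k, k > 0, k < n)
  r <- sum(a & b) # features selected in both sets
  (r * n - k^2) / (k * (n - k))
}

# Toy masks: 6 candidate features, 3 selected in each set, 2 in common
m1 <- matrix(c(1, 0, 1, 0, 0, 1), nrow = 2)
m2 <- matrix(c(1, 0, 0, 1, 0, 1), nrow = 2)

jaccard_sim(m1, m2)     # 2 shared / 4 in the union = 0.5
hamming_sim(m1, m2)     # 1 - 2/6 = 2/3
kuncheva_index(m1, m2)  # (2*6 - 3^2) / (3*3) = 1/3
```

Working on logical masks keeps the three functions comparable: each one reduces to counting agreements and disagreements between the same two masks.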

Value

Returns a value from the [-1, 1] interval for "Kuncheva" and from the [0, 1] interval for the other methods, where 1 corresponds to identical feature sets.
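
To see why the "Kuncheva" range extends below zero, it may help to recall the original stability index from Kuncheva (2007), defined for two subsets of equal size k drawn from n features, with r features in common. This is the textbook form under the equal-size assumption; the "Kuncheva-like" correction used by fsSimilarity may differ in detail.

```latex
\[
I_C \;=\; \frac{r - k^2/n}{k - k^2/n} \;=\; \frac{rn - k^2}{k\,(n - k)},
\qquad 0 < k < n .
\]
```

Here k^2/n is the overlap expected by chance. The index equals 1 when r = k (identical sets), 0 when the overlap equals the chance level, and is negative when the overlap is smaller than chance. For example, n = 6, k = 3 and r = 2 give (12 - 9)/9 = 1/3.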

References

Kuncheva L. (2007). A stability index for feature selection. In: 25th IASTED International Multi-Conference: Artificial Intelligence and Applications, pp. 390–395.

Examples


# Load the package and the traffic data
library(fsMTS)
data(traffic.mini)

# Scaling is sometimes useful for feature selection
# Exclude the first column - it contains timestamps
data <- scale(traffic.mini$data[, -1])

# Estimate feature weights with two different algorithms
mCCF <- fsMTS(data, max.lag = 3, method = "CCF")
mLARS <- fsMTS(data, max.lag = 3, method = "LARS")

# Compare the two feature sets after a cut-off with threshold 0.2,
# once for each similarity metric
fsSimilarity(mCCF, mLARS, cutoff = TRUE, threshold = 0.2, method = "Kuncheva")
fsSimilarity(mCCF, mLARS, cutoff = TRUE, threshold = 0.2, method = "Jaccard")
fsSimilarity(mCCF, mLARS, cutoff = TRUE, threshold = 0.2, method = "Hamming")

fsMTS documentation built on April 26, 2022, 9:05 a.m.