lin1: Lin 1 (LIN1) Measure

Description

The function calculates a dissimilarity matrix based on the LIN1 similarity measure.

Usage

lin1(data, var.weights = NULL)

Arguments

data

A data.frame or a matrix with cases in rows and variables in columns.

var.weights

A numeric vector of weights for the variables in data, one weight per variable. Each weight is a real number between zero and one.

Details

The Lin 1 similarity measure was introduced by Boriah et al. (2008) as a modification of the original Lin measure (Lin, 1998). It has a complex system of weights. In case of a mismatch, lower similarity is assigned if the mismatching values are very frequent or if other categories have relative frequencies lying between those of the two mismatching values. Higher similarity is assigned if the mismatching categories are infrequent and there are few other infrequent categories. In case of a match, lower similarity is given to matches on frequent categories and to matches on categories that share their frequency with many other categories. Higher similarity is given to matches on infrequent categories.
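
The weighting logic described above can be illustrated with a minimal base R sketch of the unweighted per-variable score, assuming the definition given in Boriah et al. (2008): for a pair of categories, take the set Q of categories whose relative frequency lies between the relative frequencies of the two compared categories (bounds included), then use the sum of the log frequencies over Q for a match and twice the log of the summed frequencies over Q for a mismatch. The helper lin1_raw and the toy vector x are hypothetical and are not part of nomclust. The sketch omits the normalizing weights and the conversion to a dissimilarity, so it does not reproduce the output of lin1().

```r
# Hypothetical helper, not part of nomclust: raw (unweighted) Lin 1 score
# for one variable, following the definition assumed from Boriah et al. (2008).
lin1_raw <- function(x, a, b) {
  counts <- table(x)
  # relative frequency of each category, as a named numeric vector
  p <- setNames(as.numeric(counts) / length(x), names(counts))
  lo <- min(p[[a]], p[[b]])
  hi <- max(p[[a]], p[[b]])
  # Q: categories whose relative frequency lies in [lo, hi]
  q <- p[p >= lo & p <= hi]
  if (a == b) {
    sum(log(q))      # match: sum of log frequencies over Q
  } else {
    2 * log(sum(q))  # mismatch: twice the log of the summed frequencies over Q
  }
}

# Toy variable with relative frequencies a = 0.5, b = 0.25, c = d = 0.125
x <- c("a", "a", "a", "a", "b", "b", "c", "d")

lin1_raw(x, "a", "a")  # match on a frequent category
lin1_raw(x, "c", "c")  # match on a rare category that shares its frequency with "d"
lin1_raw(x, "a", "c")  # mismatch; every category falls between the two frequencies
```

All raw scores are non-positive. On this toy variable, the match on "c" gives a more negative value than the match on "a", because "c" is rare and shares its frequency with "d", while the mismatch between "a" and "c" spans every category and gives 0.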

Value

The function returns an object of the class "dist".

Author(s)

Zdenek Sulc.
Contact: zdenek.sulc@vse.cz

References

Boriah S., Chandola V., Kumar V. (2008). Similarity measures for categorical data: A comparative evaluation. In: Proceedings of the 8th SIAM International Conference on Data Mining, SIAM, pp. 243-254.

Lin D. (1998). An information-theoretic definition of similarity. In: ICML '98: Proceedings of the 15th International Conference on Machine Learning. San Francisco, pp. 296-304.

See Also

anderberg, burnaby, eskin, gambaryan, goodall1, goodall2, goodall3, goodall4, iof, lin, of, sm, smirnov, ve, vm.

Examples

# load the package and the sample data
library(nomclust)
data(data20)

# dissimilarity matrix calculation
prox.lin1 <- lin1(data20)

# dissimilarity matrix calculation with variable weights
weights.lin1 <- lin1(data20, var.weights = c(0.7, 1, 0.9, 0.5, 0))


nomclust documentation built on Aug. 18, 2023, 5:06 p.m.