NominalDistances: Distances among individuals with nominal variables
In MultBiplotR: Multivariate Analysis Using Biplots in R

NominalDistances

R Documentation

Distances among individuals with nominal variables

Description

This function computes several measures of distance (or similarity) among individuals from a nominal data matrix.

Usage

NominalDistances(X, method = 1, diag = FALSE, upper = FALSE, similarity = TRUE)

Arguments

`X`	Matrix or data.frame with the nominal variables.
`method`	An integer between 1 and 6 selecting the similarity measure. See Details.
`diag`	A logical value indicating whether the diagonal of the distance matrix should be printed.
`upper`	A logical value indicating whether the upper triangle of the distance matrix should be printed.
`similarity`	A logical value indicating whether the similarity matrix should be computed.

Details

Let X be the table of nominal data. All these distances are of the form d=\sqrt{1-s}, where s is a similarity coefficient.
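The transformation from similarity to distance can be sketched in a few lines of R. This is an illustration only: `sim_to_dist` is a hypothetical helper, not a function of MultBiplotR, and the similarity matrix `S` below is made-up toy data with entries in [0, 1].

```r
# Hypothetical helper (not part of MultBiplotR): convert a similarity
# matrix with entries in [0, 1] into distances via d = sqrt(1 - s).
sim_to_dist <- function(S) {
  stopifnot(is.matrix(S), all(S >= 0 & S <= 1))
  sqrt(1 - S)
}

# Toy similarity matrix for three individuals (illustrative values only)
S <- matrix(c(1.00, 0.75, 0.00,
              0.75, 1.00, 0.36,
              0.00, 0.36, 1.00), nrow = 3, byrow = TRUE)

D <- sim_to_dist(S)
D
```

A similarity of 1 maps to a distance of 0 and a similarity of 0 maps to a distance of 1, so identical individuals end up at distance zero.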

1 = Overlap: The overlap measure simply counts the number of attributes that match in the two data instances.
2 = Eskin: Eskin et al. proposed a normalization kernel for record-based network intrusion detection data. The original measure is distance-based and assigns a weight of \frac{2}{n_{k}^{2}} to mismatches, where n_{k} is the number of values taken by attribute k; when adapted to similarity, this becomes a weight of \frac{n_{k}^{2}}{n_{k}^{2}+2}. This measure gives more weight to mismatches that occur on attributes that take many values.
3 = IOF (Inverse Occurrence Frequency): This measure assigns lower similarity to mismatches on more frequent values. The IOF measure is related to the concept of inverse document frequency, which comes from information retrieval, where it is used to signify the relative number of documents that contain a specific word.
4 = OF (Occurrence Frequency): This measure gives the opposite weighting of the IOF measure for mismatches, i.e., mismatches on less frequent values are assigned lower similarity and mismatches on more frequent values are assigned higher similarity.
5 = Goodall3: This measure assigns a high similarity if the matching values are infrequent, regardless of the frequencies of the other values.
6 = Lin: This measure gives higher weight to matches on frequent values, and lower weight to mismatches on infrequent values.
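As a rough illustration of the first four measures, the sketch below follows the per-attribute definitions given in Boriah et al. (2008) and averages them over attributes. It is not the package's implementation and may differ from what `NominalDistances` returns: the function name `nominal_similarity`, the equal attribute weights 1/p, and the toy data are all assumptions made for this example.

```r
# Hypothetical helper (not part of MultBiplotR). For each attribute k a
# match scores 1; a mismatch scores a method-specific value (formulas as
# given in Boriah et al., 2008). Scores are averaged over the p attributes.
nominal_similarity <- function(X, method = c("overlap", "eskin", "iof", "of")) {
  method <- match.arg(method)
  X <- as.data.frame(X, stringsAsFactors = FALSE)
  X[] <- lapply(X, as.character)
  M <- as.matrix(X)
  n <- nrow(M)
  p <- ncol(M)
  S <- matrix(0, n, n)
  for (k in seq_len(p)) {
    freq <- table(M[, k])            # f_k(x): frequency of each value
    nk <- length(freq)               # n_k: number of distinct values
    f <- as.numeric(freq[M[, k]])    # frequency of each individual's value
    mismatch <- switch(method,
      overlap = matrix(0, n, n),
      eskin   = matrix(nk^2 / (nk^2 + 2), n, n),
      iof     = 1 / (1 + outer(log(f), log(f))),
      of      = 1 / (1 + outer(log(n / f), log(n / f))))
    is_match <- outer(M[, k], M[, k], "==")
    S <- S + ifelse(is_match, 1, mismatch)
  }
  S / p
}

# Toy nominal data (illustrative only)
toy <- data.frame(
  colour = c("red", "red", "blue", "blue", "green"),
  shape  = c("round", "square", "round", "round", "square")
)

S_overlap <- nominal_similarity(toy, "overlap")
S_eskin   <- nominal_similarity(toy, "eskin")
S_iof     <- nominal_similarity(toy, "iof")
S_of      <- nominal_similarity(toy, "of")

# Distances of the form d = sqrt(1 - s)
D_overlap <- sqrt(1 - S_overlap)
```

In this toy table, individuals 1 and 2 agree on `colour` and differ on `shape`, so their overlap similarity is 0.5; the other three measures replace the zero mismatch score with a value that depends on the number or frequency of the attribute's values. Goodall3 and Lin are omitted to keep the sketch short.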

Value

An object of class distance

Author(s)

Jose L. Vicente-Villardon

References

Boriah, S., Chandola, V. & Kumar, V. (2008). Similarity measures for categorical data: A comparative evaluation. In Proceedings of the Eighth SIAM International Conference on Data Mining, pp. 243–254.

Examples

## Not run: 
library(MultBiplotR)
data(Env)
# method = 1 selects the overlap measure (see Details)
Distance <- NominalDistances(Env, upper = TRUE, diag = TRUE, similarity = FALSE, method = 1)

## End(Not run)

MultBiplotR documentation built on Nov. 21, 2023, 5:08 p.m.
