cosine_similarity: Function for calculating the cosine similarity


View source: R/cluster-classifier.R

Description

Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space, defined as the cosine of the angle between them. Two vectors with the same orientation have a cosine similarity of 1, perpendicular vectors have a similarity of 0, and vectors with opposing directions have a similarity of -1, independent of their magnitudes. One advantage of cosine similarity is its low complexity, especially for sparse vectors, where only the non-zero dimensions need to be considered; this is a common case in GSgalgoR. When applied to binary data, cosine similarity is also known as the Otsuka-Ochiai coefficient. This is the case in GSgalgoR, where individual solutions, represented as strings of 0s and 1s, are compared with this metric.
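The definition above can be written out as a formula. The symbols n (the vector length), n_{11} (the number of positions where both vectors are 1), and n_a and n_b (the number of 1s in each vector) are notation introduced here for illustration and do not come from the package documentation.

```latex
\cos(\theta)
  = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^2} \, \sqrt{\sum_{i=1}^{n} b_i^2}}

% Binary case: a_i, b_i \in \{0, 1\}, so a_i^2 = a_i and b_i^2 = b_i.
\cos(\theta) = \frac{n_{11}}{\sqrt{n_a \, n_b}}
```

In the binary form only the positions holding a 1 contribute, which is why sparse binary solutions are cheap to compare.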

Usage

cosine_similarity(a, b)

Arguments

a, b

Two numeric vectors of equal length. They can also be two binary vectors (strings of 0s and 1s).

Value

The function returns a numeric value from -1 to 1, depending on the orientation of the vectors: a cosine similarity of 1 implies that the vectors have the same orientation, while -1 implies that they point in opposing directions. For binary vectors, values range from 0 to 1, where 0 indicates totally discordant vectors (no shared 1s) and 1 indicates identical binary vectors.
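The ranges described above can be checked with a minimal sketch. The helper cosine_sketch below is hypothetical and written only for illustration; it applies the textbook formula and is not claimed to be how GSgalgoR implements cosine_similarity.

```r
# Hypothetical helper for illustration; NOT the GSgalgoR implementation.
# Computes the dot product divided by the product of the Euclidean norms.
cosine_sketch <- function(a, b) {
  stopifnot(length(a) == length(b))
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# General numeric vectors: the result depends on orientation, not magnitude.
v <- c(1, 2, 3)
same     <- cosine_sketch(v, 2 * v)  # same orientation, different magnitude
opposite <- cosine_sketch(v, -v)     # opposing directions

# Binary vectors: the result stays between 0 and 1.
s1 <- c(1, 0, 0, 1)
s2 <- c(1, 1, 0, 0)
partial  <- cosine_sketch(s1, s2)      # one shared 1, two 1s in each vector
disjoint <- cosine_sketch(s1, 1 - s1)  # no shared 1s
```

Up to floating-point rounding, same is 1, opposite is -1, partial is 0.5, and disjoint is 0.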

Examples

solution1 <- c(1, 0, 0, 1, 0, 0, 1)
solution2 <- solution1
r <- cosine_similarity(solution1, solution2)
# the cosine similarity (r) equals 1
solution2 <- abs(solution1 - 1)
r2 <- cosine_similarity(solution1, solution2)
# the cosine similarity (r2) equals 0

harpomaxx/GSgalgoR documentation built on Oct. 25, 2020, 3:47 p.m.