This vignette explains how proxyC computes the similarity and distance measures.
The measures are defined for two vectors of the same length:
$$ \vec{x} = [x_1, x_2, \dots, x_n] \\ \vec{y} = [y_1, y_2, \dots, y_n] $$
The length of the vectors is $n = ||\vec{x}|| = ||\vec{y}||$, while $|\vec{x}|$ denotes the element-wise absolute values of $\vec{x}$.
Operations on vectors are element-wise:
$$ \vec{z} = \vec{x}\vec{y} \\ n = ||\vec{x}|| = ||\vec{y}|| = ||\vec{z}|| $$
Summation of the elements of vectors is written using sigma without specifying the range:
$$ \sum{\vec{x}} = \sum_{i=1}^{n}{x_i} $$
When a vector is compared with a value inside a pair of square brackets, the summation counts the number of elements that are equal (or unequal) to that value:
$$ \sum{[\vec{x} = 1]} = \sum_{i=1}^{n}{[x_i = 1]} $$
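The notation maps closely onto vectorised R. The following sketch uses two made-up vectors (the values are illustrative and not taken from proxyC) to show the element-wise product, the summation, and the bracket-counting convention.

```r
# Two illustrative vectors of equal length (values are made up).
x <- c(1, 0, 2, 1, 0)
y <- c(1, 1, 0, 1, 0)

n <- length(x)           # ||x||, the number of elements
z <- x * y               # element-wise product of x and y
total <- sum(x)          # summation over all elements of x
ones <- sum(x == 1)      # bracket notation: number of elements equal to 1
abs_diff <- abs(x - y)   # |x - y|, element-wise absolute values
```

Here `x == 1` yields a logical vector, and `sum()` counts its `TRUE` entries, which is what the square brackets express.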
Similarity measures are available in proxyC::simil().
Cosine similarity:
$$ simil = \frac{\sum{\vec{x}\vec{y}}}{\sqrt{\sum{\vec{x} ^ 2}} \sqrt{\sum{\vec{y} ^ 2}}} $$
Pearson correlation coefficient:
$$ simil = \frac{Cov(\vec{x},\vec{y})}{\sqrt{Var(\vec{x})} \sqrt{Var(\vec{y})}} $$
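As a rough check of the two measures, the sketch below computes cosine similarity and the Pearson correlation in base R for two made-up vectors. The correlation divides the covariance by the product of the standard deviations (the square roots of the variances); comparing with base R's `cor()` is only a sanity check and says nothing about how proxyC is implemented internally.

```r
# Illustrative vectors (values are made up).
x <- c(1, 0, 2, 1, 0)
y <- c(1, 1, 0, 1, 0)

# Cosine similarity: sum of products over the product of Euclidean norms.
cosine <- sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))

# Pearson correlation: covariance over the product of standard deviations.
pearson <- cov(x, y) / (sd(x) * sd(y))

# The manual calculation should agree with base R's cor().
all.equal(pearson, cor(x, y))
```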
The values of $x$ and $y$ are Boolean for "jaccard".
$$ e = \sum{\vec{x} \vec{y}} \\ w = \text{user-provided weight} \\ simil = \frac{e}{\sum{\vec{x} ^ w} + \sum{\vec{y} ^ w} - e} $$
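A minimal base R sketch of the formula above. The helper name `jaccard_sim` and its argument names are hypothetical and chosen for illustration; they are not part of proxyC. For Boolean vectors with `w = 1` the result should match the set-based definition (size of the intersection over size of the union).

```r
# Hypothetical helper implementing the formula above; not a proxyC function.
jaccard_sim <- function(x, y, w = 1) {
  e <- sum(x * y)                      # overlap between x and y
  e / (sum(x^w) + sum(y^w) - e)
}

# Boolean example (values are made up).
x <- c(1, 0, 1, 1, 0)
y <- c(1, 1, 0, 1, 0)
jaccard_sim(x, y)

# Set-based cross-check: |intersection| / |union|.
sum(x & y) / sum(x | y)
```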
For the fuzzy Jaccard similarity, the values must satisfy $0 \le x \le 1.0$ and $0 \le y \le 1.0$.
$$ simil = \frac{\sum{min(\vec{x}, \vec{y})}}{\sum{max(\vec{x}, \vec{y})}} $$
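The min/max ratio can be sketched with `pmin()` and `pmax()`, the element-wise minimum and maximum in base R. The helper name `fuzzy_jaccard` is hypothetical, and the range check simply mirrors the constraint stated above.

```r
# Hypothetical helper for the min/max ratio; not a proxyC function.
fuzzy_jaccard <- function(x, y) {
  # Values are expected to lie in [0, 1], as stated in the text.
  stopifnot(all(x >= 0 & x <= 1), all(y >= 0 & y <= 1))
  sum(pmin(x, y)) / sum(pmax(x, y))
}

# Illustrative membership values (made up).
x <- c(0.2, 0.8, 0.0, 1.0)
y <- c(0.4, 0.5, 0.1, 1.0)
fuzzy_jaccard(x, y)
```

When both vectors contain only 0 and 1, the element-wise minimum marks the intersection and the maximum marks the union, so the ratio reduces to the Boolean Jaccard similarity.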
The values of $x$ and $y$ are Boolean for "dice".
$$ e = \sum{\vec{x} \vec{y}} \\ w = \text{user-provided weight} \\ simil = \frac{2 e}{\sum{\vec{x} ^ w} + \sum{\vec{y} ^ w}} $$
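The Dice formula can be sketched in the same way. The helper name `dice_sim` is hypothetical. For Boolean vectors with `w = 1`, the Dice coefficient $D$ and the Jaccard similarity $J$ are related by $D = 2J / (1 + J)$, which the last line checks on made-up data.

```r
# Hypothetical helper implementing the formula above; not a proxyC function.
dice_sim <- function(x, y, w = 1) {
  e <- sum(x * y)                      # overlap between x and y
  2 * e / (sum(x^w) + sum(y^w))
}

# Boolean example (values are made up).
x <- c(1, 0, 1, 1, 0)
y <- c(1, 1, 0, 1, 0)
d <- dice_sim(x, y)

# Jaccard similarity of the same vectors, for comparison.
j <- sum(x * y) / (sum(x) + sum(y) - sum(x * y))
all.equal(d, 2 * j / (1 + j))
```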
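Hamann similarity, where $e$ counts the elements on which the two vectors agree and $u$ counts those on which they differ:
$$ e = \sum{[\vec{x} = \vec{y}]} \\ n = ||\vec{x}|| = ||\vec{y}|| \\ u = n - e \\ simil = \frac{e - u}{e + u} $$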
Faith similarity:
$$ t = \sum{[\vec{x} = 1][\vec{y} = 1]} \\ f = \sum{[\vec{x} = 0][\vec{y} = 0]} \\ n = ||\vec{x}|| = ||\vec{y}|| \\ simil = \frac{t + 0.5 f}{n} $$
Simple matching coefficient:
$$ n = ||\vec{x}|| = ||\vec{y}|| \\ simil = \frac{\sum{[\vec{x} = \vec{y}]}}{n} $$
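The matching-based measures only need a few counts. The sketch below, on made-up Boolean vectors, computes the joint presences and joint absences used by the Faith formula, then the number of matching positions and its share of the vector length, which is how the simple matching coefficient is usually normalised. The variable names are illustrative.

```r
# Boolean example (values are made up).
x <- c(1, 0, 1, 1, 0, 0)
y <- c(1, 1, 0, 1, 0, 0)
n <- length(x)

both_one  <- sum(x == 1 & y == 1)   # joint presences
both_zero <- sum(x == 0 & y == 0)   # joint absences

# Faith similarity: joint absences count half as much as joint presences.
faith <- (both_one + 0.5 * both_zero) / n

# Matching positions, as a count and as a proportion of n.
matches  <- sum(x == y)
matching <- matches / n
```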
Distance measures are available in proxyC::dist(). Smoothing of the vectors can be performed when method is "chisquared", "kullback", "jeffreys" or "jensen": the value of smooth is added to each element of $\vec{x}$ and $\vec{y}$.
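To illustrate why smoothing matters, the sketch below applies the log-ratio formula given further below for "kullback" to two made-up count vectors that contain zeros. The helper `kl` and the value of `smooth` are illustrative assumptions; the code does not call proxyC.

```r
# Count vectors with zeros (values are made up).
x <- c(3, 0, 1)
y <- c(0, 2, 2)
smooth <- 1   # illustrative smoothing value

# Hypothetical helper: normalise to proportions, then sum q * log2(q / p).
kl <- function(x, y) {
  p <- x / sum(x)
  q <- y / sum(y)
  sum(q * log2(q / p))
}

raw <- kl(x, y)                          # not finite: zeros enter the log-ratio
smoothed <- kl(x + smooth, y + smooth)   # finite after adding smooth to every element

is.finite(c(raw, smoothed))
```

Adding a positive constant to every element removes the zeros before normalisation, so every ratio inside the logarithm is well defined.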
Manhattan distance:
$$ dist = \sum{|\vec{x} - \vec{y}|} $$
Canberra distance:
$$ dist = \sum{\frac{|\vec{x} - \vec{y}|}{|\vec{x}| + |\vec{y}|}} $$
Euclidean distance:
$$ dist = \sqrt{\sum{(\vec{x} - \vec{y}) ^ 2}} $$
Minkowski distance:
$$ p = \text{user-provided parameter} \\ dist = \Bigl( \sum{|\vec{x} - \vec{y}| ^ p} \Bigr) ^ \frac{1}{p} $$
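The sketch below computes the Manhattan, Canberra, Euclidean and Minkowski distances in base R using their standard definitions, and cross-checks them against `stats::dist()`. The vectors are made up and strictly positive, so that no Canberra term has a zero denominator. The comparison is a sanity check on the formulas only, not a statement about proxyC's internals.

```r
# Illustrative positive vectors (values are made up).
x <- c(1, 4, 2, 3)
y <- c(2, 1, 5, 3)

manhattan <- sum(abs(x - y))
canberra  <- sum(abs(x - y) / (abs(x) + abs(y)))
euclidean <- sqrt(sum((x - y)^2))

# Hypothetical helper for the Minkowski distance with parameter p.
minkowski <- function(x, y, p) sum(abs(x - y)^p)^(1 / p)

# stats::dist() treats the rows of a matrix as observations.
m <- rbind(x, y)
all.equal(manhattan, as.numeric(dist(m, method = "manhattan")))
all.equal(canberra, as.numeric(dist(m, method = "canberra")))
all.equal(euclidean, as.numeric(dist(m, method = "euclidean")))
all.equal(minkowski(x, y, 3), as.numeric(dist(m, method = "minkowski", p = 3)))
```

With `p = 1` the Minkowski distance equals the Manhattan distance, and with `p = 2` it equals the Euclidean distance.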
Hamming distance:
$$ dist = \sum{[\vec{x} \ne \vec{y}]} $$
Chebyshev (maximum) distance:
$$ dist = \max{|\vec{x} - \vec{y}|} $$
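A short sketch of the Hamming and Chebyshev distances on made-up vectors. The Chebyshev distance is conventionally the largest absolute difference, so the sketch takes `abs()` before `max()`; the signed maximum is shown alongside to make the difference visible.

```r
# Illustrative vectors (values are made up).
x <- c(1, 4, 2, 3)
y <- c(2, 1, 6, 3)

hamming    <- sum(x != y)         # number of positions where x and y differ
chebyshev  <- max(abs(x - y))     # largest absolute difference
signed_max <- max(x - y)          # ignores the larger negative difference at position 3

# Cross-check against the "maximum" method of stats::dist().
all.equal(chebyshev, as.numeric(dist(rbind(x, y), method = "maximum")))
```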
Chi-squared distance:
$$ O_{ij} = \text{augmented matrix from } \vec{x} \text{ and } \vec{y} \\ E_{ij} = \text{matrix of expected count for } O_{ij} \\ dist = \sum{\frac{(O_{ij} - E_{ij}) ^ 2}{ E_{ij}}} $$
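One way to read this formula, assuming that the augmented matrix stacks $\vec{x}$ and $\vec{y}$ as the two rows of a contingency table, is sketched below. Under that assumption the expected counts come from the row and column totals, and the result is the ordinary Pearson chi-squared statistic, which can be compared with `chisq.test()`. The data are made up.

```r
# Illustrative count vectors (values are made up).
x <- c(10, 5, 5)
y <- c(4, 8, 8)

# Assumption: the augmented matrix has x and y as its two rows.
O <- rbind(x, y)

# Expected counts under independence: (row total * column total) / grand total.
E <- outer(rowSums(O), colSums(O)) / sum(O)

chisq <- sum((O - E)^2 / E)

# Cross-check against the statistic reported by chisq.test().
all.equal(chisq, unname(chisq.test(O)$statistic))
```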
Kullback-Leibler divergence:
$$ \vec{p} = \frac{\vec{x}}{\sum{\vec{x}}} \\ \vec{q} = \frac{\vec{y}}{\sum{\vec{y}}} \\ dist = \sum{\vec{q} \log_2{\frac{\vec{q}}{\vec{p}}}} $$
Jeffreys divergence:
$$ \vec{p} = \frac{\vec{x}}{\sum{\vec{x}}} \\ \vec{q} = \frac{\vec{y}}{\sum{\vec{y}}} \\ dist = \sum{\vec{q} \log_2{\frac{\vec{q}}{\vec{p}}}} + \sum{\vec{p} \log_2{\frac{\vec{p}}{\vec{q}}}} $$
Jensen-Shannon divergence:
$$ \vec{p} = \frac{\vec{x}}{\sum{\vec{x}}} \\ \vec{q} = \frac{\vec{y}}{\sum{\vec{y}}} \\ \vec{m} = \frac{1}{2} (\vec{p} + \vec{q}) \\ dist = \frac{1}{2} \sum{\vec{q} \log_2{\frac{\vec{q}}{\vec{m}}}} + \frac{1}{2} \sum{\vec{p} \log_2{\frac{\vec{p}}{\vec{m}}}} $$
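The three information-theoretic measures share the same building block, which the sketch below factors into a hypothetical helper `kl` operating on proportions. The count vectors are made up and contain no zeros, so no smoothing is needed here.

```r
# Illustrative count vectors without zeros (values are made up).
x <- c(4, 1, 2)
y <- c(1, 3, 3)

# Normalise counts to proportions.
p <- x / sum(x)
q <- y / sum(y)

# Hypothetical helper: sum of a * log2(a / b) over all elements.
kl <- function(a, b) sum(a * log2(a / b))

kullback <- kl(q, p)                        # asymmetric
jeffreys <- kl(q, p) + kl(p, q)             # symmetric sum of both directions
m <- (p + q) / 2                            # midpoint distribution
jensen <- 0.5 * kl(q, m) + 0.5 * kl(p, m)   # symmetric, at most 1 with log base 2

c(kullback = kullback, jeffreys = jeffreys, jensen = jensen)
```

Swapping `x` and `y` changes `kullback` but leaves `jeffreys` and `jensen` unchanged, which is the practical difference between the three.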