row_dists | R Documentation |
In topic modeling, we generally deal with matrices whose rows
represent probability distributions. To find the distance between
such distributions, we normally do not use the Euclidean distance or
the other options supplied by dist. This is a
utility function for taking a matrix with K rows and producing
the matrix of distances between those rows, given an arbitrary
metric. It is not fast. On matrices with very many columns, R may
thrash endlessly; it is recommended that you drop most columns before
calculating distances with this function. Keeping only, say, the
thousand columns with the largest total weight usually does not affect
the results unduly.
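The column-dropping advice above can be sketched in base R as follows. This is only an illustration under stated assumptions: keep_top_columns is a hypothetical helper that is not part of the package, the toy matrix is random, and renormalizing the rows after truncation is an assumption made here so that each row is again a probability distribution.

```r
# Hypothetical helper (not part of the package): keep the n columns of x
# with the largest column sums, preserving their original order.
keep_top_columns <- function(x, n = 1000) {
    n <- min(n, ncol(x))
    top <- order(colSums(x), decreasing = TRUE)[seq_len(n)]
    x[, sort(top), drop = FALSE]
}

set.seed(1)
# Toy data: 4 rows (say, topics) over 5000 columns (say, words),
# with each row normalized to sum to 1.
x <- matrix(rexp(4 * 5000), nrow = 4)
x <- x / rowSums(x)

x_small <- keep_top_columns(x, 1000)
# Assumption: renormalize, since truncated rows no longer sum to 1.
x_small <- x_small / rowSums(x_small)
dim(x_small)
```

The reduced matrix x_small could then be passed to row_dists in place of x.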
row_dists(x, g = JS_divergence)
x | matrix |
g | metric (a function of two vectors); J-S divergence by default |
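To show the kind of function the g argument expects, here is a minimal sketch of a pairwise row computation with a user-supplied metric. Both row_dists_sketch and hellinger are hypothetical names introduced for illustration; this is not the package's implementation, only a plain double loop that applies g to every pair of rows.

```r
# Hypothetical stand-in for illustration: apply a metric g to every
# pair of rows of x and collect the results in a K x K matrix.
row_dists_sketch <- function(x, g) {
    K <- nrow(x)
    d <- matrix(0, K, K)
    for (i in seq_len(K)) {
        for (j in seq_len(K)) {
            d[i, j] <- g(x[i, ], x[j, ])
        }
    }
    d
}

# An example metric: Hellinger distance between two probability vectors.
hellinger <- function(p, q) sqrt(sum((sqrt(p) - sqrt(q))^2) / 2)

x <- rbind(c(0.5, 0.5, 0),
           c(0.5, 0.5, 0),
           c(0,   0,   1))
d <- row_dists_sketch(x, hellinger)
d
```

Given the usage shown above, a metric of this shape would presumably be passed to the real function as row_dists(x, g = hellinger).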
The flexmix package supplies a K-L divergence function, KLdiv,
but I have not found an implementation of the symmetrized Jensen-Shannon
divergence, so I have supplied one in JS_divergence.
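For reference, the Jensen-Shannon divergence of two distributions P and Q is commonly defined as the average of the K-L divergences of P and Q from their midpoint M = (P + Q) / 2. The sketch below follows that textbook definition with natural logarithms. The names kl_div and js_div are hypothetical stand-ins, and the package's JS_divergence may differ in normalization or in how it treats zero entries.

```r
# Hypothetical stand-ins for illustration; not the package's functions.
# K-L divergence of p from q, skipping terms where p is 0
# (those terms contribute 0 by convention).
kl_div <- function(p, q) {
    nz <- p > 0
    sum(p[nz] * log(p[nz] / q[nz]))
}

# Jensen-Shannon divergence: mean K-L divergence from the midpoint m.
js_div <- function(p, q) {
    m <- (p + q) / 2
    (kl_div(p, m) + kl_div(q, m)) / 2
}

p <- c(0.5, 0.5, 0)
q <- c(0, 0, 1)
js_div(p, q)  # distributions with disjoint support: log(2)
js_div(p, p)  # identical distributions: 0
```

Unlike the K-L divergence, this quantity is symmetric in its arguments and stays finite even when the two distributions have different supports.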
matrix of distances between rows
topic_divergences, JS_divergence, doc_topic_cor