row_dists: Measure matrix row distances

row_dists R Documentation

Measure matrix row distances

Description

In topic modeling, we generally deal with matrices whose rows represent probability distributions. Distances between such distributions are not normally measured with the Euclidean distance or the other options supplied by dist. This utility function takes a matrix with K rows and returns the K-by-K matrix of distances between those rows under an arbitrary metric. It is not fast. On matrices with very many columns, R may thrash endlessly, so it is recommended that you drop most columns before calculating distances with this function. Keeping only, say, the thousand columns with the largest total weight usually does not affect the results unduly.
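The column-dropping step recommended above could look roughly like the following minimal sketch. It assumes that "total weight" means the column sum. The helper name keep_top_columns and its argument n are illustrative and not part of the package.

```r
# Hypothetical helper (not part of dfrtopics): keep the n columns of x
# with the largest column sums, preserving their original order.
keep_top_columns <- function(x, n = 1000) {
    n <- min(n, ncol(x))
    keep <- order(colSums(x), decreasing = TRUE)[seq_len(n)]
    x[, sort(keep), drop = FALSE]
}

# Toy data: 3 rows (e.g. topics) over 6 columns (e.g. words)
set.seed(1)
x <- matrix(runif(18), nrow = 3)
x_small <- keep_top_columns(x, n = 4)
dim(x_small)
```

Rows of the reduced matrix no longer sum to one. Whether to renormalize them before measuring distances is a choice this page does not specify.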

Usage

row_dists(x, g = JS_divergence)

Arguments

x

matrix whose rows are to be compared

g

metric (function of two vectors). Defaults to JS_divergence, the Jensen-Shannon divergence

Details

The flexmix package supplies a K-L divergence function KLdiv, but I have not found an implementation of the symmetrized Jensen-Shannon divergence, so I have supplied one in JS_divergence.
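For reference, the Jensen-Shannon divergence is conventionally defined as the average K-L divergence of each distribution from their midpoint. The sketch below follows that standard definition with natural logarithms. The name js_sketch is illustrative, and the package's own JS_divergence may differ in details such as its handling of zero entries.

```r
# Illustrative sketch of the standard Jensen-Shannon divergence;
# not the package's JS_divergence.
js_sketch <- function(p, q) {
    m <- (p + q) / 2
    # K-L divergence of a from b, skipping entries where a is zero
    # (those terms contribute nothing to the sum)
    kl <- function(a, b) {
        nz <- a > 0
        sum(a[nz] * log(a[nz] / b[nz]))
    }
    (kl(p, m) + kl(q, m)) / 2
}

p <- c(0.5, 0.5, 0)
q <- c(0, 0.5, 0.5)
js_sketch(p, q)  # same value as js_sketch(q, p): the measure is symmetric
js_sketch(p, p)  # identical distributions have divergence zero
```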

Value

matrix of distances between rows
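To illustrate the shape of the result, here is a minimal base-R sketch of a pairwise row-distance computation. The function row_dists_sketch and the metric l1 are hypothetical stand-ins, not the package's code. The sketch evaluates every ordered pair so that it makes no assumption about the metric being symmetric.

```r
# Hypothetical stand-in for illustration: apply g to every pair of rows
row_dists_sketch <- function(x, g) {
    k <- nrow(x)
    d <- matrix(0, nrow = k, ncol = k,
                dimnames = list(rownames(x), rownames(x)))
    for (i in seq_len(k)) {
        for (j in seq_len(k)) {
            d[i, j] <- g(x[i, ], x[j, ])
        }
    }
    d
}

# Example metric (an assumption for this sketch): L1 distance
l1 <- function(p, q) sum(abs(p - q))

# Three toy distributions over three outcomes, one per row
x <- rbind(c(0.5, 0.5, 0),
           c(0, 0.5, 0.5),
           c(0.5, 0, 0.5))
d <- row_dists_sketch(x, l1)
d
```

With K rows in x, the result is a K-by-K matrix whose entry in row i, column j is the metric applied to rows i and j of x.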

See Also

topic_divergences, JS_divergence, doc_topic_cor


agoldst/dfrtopics documentation built on July 15, 2022, 4:13 p.m.