gJSD: Generalized Jensen-Shannon Divergence
In philentropy: Similarity and Distance Quantification Between Probability Functions

gJSD	R Documentation

Generalized Jensen-Shannon Divergence

Description

This function computes the Generalized Jensen-Shannon Divergence of a probability matrix.

Usage

gJSD(x, unit = "log2", weights = NULL, est.prob = NULL)

Arguments

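`x`	a probability matrix in which each row is one probability distribution (or a matrix of count vectors when `est.prob` is specified).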
`unit`	a character string specifying the logarithm unit used in the entropy computations. Default: `unit = "log2"`; the Examples also use `"log"` and `"log10"`.
`weights`	a numeric vector specifying the weight of each distribution in `x`. Default: `weights = NULL`, in which case all distributions are weighted equally (uniform weights). To specify non-uniform weights for, e.g., 3 distributions, pass `weights = c(0.5, 0.25, 0.25)`: the first distribution is weighted by `0.5`, and the second and third by `0.25` each.
`est.prob`	method used to estimate probabilities when the input consists of count vectors rather than probability vectors. Default: `est.prob = NULL`. Options are: `est.prob = "empirical"`: the relative frequencies of each vector are computed internally. For example, an input matrix `rbind(1:10, 11:20)` will be transformed to the probability matrix `rbind(1:10 / sum(1:10), 11:20 / sum(11:20))`.
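
The row-wise normalization described for `est.prob = "empirical"` can be reproduced by hand in base R. The following is only a minimal sketch of that transformation, not the package's internal code; the variable names `counts` and `probs` are chosen here for illustration.

```r
# Count matrix: each row is one vector of raw counts.
counts <- rbind(1:10, 11:20)

# Divide every row by its own total so that each row sums to 1.
# rowSums(counts) has one entry per row and is recycled down the columns,
# so element [i, j] is divided by the total of row i.
probs <- counts / rowSums(counts)

# Each row of `probs` is now a vector of relative frequencies.
rowSums(probs)
```

The result should agree with `rbind(1:10 / sum(1:10), 11:20 / sum(11:20))` from the argument description.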

Details

The Generalized Jensen-Shannon Divergence is defined as

JSD_{\pi_1,\ldots,\pi_n}(P_1, \ldots, P_n) = H\left(\sum_{i=1}^{n} \pi_i P_i\right) - \sum_{i=1}^{n} \pi_i H(P_i)

where \pi_1,\ldots,\pi_n denote the weights assigned to the probability vectors P_1,\ldots,P_n (non-negative and summing to 1) and H(P_i) denotes the Shannon entropy of probability vector P_i.
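
To make the formula concrete, it can be evaluated directly in base R. The helpers `shannon_entropy` and `gjsd_sketch` below are hypothetical names introduced for illustration; this is a sketch of the definition under the assumption that rows of the matrix are the distributions, and it is not the package implementation.

```r
# Shannon entropy of a probability vector (hypothetical helper).
# Zero entries are dropped, i.e. 0 * log(0) is treated as 0.
shannon_entropy <- function(p, log_fun = log2) {
  p <- p[p > 0]
  -sum(p * log_fun(p))
}

# Generalized JSD of the rows of `x` under weights `w` (hypothetical helper).
# By default all rows receive the same weight 1 / nrow(x).
gjsd_sketch <- function(x, w = rep(1 / nrow(x), nrow(x)), log_fun = log2) {
  stopifnot(is.matrix(x), length(w) == nrow(x), abs(sum(w) - 1) < 1e-12)
  # w is recycled down the columns, so row i of x is scaled by w[i];
  # the column sums then give the mixture sum_i pi_i * P_i.
  mixture <- colSums(w * x)
  # Entropy of the mixture minus the weighted sum of the row entropies.
  shannon_entropy(mixture, log_fun) -
    sum(w * apply(x, 1, shannon_entropy, log_fun = log_fun))
}

Prob <- rbind(1:10 / sum(1:10), 20:29 / sum(20:29), 30:39 / sum(30:39))
gjsd_sketch(Prob)                          # uniform weights
gjsd_sketch(Prob, w = c(0.5, 0.25, 0.25))  # non-uniform weights
```

Two sanity checks follow from the definition: identical rows give a divergence of 0, and two distributions with disjoint support and equal weights give exactly 1 when the base-2 logarithm is used.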

Value

A single numeric value: the Generalized Jensen-Shannon Divergence of the distributions in `x`, as defined in Details and expressed in the selected `unit`.

Author(s)

Hajk-Georg Drost

Examples

# define input probability matrix
Prob <- rbind(1:10/sum(1:10), 20:29/sum(20:29), 30:39/sum(30:39))

# compute the Generalized JSD of the probability matrix Prob
gJSD(Prob)

# Generalized Jensen-Shannon Divergence between three vectors using different log bases
gJSD(Prob, unit = "log2") # Default
gJSD(Prob, unit = "log")
gJSD(Prob, unit = "log10")

# Generalized Jensen-Shannon Divergence between count vectors P.count, Q.count, and R.count
P.count <- 1:10
Q.count <- 20:29
R.count <- 30:39
x.count <- rbind(P.count, Q.count, R.count)
gJSD(x.count, est.prob = "empirical")

philentropy documentation built on April 3, 2025, 10:33 p.m.