summ_entropy: Summarize distribution with entropy

View source: R/summ_entropy.R


Summarize distribution with entropy

Description

summ_entropy() computes the entropy of a single distribution, while summ_entropy2() computes it for a pair of distributions. For "discrete" pdqr-functions the classic formula -sum(p * log(p)) (in nats) is used. In the "continuous" case, differential entropy is computed.
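
As a quick check of the "discrete" formula, the sketch below (an illustration only, not part of the official documentation; it uses only new_d() and summ_entropy() shown on this page) applies -sum(p * log(p)) by hand and compares it with summ_entropy():

d_dis <- new_d(1:10, "discrete")

# For "discrete" pdqr-functions the d-function returns probabilities,
# so the classic entropy formula (in nats) can be computed directly
p <- d_dis(1:10)
-sum(p * log(p))    # log(10), since all ten probabilities are 0.1

# Should match the package summary (up to numerical error)
summ_entropy(d_dis)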

Usage

summ_entropy(f)

summ_entropy2(f, g, method = "relative", clip = exp(-20))

Arguments

f

A pdqr-function representing the distribution.

g

A pdqr-function of the same type as f.

method

Entropy method for a pair of distributions. One of "relative" (Kullback–Leibler divergence) or "cross" (cross-entropy); see the sketch after this argument list for how the two relate.

clip

Value to be used instead of 0 during log() computation. -log(clip) represents the maximum value of output entropy.
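
As a rough illustration of how the two method values relate (a sketch based on the standard identity KL(f || g) = cross-entropy(f, g) - entropy(f), not an excerpt from the package itself), the following uses the distributions from the Examples section; agreement is only approximate because of pdqr approximation error and clipping:

d_norm <- as_d(dnorm)
d_norm_2 <- as_d(dnorm, mean = 2, sd = 0.5)

# "relative" method: Kullback–Leibler divergence KL(f || g)
summ_entropy2(d_norm, d_norm_2, method = "relative")

# Standard identity: KL(f || g) = cross-entropy(f, g) - entropy(f)
summ_entropy2(d_norm, d_norm_2, method = "cross") - summ_entropy(d_norm)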

Details

Note that, due to pdqr approximation error, entropy estimation can have a rather big error when the original density goes to infinity.
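
To make this concrete, the sketch below (an illustration that assumes as_d() handles dbeta the same way it handles dnorm above; the exact constants are textbook closed forms) compares summ_entropy() with the known differential entropy of a bounded density (standard normal) and of a density that diverges at the edges of its support (Beta(0.5, 0.5)), where the approximation error may be noticeably larger:

# Exact differential entropy of the standard normal: 0.5 * log(2 * pi * e)
summ_entropy(as_d(dnorm)) - 0.5 * log(2 * pi * exp(1))

# Beta(0.5, 0.5) has a density going to infinity at 0 and 1.
# Its exact differential entropy is
#   lbeta(a, b) - (a-1)*digamma(a) - (b-1)*digamma(b) + (a+b-2)*digamma(a+b)
a <- 0.5
b <- 0.5
h_beta_exact <- lbeta(a, b) - (a - 1) * digamma(a) - (b - 1) * digamma(b) +
  (a + b - 2) * digamma(a + b)
summ_entropy(as_d(dbeta, shape1 = a, shape2 = b)) - h_beta_exact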

Value

A single number representing the entropy. If clip is strictly positive, the output is finite.

See Also

Other summary functions: summ_center(), summ_classmetric(), summ_distance(), summ_hdr(), summ_interval(), summ_moment(), summ_order(), summ_prob_true(), summ_pval(), summ_quantile(), summ_roc(), summ_separation(), summ_spread()

Examples

d_norm <- as_d(dnorm)
d_norm_2 <- as_d(dnorm, mean = 2, sd = 0.5)

summ_entropy(d_norm)
summ_entropy2(d_norm, d_norm_2)
summ_entropy2(d_norm, d_norm_2, method = "cross")

# Increasing `clip` leads to decreasing maximum output value
d_1 <- new_d(1:10, "discrete")
d_2 <- new_d(20:21, "discrete")

## Formally, output isn't clearly defined because functions don't have the
## same support. Direct use of entropy formulas gives infinity output, but
## here maximum value is `-log(clip)`.
summ_entropy2(d_1, d_2, method = "cross")
summ_entropy2(d_1, d_2, method = "cross", clip = exp(-10))
summ_entropy2(d_1, d_2, method = "cross", clip = 0)
