estimateHellingerDiv: Hellinger divergence of methylation levels

estimateHellingerDiv    R Documentation

Hellinger divergence of methylation levels

Description

Given the methylation levels of two individuals, the function computes the information divergence between them.

Usage

estimateHellingerDiv(p, n = NULL)

Arguments

p

A numeric vector of the methylation levels p = c(p1, p2) of individuals 1 and 2.

n

If supplied, a vector of integers giving the coverages (n_1 and n_2 in Details) used to estimate the methylation levels.

Details

The methylation level p_ij of individual i at cytosine site j corresponds to a probability vector p^ij = (p_ij, 1 - p_ij). The information divergence between the methylation levels of individuals 1 and 2 at site j is therefore the divergence between the vectors p^1j = (p_1j, 1 - p_1j) and p^2j = (p_2j, 1 - p_2j). If the vector of coverages n is supplied, the information divergence is estimated according to the formula:

hdiv = 2*(n_1 + 1)*(n_2 + 1)*((sqrt(p_1j) - sqrt(p_2j))^2 + (sqrt(1 - p_1j) - sqrt(1 - p_2j))^2)/(n_1 + n_2 + 2)

This formula corresponds to the Hellinger divergence given in the first formula of Theorem 1 in reference 1. If the coverages are not supplied, the unweighted form is used:

hdiv = (sqrt(p_1j) - sqrt(p_2j))^2 + (sqrt(1 - p_1j) - sqrt(1 - p_2j))^2

Missing methylation levels, reported as NA or NaN, are replaced with zero.
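
The two formulas and the handling of missing values can be transcribed directly into a few lines of R. The sketch below is only an illustration of the arithmetic described in this section: the function name hellinger_div_sketch is hypothetical, it is not the MethylIT implementation, and it assumes p and n are both vectors of length two.

```r
# Hypothetical helper (not part of MethylIT): a direct transcription of the
# formulas in Details, assuming p = c(p1, p2) and, optionally, n = c(n1, n2).
hellinger_div_sketch <- function(p, n = NULL) {
  # Missing methylation levels (NA or NaN) are replaced with zero
  p[is.na(p)] <- 0

  # Unweighted divergence between (p1, 1 - p1) and (p2, 1 - p2)
  hd <- (sqrt(p[1]) - sqrt(p[2]))^2 +
    (sqrt(1 - p[1]) - sqrt(1 - p[2]))^2

  # Without coverages, return the unweighted form
  if (is.null(n)) return(hd)

  # With coverages, apply the weight 2 * (n1 + 1) * (n2 + 1) / (n1 + n2 + 2)
  2 * (n[1] + 1) * (n[2] + 1) * hd / (n[1] + n[2] + 2)
}

hellinger_div_sketch(c(0.5, 0.5))           # identical levels give 0
hellinger_div_sketch(c(0, 1))               # fully opposite levels give 2
hellinger_div_sketch(c(0, 1), n = c(9, 9))  # 2 * 10 * 10 * 2 / 20 = 20
```

With equal coverages n_1 = n_2 = m the weight reduces to m + 1, so in this form the same pair of methylation levels yields a larger divergence when the levels were estimated from more reads.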

Value

The Hellinger divergence value for the given methylation levels.

References

1. Basu A., Mandal A., Pardo L. (2010) Hypothesis testing for two discrete populations based on the Hellinger distance. Stat Probab Lett 80: 206-214.

Examples

    ## Identical methylation levels in both individuals
    p <- c(0.5, 0.5)
    estimateHellingerDiv(p)


genomaths/MethylIT documentation built on Feb. 3, 2024, 1:24 a.m.