emd_matrix: Generate Earth Mover's Distance Matrix

emd_matrix R Documentation

Generate Earth Mover's Distance Matrix

Description

Generates a matrix of pairwise Earth Mover's Distances (EMD) between the time series data distributions stored in a preprocessed time series data list.

Usage

emd_matrix(plist, parameter, maxIter, normalize)

Arguments

plist

List storing patient time series data (also see function: patient_list)

parameter

Name of the parameter of interest (for example "FEV1") whose distributions are compared by Earth Mover's Distance

maxIter

Maximum number of iterations used to calculate the Earth Mover's Distance (default: 500)

normalize

Indicates whether the supplied parameter needs to be normalized (default: TRUE)

Details

The function computes the EMD for each pair (i, j) of patient IDs, using the (by default normalized) distributions. EMD is a distance measure between two probability distributions over a region D. Informally, if the distributions are viewed as two different ways of piling up a certain quantity of earth over the region D, the EMD is the smallest cost of turning one pile into the other, where the cost is the quantity of material moved multiplied by the distance it is moved. A unit of work in this context corresponds to transporting a unit of earth across a unit of ground distance.

A distribution may be described as a collection of clusters, each of which is defined by its mean or mode and the proportion of the distribution that belongs to it. This representation is referred to as the distribution's signature. The two signatures being compared may differ in size; simple distributions, for example, have shorter signatures than complex ones. See also emd for further details.
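The idea of pairwise distances between distributions can be sketched in base R. The example below is only an illustration under stated assumptions, not the implementation used by emd_matrix: it treats each patient's data as a one-dimensional sample and uses the fact that, in one dimension, the EMD between two distributions equals the area between their cumulative distribution functions. The helper emd_1d and the toy patient data are hypothetical and exist only for this sketch.

```r
# Hypothetical helper (not part of the package): 1-D EMD between two numeric
# samples, computed as the area between their empirical CDFs.
emd_1d <- function(x, y) {
  grid <- sort(unique(c(x, y)))
  if (length(grid) < 2) return(0)
  Fx <- ecdf(x)(grid)
  Fy <- ecdf(y)(grid)
  # Empirical CDFs are step functions, constant between neighbouring grid
  # points, so the area is a sum of rectangle areas.
  sum(abs(Fx - Fy)[-length(grid)] * diff(grid))
}

# Toy data, invented for illustration: one numeric vector per patient ID
patients <- list(
  A = c(1, 2, 3),
  B = c(2, 3, 4),
  C = c(1, 2, 3)
)

# Fill a square matrix with the distance for every pair (i, j)
ids <- names(patients)
emd_mat <- matrix(0, length(ids), length(ids), dimnames = list(ids, ids))
for (i in seq_along(ids)) {
  for (j in seq_along(ids)) {
    emd_mat[i, j] <- emd_1d(patients[[i]], patients[[j]])
  }
}
emd_mat
```

In this toy case, B is A shifted by one unit, so every unit of mass has to travel a distance of 1 and the entry for the pair (A, B) is 1, while A and C are identical and their distance is 0.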

Value

Square matrix (of type matrix) containing the pairwise Earth Mover's Distances
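To make the shape of such a return value concrete, the following sketch builds a small toy matrix by hand and checks the properties one would expect of a pairwise distance matrix. The values and patient IDs are invented and are not output of emd_matrix. Converting the matrix with base R's as.dist is shown as one possible way to hand it to distance-based routines; whether that suits a given analysis is an assumption here.

```r
# Toy stand-in for a result of emd_matrix(); values and IDs are invented
ids <- c("ID_1", "ID_2", "ID_3")
emd_toy <- matrix(c(0.0, 1.2, 0.4,
                    1.2, 0.0, 0.9,
                    0.4, 0.9, 0.0),
                  nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

is.matrix(emd_toy)               # is it a base R matrix?
nrow(emd_toy) == ncol(emd_toy)   # square: one row and one column per patient
isSymmetric(emd_toy)             # distance from i to j equals distance from j to i
all(diag(emd_toy) == 0)          # each distribution is at distance 0 from itself

# Compact "dist" representation that stores only the lower triangle
emd_dist <- as.dist(emd_toy)
```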

References

Yossi Rubner, Carlo Tomasi, and Leonidas J. Guibas. A metric for distributions with applications to image databases. In Sixth International Conference on Computer Vision (IEEE Cat. No. 98CH36271), pages 59–66. IEEE, 1998.

Examples

# Avoid the names "list" and "matrix", which would mask base R functions
plist <- patient_list(
  "https://raw.githubusercontent.com/MrMaximumMax/FBCanalysis/master/demo/phys/data.csv",
  GitHub = TRUE)
# Sampling frequency is supposed to be daily
emd_mat <- emd_matrix(plist, "FEV1")


MrMaximumMax/FBCanalysis documentation built on June 23, 2022, 8:21 p.m.