jaccard_run_emd_2: Simulate random data removal range via alternative Earth...

View source: R/jaccard.R

jaccard_run_emd_2    R Documentation

Simulate random data removal range via alternative Earth Mover's Distance Cognate Cluster assignment approach

Description

Alternative approach: simulates several amounts of random data removal from a time series data list and, for each removal step, determines the Jaccard index of a specific cluster of interest via the Earth Mover's Distance approach.
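For intuition about the distance this function relies on: for two one-dimensional samples with equal weights, the Earth Mover's Distance equals the area between their empirical cumulative distribution functions. The sketch below uses that fact. It is not the package's implementation (the maxIter argument suggests an iterative solver is used there), and emd_1d is a hypothetical helper introduced only for illustration.

```r
# Hypothetical helper: Earth Mover's Distance between two 1-D samples,
# computed as the area between their empirical CDFs.
emd_1d <- function(x, y) {
  grid <- sort(unique(c(x, y)))        # all points where either CDF can jump
  Fx <- ecdf(x)(grid)                  # empirical CDF of x on the grid
  Fy <- ecdf(y)(grid)                  # empirical CDF of y on the grid
  # Both CDFs are constant between grid points, so the area is a sum of
  # rectangles: |Fx - Fy| at the left end times the interval width.
  sum(abs(Fx - Fy)[-length(grid)] * diff(grid))
}

# Shifting a sample by one unit moves every point a distance of 1
d_shift <- emd_1d(c(0, 1), c(1, 2))
# Identical samples need no mass to be moved
d_same <- emd_1d(c(3, 4, 5), c(3, 4, 5))
```

A larger value means the two measurement distributions differ more, which is what a distance matrix between patients would capture before clustering.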

Usage

jaccard_run_emd_2(
  plist,
  parameter,
  n_simu,
  method,
  clust_num,
  n_clust,
  range,
  maxIter,
  normalize
)

Arguments

plist

Object of type list storing patient time series data (also see function: patient_list)

parameter

Parameter of interest in time series data list

n_simu

Number of simulations

method

Clustering method (also see function: clust_matrix)

clust_num

Cluster of interest

n_clust

Number of clusters

range

Vector of random data removal amounts to simulate (e.g. c(0.1, 0.2, 0.5, 0.7, 0.8))

maxIter

Maximum number of iterations used to determine the Earth Mover's Distances (also see function: emd_matrix); the default for this function is 5,000

normalize

Indicates whether the selected parameter should be normalized (TRUE by default)

Details

See sim_jaccard_emd_2 for a more detailed description of how the Jaccard index is determined. The difference in this function is that only one cluster is observed, for multiple amounts of random data removal; for each defined data removal step, the resulting Jaccard indices are stored in a list object. Furthermore, a boxplot visualization is generated, in the style of recent publications.
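The procedure described here can be sketched with a toy example in base R. This is an illustration under stated assumptions, not the package's code: the clustering uses distances between per-patient means rather than Earth Mover's Distances, the cognate cluster is assumed to be the new cluster with the largest overlap with the original one, and the data, jaccard_index, and cluster_patients are hypothetical.

```r
# Jaccard index between two sets of patient IDs
jaccard_index <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Toy stand-in for the clustering step: hierarchical clustering on
# per-patient means (assumption; the package uses Earth Mover's Distances)
cluster_patients <- function(series, n_clust) {
  means <- vapply(series, mean, numeric(1))
  cutree(hclust(dist(means), method = "complete"), k = n_clust)
}

set.seed(1)
# Hypothetical data: 12 patients with 100 measurements each, in three groups
centers <- rep(c(200, 350, 500), each = 4)
series <- lapply(centers, function(m) rnorm(100, mean = m, sd = 30))
names(series) <- paste0("P", seq_along(series))

n_clust <- 3
clust_num <- 1
removal_steps <- c(0.1, 0.5, 0.8)
n_simu <- 10

# Members of the cluster of interest in the complete data
original <- cluster_patients(series, n_clust)
reference <- names(original)[original == clust_num]

results <- lapply(removal_steps, function(frac) {
  replicate(n_simu, {
    # Randomly drop a fraction of each patient's measurements
    reduced <- lapply(series, function(x) {
      x[sample(length(x), size = ceiling((1 - frac) * length(x)))]
    })
    assignment <- cluster_patients(reduced, n_clust)
    # Assumed cognate cluster: the new cluster overlapping most with the original
    max(vapply(seq_len(n_clust), function(k) {
      jaccard_index(reference, names(assignment)[assignment == k])
    }, numeric(1)))
  })
})
names(results) <- as.character(removal_steps)

# One box per removal step
boxplot(results, xlab = "Fraction of data removed", ylab = "Jaccard index")
```

Each element of results holds n_simu Jaccard indices for one removal step. A value of 1 means the cluster of interest was recovered with exactly the same patients.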

Value

Object of type list storing the Jaccard indices for each specified random data removal step; the results are also visualized in a boxplot

References

Yossi Rubner, Carlo Tomasi, and Leonidas J Guibas. A metric for distributions with applications to image databases. In Sixth International Conference on Computer Vision (IEEE Cat. No. 98CH36271), pages 59–66. IEEE, 1998.

Examples

# Use a name other than `list` to avoid masking base::list
plist <- patient_list(
  "https://raw.githubusercontent.com/MrMaximumMax/FBCanalysis/master/demo/phys/data.csv",
  GitHub = TRUE
)
# The sampling frequency is assumed to be daily
output <- jaccard_run_emd_2(
  plist, "PEF", 10, "hierarchical", 1, 3,
  c(0.005, 0.01, 0.05, 0.1, 0.2)
)



MrMaximumMax/FBCanalysis documentation built on June 23, 2022, 8:21 p.m.