sim_jaccard_global: Simulate random data removal via Global Cognate Cluster...

View source: R/jaccard.R

sim_jaccard_global R Documentation

Simulate random data removal using the Global Cognate Cluster approach to cluster assignment

Description

Simulates random data removal from a time series data list for a given removal amount and number of simulations, and determines the Jaccard index for every cluster using the Global Cognate Cluster approach to cluster assignment.

Usage

sim_jaccard_global(
  plist,
  parameter,
  removal,
  n_simu,
  method,
  n_clust,
  maxIter,
  normalize
)

Arguments

plist

Object of type list storing patient time series data (also see function: patient_list)

parameter

Parameter of interest in time series data list

removal

Amount of data to remove at random before the Jaccard index is determined (the example below passes 0.05)

n_simu

Number of simulations

method

Clustering method (also see function: clust_matrix)

n_clust

Number of clusters (also see function: clust_matrix)

maxIter

Maximum number of iterations used to determine the Earth Mover's Distances (also see function: emd_matrix); default is 5,000 for this function

normalize

Indicates whether the parameter of interest should be normalized (TRUE by default)

Details

The cognate cluster approach works as follows. First, a Gold Standard clustering is determined, meaning the cluster assignments obtained without any data removal. Next, data is removed at random from the original, complete data, and clustering is performed again on the leaky data. For each Gold Standard cluster, the cognate cluster is the cluster from the leaky data with the highest overlap in cluster members, meaning the cluster with the highest achieved Jaccard index. Finally, the Jaccard indices are calculated for each cluster by comparing its members in the complete data with its members in the leaky data.
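The matching step described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper names jaccard and cognate_jaccard, the patient IDs and the two assignment vectors are all hypothetical, and the clustering itself is skipped by writing the assignments down by hand. The Jaccard index of two member sets A and B is taken as |A ∩ B| / |A ∪ B|.

```r
# Jaccard index of two sets of cluster members: |A n B| / |A u B|
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical cluster assignments (patient ID -> cluster label).
# 'gold' stands for the clustering on complete data, 'leaky' for the
# clustering after random data removal. Cluster labels need not agree
# between the two runs, which is why a cognate cluster must be found.
gold  <- c(p1 = 1, p2 = 1, p3 = 1, p4 = 2, p5 = 2, p6 = 2)
leaky <- c(p1 = 2, p2 = 2, p3 = 1, p4 = 1, p5 = 1, p6 = 1)

# For every Gold Standard cluster, compare its members with each leaky
# cluster and keep the highest Jaccard index (that of the cognate cluster).
cognate_jaccard <- function(gold, leaky) {
  gold_ids  <- sort(unique(gold))
  leaky_ids <- sort(unique(leaky))
  sapply(gold_ids, function(g) {
    members_gold <- names(gold)[gold == g]
    scores <- sapply(leaky_ids, function(l) {
      jaccard(members_gold, names(leaky)[leaky == l])
    })
    max(scores)
  })
}

result <- cognate_jaccard(gold, leaky)
print(result)
```

In this toy case the leaky run swaps the cluster labels and moves p3 into the other cluster, so a comparison by label alone would understate the agreement; matching by highest member overlap avoids that.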

Value

Object of type matrix storing the Jaccard indices obtained for all clusters at the indicated amount of random data removal

References

Anja Jochmann, Luca Artusio, Hoda Sharifian, Angela Jamalzadeh, Louise J Fleming, Andrew Bush, Urs Frey, and Edgar Delgado-Eckert. Fluctuation-based clustering reveals phenotypes of patients with different asthma severity. ERJ Open Research, 6(2), 2020.

Edgar Delgado-Eckert, Oliver Fuchs, Nitin Kumar, Juha Pekkanen, Jean-Charles Dalphin, Josef Riedler, Roger Lauener, Michael Kabesch, Maciej Kupczyk, Sven-Erik Dahlen, et al. Functional phenotypes determined by fluctuation-based clustering of lung function measurements in healthy and asthmatic cohort participants. Thorax, 73(2):107–115, 2018.

Examples

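# Load the demo data; the result is named plist so that it does not
# mask the base R function list()
plist <- patient_list(
  "https://raw.githubusercontent.com/MrMaximumMax/FBCanalysis/master/demo/phys/data.csv",
  GitHub = TRUE)
# Sampling frequency is supposed to be daily
# PEF, removal amount 0.05, 10 simulations, hierarchical clustering,
# 2 clusters, maxIter = 1000
output <- sim_jaccard_global(plist, "PEF", 0.05, 10, "hierarchical", 2, 1000)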


MrMaximumMax/FBCanalysis documentation built on June 23, 2022, 8:21 p.m.