sim_jaccard_cognate    R Documentation

Simulate random data removal and determine the Jaccard index for all clusters via the cognate cluster assignment approach

Description

Simulates random removal of a given amount of data from a time series data list, repeated for the indicated number of simulations, and determines the Jaccard index for every cluster using the cognate cluster assignment approach.

Usage

sim_jaccard_cognate(plist, parameter, removal, n_simu, method, n_clust, Iter)

Arguments

plist

Object of type list storing patient time series data (also see function: patient_list)

parameter

Parameter of interest in time series data list

removal

Amount of data to remove at random before the Jaccard index is determined (the example in the Examples section passes 0.05)

n_simu

Number of simulations

method

Clustering method (also see function: clust_matrix)

n_clust

Number of clusters (also see function: clust_matrix)

Iter

Maximum iterations to determine Earth Mover's Distances (also see function: emd_matrix); default is 5,000 for this function
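
As an illustration of what the removal argument controls, the following is a minimal sketch of random data removal from a single series. It assumes that removal is a fraction of the observations and that removed values are replaced by NA; remove_random and the pef vector are hypothetical and are not part of FBCanalysis, whose internal removal procedure may differ.

```r
# Hypothetical helper: blank out a random fraction of a numeric series.
# Assumption: `removal` is a fraction in [0, 1] and removed values become NA.
remove_random <- function(x, removal) {
  stopifnot(removal >= 0, removal <= 1)
  n_remove <- round(removal * length(x))
  if (n_remove > 0) {
    x[sample.int(length(x), n_remove)] <- NA
  }
  x
}

set.seed(1)
# Made-up peak flow readings for one patient (20 days)
pef <- c(410, 395, 420, 405, 400, 415, 390, 425, 410, 400,
         405, 395, 420, 415, 410, 400, 390, 405, 415, 420)
leaky <- remove_random(pef, 0.05)
n_missing <- sum(is.na(leaky))
```

With 20 observations and a removal of 0.05, exactly one value is removed in this sketch.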

Details

The cognate cluster approach works as follows. First, a Gold Standard clustering is determined, i.e. the cluster assignments obtained without any data removal. Subsequently, data is removed at random from the original, complete data and clustering is performed again on the leaky data. For each Gold Standard cluster, the cognate cluster is the cluster from the leaky data with the highest overlap in cluster members, i.e. the cluster with the highest achieved Jaccard index. Finally, the Jaccard indices are calculated for each cluster by comparing its members under complete and leaky data.
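
The matching step described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's implementation: jaccard and cognate_jaccard are hypothetical helpers, and the two assignment vectors are made-up toy data in which the cluster labels are permuted after re-clustering.

```r
# Jaccard index of two member sets: size of intersection / size of union
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# For every Gold Standard cluster, find the cognate cluster in the leaky
# assignment (the one with the highest Jaccard index) and return that index.
# `gold` and `leaky` are cluster labels, one entry per patient, same order.
cognate_jaccard <- function(gold, leaky) {
  stopifnot(length(gold) == length(leaky))
  ids <- seq_along(gold)
  gold_clusters <- split(ids, gold)
  leaky_clusters <- split(ids, leaky)
  vapply(gold_clusters, function(g) {
    max(vapply(leaky_clusters, function(l) jaccard(g, l), numeric(1)))
  }, numeric(1))
}

# Toy assignments for six patients (hypothetical data)
gold  <- c(1, 1, 1, 2, 2, 2)
leaky <- c(2, 2, 1, 1, 1, 1)
res <- cognate_jaccard(gold, leaky)
```

Here Gold Standard cluster 1 (patients 1 to 3) is matched to leaky cluster 2 with a Jaccard index of 2/3, and Gold Standard cluster 2 (patients 4 to 6) is matched to leaky cluster 1 with an index of 3/4. Matching by member overlap rather than by label makes the comparison insensitive to label permutations between runs.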

Value

Object of type matrix storing the Jaccard indices obtained for all clusters at the indicated amount of random data removal

References

Anja Jochmann, Luca Artusio, Hoda Sharifian, Angela Jamalzadeh, Louise J Fleming, Andrew Bush, Urs Frey, and Edgar Delgado-Eckert. Fluctuation-based clustering reveals phenotypes of patients with different asthma severity. ERJ Open Research, 6(2), 2020.

Examples

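# Use a name other than `list` to avoid masking base::list()
plist <- patient_list(
  "https://raw.githubusercontent.com/MrMaximumMax/FBCanalysis/master/demo/phys/data.csv",
  GitHub = TRUE)
# Sampling frequency is supposed to be daily
# Remove 5% of the "PEF" data in 10 simulations; hierarchical clustering, 2 clusters
output <- sim_jaccard_cognate(plist, "PEF", 0.05, 10, "hierarchical", 2, 1000)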


MrMaximumMax/FBCanalysis documentation built on June 23, 2022, 8:21 p.m.