reap_freq: Determine reappearance frequency

View source: R/jaccard.R

reap_freq    R Documentation

Description

Determine the reappearance frequency of clustered elements in perturbed data for a specific amount of random data removal.

Usage

reap_freq(plist, parameter, removal, n_simu, method, n_clust, Iter, normalize)

Arguments

plist

Object of type list storing patient time series data (also see function: patient_list)

parameter

Parameter of interest in time series data list

removal

Amount of data randomly removed from each patient's measurements before the clusters are recomputed and compared by Jaccard index

n_simu

Number of simulations

method

Clustering method (also see function: clust_matrix)

n_clust

Number of clusters (also see function: clust_matrix)

Iter

Maximum number of iterations used to compute the Earth Mover's Distances (also see function: emd_matrix); the default for this function is 5,000

normalize

Indicates whether the chosen parameter should be normalized (TRUE by default)

Details

First, a given proportion of measurements is randomly removed from each participant's set of measured values. The clustering method is then applied to the perturbed data, and the resulting clusters are compared with the original clusters obtained from the unperturbed data, which serve as the gold standard. This procedure is repeated (n_simu times) to obtain statistics on how stable the original clusters are under random data removal.

Stability is quantified with two cluster similarity measures. The first is the Jaccard index, a global measure of how strongly an original cluster and a perturbed cluster overlap; it is also used to decide which perturbed cluster is the cognate of each original cluster. The second is the reappearance frequency, a local measure derived from that matching: the frequency with which the members of the original clusters reappear in their cognate clusters across iterations.
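
The procedure described above can be illustrated with a small base-R sketch. It is not the package's implementation: the toy data, the helper names (cluster_patients, deplete, jaccard, reappearance), and the use of k-means on per-patient means are all assumptions made for illustration, whereas the package clusters on Earth Mover's Distances (see emd_matrix and clust_matrix). The exact way reap_freq aggregates reappearance per run may also differ; here it is taken to be the fraction of patients found again in the cognate of their original cluster.

```r
# Toy sketch of cluster stability under random data removal.
# ASSUMPTION: k-means on per-patient means stands in for the package's
# EMD-based clustering; all helper names below are hypothetical.

set.seed(1)

# Hypothetical data: 12 patients, 30 measurements each, two separated groups
patients <- lapply(1:12, function(i) rnorm(30, mean = if (i <= 6) 0 else 5))
names(patients) <- paste0("P", 1:12)

# Cluster patients on a simple summary feature (mean of their measurements)
cluster_patients <- function(plist, k) {
  features <- sapply(plist, mean)
  fit <- kmeans(matrix(features, ncol = 1), centers = k, nstart = 10)
  setNames(fit$cluster, names(plist))
}

# Randomly drop a proportion `removal` of each patient's measurements
deplete <- function(plist, removal) {
  lapply(plist, function(x) {
    keep <- max(1, round(length(x) * (1 - removal)))
    x[sample(length(x), keep)]
  })
}

# Jaccard index of two sets of patient IDs: |A and B| / |A or B|
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# One reappearance value per simulation run
reappearance <- function(plist, removal, n_simu, k) {
  original  <- cluster_patients(plist, k)
  orig_sets <- split(names(original), original)
  sapply(seq_len(n_simu), function(s) {
    perturbed <- cluster_patients(deplete(plist, removal), k)
    pert_sets <- split(names(perturbed), perturbed)
    # For each original cluster, the cognate is the perturbed cluster with
    # the highest Jaccard index; count members that reappear in it
    reappeared <- sum(sapply(orig_sets, function(o) {
      j <- sapply(pert_sets, jaccard, a = o)
      length(intersect(o, pert_sets[[which.max(j)]]))
    }))
    reappeared / length(plist)
  })
}

freq <- reappearance(patients, removal = 0.2, n_simu = 20, k = 2)
summary(freq)
```

In this sketch, freq has one entry per simulation run, mirroring the shape of the value returned by reap_freq. Values near 1 indicate that the original clusters are largely recovered after data removal.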

Value

Vector of length n_simu storing the reappearance frequency for each simulation run

References

Edgar Delgado-Eckert, Oliver Fuchs, Nitin Kumar, Juha Pekkanen, Jean-Charles Dalphin, Josef Riedler, Roger Lauener, Michael Kabesch, Maciej Kupczyk, Sven-Erik Dahlen, et al. Functional phenotypes determined by fluctuation-based clustering of lung function measurements in healthy and asthmatic cohort participants. Thorax, 73(2):107–115, 2018.


MrMaximumMax/FBCanalysis documentation built on June 23, 2022, 8:21 p.m.