probKMA_wrap: Wrapper for the Probabilistic K-means Algorithm (ProbKMA)


probKMA_wrap    R Documentation

Wrapper for the Probabilistic K-means Algorithm (ProbKMA)

Description

This function serves as a wrapper for the Probabilistic K-means Algorithm (ProbKMA) to cluster functional data. It handles preprocessing, parameter setup, and execution of the core algorithm, returning the results along with silhouette analysis to assess the clustering quality.

Usage

probKMA_wrap(
  Y0 = NULL,
  Y1 = NULL,
  P0 = matrix(),
  S0 = matrix(),
  standardize = FALSE,
  c_max = Inf,
  iter_max = 1000,
  iter4elong = 10,
  trials_elong = 10,
  return_options = TRUE,
  alpha = 0,
  max_gap = 0.2,
  quantile = 0.25,
  stopCriterion = "max",
  tol = 1e-08,
  tol4elong = 0.001,
  max_elong = 0.5,
  deltaJK_elong = 0.05,
  iter4clean = 50,
  tol4clean = 1e-04,
  m = 2,
  w = 1,
  seed = 1,
  K = 2,
  c = 40,
  quantile4clean = 1/K,
  exe_print = FALSE,
  set_seed = FALSE,
  diss = "d0_2",
  transformed = FALSE,
  V_init = NULL,
  align = TRUE,
  n_threads = 1
)

Arguments

Y0

A matrix of functional data for the first set of observations.

Y1

A matrix of functional data for the second set of observations.

P0

A matrix representing the initial membership probabilities.

S0

A matrix representing the initial shift parameters.

standardize

A logical value indicating whether to standardize the data. Default is 'FALSE'.

c_max

Maximum motif length. Default is 'Inf'.

iter_max

Maximum number of iterations for the algorithm. Default is 1000.

iter4elong

Number of iterations for elongation. Default is 10.

trials_elong

Number of trials for elongation. Default is 10.

return_options

A logical value indicating whether to return additional options. Default is 'TRUE'.

alpha

A numeric weight that balances the curve-based and derivative-based terms of the dissimilarity when 'diss' combines both. Default is 0.

max_gap

Maximum gap allowed in each alignment, expressed as a fraction of the motif length. Default is 0.2.

quantile

Quantile probability, in [0,1], used when 'stopCriterion' is 'quantile'. Default is 0.25.

stopCriterion

Stopping criterion for the algorithm, based on the distance between memberships in successive iterations. Can be 'max', 'mean', or 'quantile'. Default is 'max'.

tol

Tolerance for convergence. Default is 1e-8.

tol4elong

Tolerance for elongation iterations. Default is 1e-3.

max_elong

Maximum elongation allowed. Default is 0.5.

deltaJK_elong

Increment for the elongation. Default is 0.05.

iter4clean

Number of iterations for the cleaning process. Default is 50.

tol4clean

Tolerance for the cleaning process. Default is 1e-4.

m

Weighting exponent of the membership probabilities in the objective function, which controls how fuzzy the clustering is. Must be greater than 1. Default is 2.

w

Weighting parameter for the dissimilarity measure. Default is 1.

seed

Random seed for reproducibility. Default is 1.

K

Number of motifs to extract. Default is 2.

c

Minimum motif length. Default is 40.

quantile4clean

Quantile used for the cleaning process. Default is 1/K.
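
The default 'quantile4clean = 1/K' relies on R's lazy evaluation of default arguments: the expression is evaluated inside the function, so it follows whatever 'K' the caller supplies. The sketch below illustrates this with a hypothetical stand-in function, show_defaults, which is not part of funMoDisco and only mimics these two arguments.

```r
# Hypothetical stand-in that mimics only two formals of probKMA_wrap;
# it is NOT part of funMoDisco and does no clustering.
show_defaults <- function(K = 2, quantile4clean = 1/K) {
  # The default 1/K is evaluated here, using the K passed by the caller.
  c(K = K, quantile4clean = quantile4clean)
}

res_default <- show_defaults()                             # uses K = 2
res_k4      <- show_defaults(K = 4)                        # default follows K
res_fixed   <- show_defaults(K = 4, quantile4clean = 0.1)  # explicit value wins
```

In practice this means that changing 'K' also changes the cleaning quantile unless 'quantile4clean' is set explicitly.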

exe_print

A logical value indicating whether to print execution details. Default is 'FALSE'.

set_seed

A logical value indicating whether to set the random seed. Default is 'FALSE'.

diss

Dissimilarity measure to be used. Default is 'd0_2'.

transformed

A logical value indicating whether to normalize each curve segment to the interval [0,1] before applying the dissimilarity measure. With 'transformed = TRUE', occurrences that share a shape but differ in amplitude can be matched to the same motif. Default is 'FALSE'.
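
The effect of rescaling to [0,1] can be sketched in base R. This is a minimal illustration assuming plain min-max scaling of each segment; the helper rescale01 is hypothetical and the package's internal implementation may differ.

```r
# Min-max scaling of one curve segment to [0, 1].
# Assumption: generic illustration, not funMoDisco's internal code.
rescale01 <- function(v) {
  rng <- range(v)
  if (rng[2] == rng[1]) return(rep(0, length(v)))  # flat segment: avoid 0/0
  (v - rng[1]) / (rng[2] - rng[1])
}

x <- seq(0, 1, length.out = 50)
seg_small <- sin(2 * pi * x)           # amplitude 1
seg_large <- 5 * sin(2 * pi * x) + 3   # same shape, larger amplitude, shifted

# Mean squared difference between the raw segments is large ...
dist_raw <- mean((seg_small - seg_large)^2)
# ... but is zero up to rounding once both are rescaled to [0, 1].
dist_scaled <- mean((rescale01(seg_small) - rescale01(seg_large))^2)
```

Two segments that differ only by a positive scale factor and a vertical offset become indistinguishable after this scaling, which is why amplitude no longer affects the comparison.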

V_init

Initial values for the motifs. Default is 'NULL'.

align

A logical value indicating whether to align the curves. Default is 'TRUE'.

n_threads

Number of threads to use for parallel processing. Default is 1.

Value

A list containing:

probKMA_results

A list of results from the ProbKMA algorithm, including processed functional data and model parameters.

silhouette_results

Results from silhouette analysis, indicating the quality of the clustering.
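
For readers unfamiliar with silhouette analysis, the sketch below computes classical silhouette widths from a distance matrix and hard cluster labels. It is a textbook illustration under that assumption; the function silhouette_widths is hypothetical, and the silhouette that funMoDisco computes on ProbKMA output may be defined differently.

```r
# Classical silhouette widths from a distance matrix and hard labels.
# Assumption: generic textbook definition with at least two clusters;
# this is NOT the routine used inside funMoDisco.
silhouette_widths <- function(D, labels) {
  D <- as.matrix(D)
  sapply(seq_len(nrow(D)), function(i) {
    own <- setdiff(which(labels == labels[i]), i)
    if (length(own) == 0) return(0)       # singleton cluster convention
    a <- mean(D[i, own])                  # mean distance within own cluster
    others <- setdiff(unique(labels), labels[i])
    b <- min(sapply(others, function(g) {
      mean(D[i, labels == g])             # mean distance to another cluster
    }))
    (b - a) / max(a, b)
  })
}

# Two well-separated groups of points on a line.
pts      <- c(0, 0.1, 0.2, 5, 5.1, 5.2)
labels   <- c(1, 1, 1, 2, 2, 2)
sil      <- silhouette_widths(dist(pts), labels)
mean_sil <- mean(sil)   # values near 1 indicate well-separated clusters
```

Widths near 1 indicate observations that sit firmly in their cluster, values near 0 indicate borderline assignments, and negative values suggest a likely misassignment.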


funMoDisco documentation built on April 16, 2025, 1:10 a.m.