probKMA_wrap: Wrapper for the Probabilistic K-means Algorithm (ProbKMA)
In funMoDisco: Motif Discovery in Functional Data

Description

This function serves as a wrapper for the Probabilistic K-means Algorithm (ProbKMA) to cluster functional data. It handles preprocessing, parameter setup, and execution of the core algorithm, returning the results along with silhouette analysis to assess the clustering quality.

Usage

probKMA_wrap(
  Y0 = NULL,
  Y1 = NULL,
  P0 = matrix(),
  S0 = matrix(),
  standardize = FALSE,
  c_max = Inf,
  iter_max = 1000,
  iter4elong = 10,
  trials_elong = 10,
  return_options = TRUE,
  alpha = 0,
  max_gap = 0.2,
  quantile = 0.25,
  stopCriterion = "max",
  tol = 1e-08,
  tol4elong = 0.001,
  max_elong = 0.5,
  deltaJK_elong = 0.05,
  iter4clean = 50,
  tol4clean = 1e-04,
  m = 2,
  w = 1,
  seed = 1,
  K = 2,
  c = 40,
  quantile4clean = 1/K,
  exe_print = FALSE,
  set_seed = FALSE,
  diss = "d0_2",
  transformed = FALSE,
  V_init = NULL,
  align = TRUE,
  n_threads = 1
)
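
Note that the default `quantile4clean = 1/K` refers to another argument. R evaluates default arguments lazily, inside the call, so this default follows whatever `K` the caller supplies. The sketch below illustrates the language rule with a hypothetical stand-in function (`show_defaults` is not part of funMoDisco) and assumes the wrapper does not overwrite the value internally.

```r
# Hypothetical stand-in that mimics only two defaults of probKMA_wrap();
# it is not part of funMoDisco.
show_defaults <- function(K = 2, quantile4clean = 1/K) {
  # Defaults are evaluated lazily, so 1/K uses the K seen inside the call.
  list(K = K, quantile4clean = quantile4clean)
}

d_default <- show_defaults()                              # 1/2
d_k4      <- show_defaults(K = 4)                         # 1/4
d_fixed   <- show_defaults(K = 4, quantile4clean = 0.1)   # explicit value wins
```

In practice this means that changing `K` alone also changes the cleaning quantile unless `quantile4clean` is passed explicitly.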

Arguments

`Y0`	A matrix of functional data for the first set of observations.
`Y1`	A matrix of functional data for the second set of observations.
`P0`	A matrix representing the initial membership probabilities.
`S0`	A matrix representing the initial shift parameters.
`standardize`	A logical value indicating whether to standardize the data. Default is 'FALSE'.
`c_max`	Maximum motif length (the upper bound paired with the minimum length `c`). Default is 'Inf'.
`iter_max`	Maximum number of iterations for the algorithm. Default is 1000.
`iter4elong`	Number of iterations for elongation. Default is 10.
`trials_elong`	Number of trials for elongation. Default is 10.
`return_options`	A logical value indicating whether to return additional options. Default is 'TRUE'.
`alpha`	A numeric value representing the weighting parameter. Default is 0.
`max_gap`	Maximum allowable gap between motifs. Default is 0.2.
`quantile`	Quantile used by the stopping rule when 'stopCriterion' is "quantile". Default is 0.25.
`stopCriterion`	Stopping criterion for the algorithm, based on the change in memberships between subsequent iterations: "max", "mean", or "quantile". Default is "max".
`tol`	Tolerance for convergence. Default is 1e-8.
`tol4elong`	Tolerance for elongation iterations. Default is 1e-3.
`max_elong`	Maximum elongation allowed. Default is 0.5.
`deltaJK_elong`	Increment for the elongation. Default is 0.05.
`iter4clean`	Number of iterations for the cleaning process. Default is 50.
`tol4clean`	Tolerance for the cleaning process. Default is 1e-4.
`m`	Weighting exponent applied to the membership probabilities in the objective function (the fuzziness parameter); must be greater than 1. Default is 2.
`w`	Weighting parameter for the dissimilarity measure. Default is 1.
`seed`	Random seed for reproducibility. Default is 1.
`K`	Number of motifs to extract. Default is 2.
`c`	Minimum motif length. Default is 40.
`quantile4clean`	Quantile used for the cleaning process. Default is 1/K.
`exe_print`	A logical value indicating whether to print execution details. Default is 'FALSE'.
`set_seed`	A logical value indicating whether to set the random seed. Default is 'FALSE'.
`diss`	Dissimilarity measure to be used. Default is 'd0_2'.
`transformed`	A logical value indicating whether to rescale each curve segment to the interval [0, 1] before computing the dissimilarity. With 'transformed = TRUE', occurrences that share a shape but differ in amplitude can be matched to the same motif. Default is 'FALSE'.
`V_init`	Initial values for the motifs. Default is 'NULL'.
`align`	A logical value indicating whether to align the curves. Default is 'TRUE'.
`n_threads`	Number of threads to use for parallel processing. Default is 1.
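
The rescaling described for `transformed` can be pictured as min-max scaling of each segment. The following is a minimal sketch under that assumption; `rescale01` is a hypothetical helper, not a funMoDisco function, and the package may handle multivariate curves or missing values differently.

```r
# Hypothetical helper (not part of funMoDisco): rescale one curve segment
# to [0, 1] with min-max scaling.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # A flat segment has no amplitude; map it to 0 to avoid dividing by zero.
    return(x * 0)
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

t_grid <- seq(0, 1, length.out = 50)
small  <- sin(2 * pi * t_grid)            # amplitude 1
large  <- 5 * sin(2 * pi * t_grid) + 10   # same shape, larger amplitude, offset

# After rescaling, the two segments coincide up to floating-point error.
gap <- max(abs(rescale01(small) - rescale01(large)))
```

Two segments that differ only by a positive scale factor and a vertical offset become indistinguishable after this step, which is why amplitude no longer affects the dissimilarity.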

Value

A list containing:

`probKMA_results`	A list of results from the ProbKMA algorithm, including processed functional data and model parameters.
`silhouette_results`	Results from silhouette analysis, indicating the quality of the clustering.
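
To give a feel for what a silhouette analysis measures, here is a sketch of the classical silhouette width on toy one-dimensional data, written in base R. It illustrates the general idea only: `silhouette_widths` is a hypothetical helper, and the silhouette computed by funMoDisco for motif occurrences may be defined differently.

```r
# Classical silhouette width; an illustration of the general idea,
# not the funMoDisco implementation. Assumes at least two clusters.
silhouette_widths <- function(d, cluster) {
  d <- as.matrix(d)
  labels <- unique(cluster)
  vapply(seq_along(cluster), function(i) {
    own <- cluster == cluster[i]
    own[i] <- FALSE
    if (!any(own)) return(0)   # singleton cluster: width is 0 by convention
    a <- mean(d[i, own])       # mean distance to the rest of its own cluster
    others <- setdiff(labels, cluster[i])
    # smallest mean distance to any other cluster
    b <- min(vapply(others, function(k) mean(d[i, cluster == k]), numeric(1)))
    (b - a) / max(a, b)
  }, numeric(1))
}

x <- c(1.0, 1.2, 0.9, 5.0, 5.3, 4.8)   # two well-separated groups
cluster <- c(1, 1, 1, 2, 2, 2)
s <- silhouette_widths(dist(x), cluster)
```

Widths close to 1 indicate observations that sit well inside their cluster, values near 0 indicate observations on a boundary, and negative values suggest a likely misassignment.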