getOptimalCentroids | R Documentation |

Get Optimal Centroids

getOptimalCentroids( x, iter.max, algorithm, n_cells, function_to_calculate_distance_metric, function_to_calculate_error_metric = c("mean", "max"), quant.err, distance_metric = "L1_Norm", quant_method = c("kmeans", "kmedoids"), ... )

`x` |
Data frame. A data frame of multivariate data. Each row corresponds to an observation, and each column corresponds to a variable. Missing values are not accepted. |

`algorithm` |
String. The type of algorithm used for quantization. Available algorithms are "Hartigan-Wong", "Lloyd", "Forgy", and "MacQueen". The default is "Hartigan-Wong". |

`n_cells` |
Numeric. The number of nodes per hierarchy. |

`function_to_calculate_distance_metric` |
Function. The function used to compute "L1_Norm" or "L2_Norm" distances. "L1_Norm" is selected by default. |

`function_to_calculate_error_metric` |
Character. The error metric can be "mean" or "max". "mean" is selected by default. |

`quant.err` |
Numeric. The quantization error for the algorithm. |

`distance_metric` |
Character. The distance metric used to calculate inter-point distance. It can be "L1_Norm" or "L2_Norm". "L1_Norm" is selected by default. |

`quant_method` |
Character. The quantization method can be "kmeans" or "kmedoids". "kmeans" is selected by default. |

`depth` |
Numeric. The hierarchy depth, that is, the depth of the tree (1 = no hierarchy, 2 = 2 levels, etc.). |
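
The distance and error metric arguments can be illustrated with a small sketch. The helper names `l1_norm`, `l2_norm`, and `cell_error` are hypothetical and are not part of the package; the sketch only assumes that "L1_Norm" means the sum of absolute coordinate differences to a centroid, "L2_Norm" means the Euclidean distance, and that the error metric summarises those distances with `mean` or `max`.

```r
# Hypothetical helpers for illustration only; not the package's implementation.

# L1 norm: sum of absolute coordinate differences between each row and the centroid
l1_norm <- function(points, centroid) {
  rowSums(abs(sweep(as.matrix(points), 2, centroid)))
}

# L2 norm: Euclidean distance between each row and the centroid
l2_norm <- function(points, centroid) {
  sqrt(rowSums(sweep(as.matrix(points), 2, centroid)^2))
}

# Summarise the per-point distances of one cell with an error function (mean or max)
cell_error <- function(points, centroid, distance_fn = l1_norm, error_fn = mean) {
  error_fn(distance_fn(points, centroid))
}

pts <- data.frame(x = c(0, 2, 4), y = c(0, 0, 3))
ctr <- c(2, 1)

cell_error(pts, ctr, l1_norm, mean)  # mean L1 distance to the centroid
cell_error(pts, ctr, l2_norm, max)   # largest L2 distance to the centroid
```

Passing the distance and summary as functions mirrors the way the signature separates `function_to_calculate_distance_metric` from `function_to_calculate_error_metric`, but the exact contract those arguments must satisfy is not stated here.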

The raw data is first scaled, and the scaled data is supplied as input to the vector quantization algorithm. Vector quantization uses a parameter called the quantization error, which acts as a threshold and determines the number of levels in the hierarchy. If the hierarchy has 'n' levels, all clusters formed up to that level have a quantization error equal to or greater than the threshold quantization error. The user defines the number of clusters in the first level of the hierarchy, and each first-level cluster is then subdivided into that same number of clusters. This process continues, dividing each group into smaller clusters, as long as the threshold quantization error is met. The output of this technique is hierarchically arranged vector-quantized data.
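
The procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the package's code: `quantize_sketch` is a hypothetical function, the per-cell error is taken to be the mean L1 distance to the centroid (matching the documented defaults), a cell is subdivided only when its error is strictly above the threshold, and `stats::kmeans` stands in for the quantization step.

```r
# Hypothetical sketch of threshold-driven hierarchical quantization.
quantize_sketch <- function(x, n_cells, quant.err, depth, level = 1) {
  x <- as.matrix(x)
  # Stop when the depth budget is spent or too few distinct rows remain to split
  if (level > depth || nrow(unique(x)) <= n_cells) return(list())

  fit <- stats::kmeans(x, centers = n_cells, iter.max = 100)
  cells <- list()
  for (k in seq_len(n_cells)) {
    members <- x[fit$cluster == k, , drop = FALSE]
    # Assumed error: mean L1 distance of the members to their centroid
    qe <- mean(rowSums(abs(sweep(members, 2, fit$centers[k, ]))))
    cells[[length(cells) + 1]] <- list(level = level, size = nrow(members), qe = qe)
    # Only cells whose error is still above the threshold are subdivided
    if (qe > quant.err) {
      cells <- c(cells, quantize_sketch(members, n_cells, quant.err, depth, level + 1))
    }
  }
  cells
}

set.seed(42)
scaled <- scale(iris[, 1:4])  # scale the raw data first
cells <- quantize_sketch(scaled, n_cells = 3, quant.err = 0.5, depth = 2)

# Number of cells recorded at each level of the hierarchy
table(sapply(cells, function(cell) cell$level))
```

Each level-1 cell is split into the same `n_cells` children, so a larger `quant.err` yields a shallower hierarchy and `depth` caps it regardless of the threshold.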

`values` |
List. A list showing observations assigned to a cluster. |

`maxQE` |
List. A list corresponding to maximum QE values for each cell. |

`meanQE` |
List. A list corresponding to mean QE values for each cell. |

`centers` |
List. A list of quantization errors for all levels and nodes. |

`nsize` |
List. A list corresponding to number of observations in respective groups. |

Shubhra Prakash <shubhra.prakash@mu-sigma.com>, Sangeet Moy Das <sangeet.das@mu-sigma.com>
