hvq: hvq

hvq R Documentation

hvq

Description

Hierarchical Vector Quantization

Usage

hvq(
  x,
  min_compression_perc = NA,
  n_cells = NA,
  depth = 3,
  quant.err = 10,
  algorithm = "Hartigan-Wong",
  distance_metric = c("L1_Norm", "L2_Norm"),
  error_metric = c("mean", "max"),
  quant_method = c("kmeans", "kmedoids")
)

Arguments

x

Data frame. A data frame of multivariate data. Each row corresponds to an observation, and each column corresponds to a variable. Missing values are not accepted.

min_compression_perc

Numeric. An integer giving the minimum compression percentage to be achieved for the dataset.

n_cells

Numeric. The number of nodes per hierarchy.

depth

Numeric. The hierarchy depth, i.e. the depth of the tree (1 = no hierarchy, 2 = 2 levels, etc.).

quant.err

Numeric. The quantization error threshold for the algorithm.

algorithm

String. The type of algorithm used for quantization. Available algorithms are "Hartigan-Wong", "Lloyd", "Forgy" and "MacQueen" (default is "Hartigan-Wong").

distance_metric

Character. The distance metric can be "L1_Norm" or "L2_Norm". "L1_Norm" is selected by default.

error_metric

Character. The error metric can be "mean" or "max". "mean" is selected by default.

quant_method

Character. The quantization method can be "kmeans" or "kmedoids". "kmeans" is selected by default.
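
To make the interplay of distance_metric and error_metric concrete, here is a minimal sketch in base R. It assumes that the quantization error of a cluster is the distance from each point to the cluster centre, aggregated by the chosen error metric; this page does not spell that definition out, so treat it as an assumption. The helper quant_error is hypothetical and not part of muHVT.

```r
# Hypothetical helper (not part of muHVT): quantization error of one cluster,
# assuming it is the point-to-centre distance aggregated by mean or max.
quant_error <- function(points, centre,
                        distance_metric = c("L1_Norm", "L2_Norm"),
                        error_metric = c("mean", "max")) {
  distance_metric <- match.arg(distance_metric)
  error_metric <- match.arg(error_metric)

  # Difference between every point (row) and the centre
  diffs <- sweep(as.matrix(points), 2, centre, FUN = "-")

  # One distance per point
  d <- switch(distance_metric,
    L1_Norm = rowSums(abs(diffs)),    # sum of absolute differences
    L2_Norm = sqrt(rowSums(diffs^2))  # Euclidean distance
  )

  # Aggregate the per-point distances into a single error value
  switch(error_metric, mean = mean(d), max = max(d))
}

pts <- matrix(c(0, 0,
                3, 4,
                6, 8), ncol = 2, byrow = TRUE)
centre <- colMeans(pts)  # c(3, 4)

quant_error(pts, centre, "L1_Norm", "mean")  # 14/3, about 4.667
quant_error(pts, centre, "L2_Norm", "max")   # 5
```

Under this assumed definition, "max" is the stricter choice: a single distant point is enough to keep a cluster above the threshold, whereas "mean" averages it out.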

Details

The raw data is first scaled, and the scaled data is supplied as input to the vector quantization algorithm. The vector quantization technique uses a parameter called the quantization error. This parameter acts as a threshold and determines the number of levels in the hierarchy: if there are n levels in the hierarchy, then all the clusters formed up to this level have a quantization error equal to or greater than the threshold quantization error.

The user defines the number of clusters in the first level of the hierarchy. Each cluster in the first level is then subdivided into the same number of clusters as there are in the first level. This process continues, and each group is divided into smaller clusters, as long as the threshold quantization error is met. The output of this technique is hierarchically arranged vector quantized data.
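
The procedure described above can be illustrated with a short base R sketch. This is not the muHVT implementation: it uses stats::kmeans directly, assumes the quantization error is the mean L1 distance of a cell's points to their centroid, and stops splitting a cell once its error is at or below the threshold, once the requested depth is reached, or once the cell has too few points. The helpers cell_error and split_cell and the parameter values are hypothetical.

```r
# A sketch of the idea only; this is not the muHVT implementation.
set.seed(42)

x <- scale(USArrests)  # centre and scale every column
n_cells <- 3           # clusters per split (illustrative value)
depth <- 2             # number of hierarchy levels (illustrative value)
quant_err <- 0.2       # error threshold (illustrative value)

# Mean L1 distance of the rows of `pts` to their centroid
# (an assumed definition of the quantization error).
cell_error <- function(pts) {
  centre <- colMeans(pts)
  mean(rowSums(abs(sweep(pts, 2, centre, FUN = "-"))))
}

# Recursively split a set of row indices; returns a flat list of leaf cells.
split_cell <- function(idx, level) {
  pts <- x[idx, , drop = FALSE]
  err <- cell_error(pts)

  # Stop when the cell is within the threshold, the maximum depth has been
  # reached, or there are too few points left to form n_cells clusters.
  if (err <= quant_err || level > depth || length(idx) <= n_cells) {
    return(list(list(rows = idx, level = level - 1, error = err)))
  }

  km <- kmeans(pts, centers = n_cells, algorithm = "Hartigan-Wong")
  children <- split(idx, km$cluster)
  do.call(c, lapply(children, split_cell, level = level + 1))
}

leaves <- split_cell(seq_len(nrow(x)), level = 1)
length(leaves)  # number of leaf cells, at most n_cells^depth
```

In this sketch every observation ends up in exactly one leaf cell, and with n_cells = 3 and depth = 2 there can be at most 3^2 = 9 leaves. Cells that already satisfy the threshold at the first level are not split further, so the actual count may be lower.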

Value

clusters

List. A list showing each ID assigned to a cluster.

nodes.clust

List. A list corresponding to nodes' details.

idnodes

List. A list of IDs and segments, similar to nodes.clust, with additional columns for node IDs.

error.quant

List. A list of quantization errors for all levels and nodes.

plt.clust

List. A list of logical values indicating whether the quantization error was met.

summary

Summary. Output table with summary.

Author(s)

Shubhra Prakash <shubhra.prakash@mu-sigma.com>, Sangeet Moy Das <sangeet.das@mu-sigma.com>

See Also

hvtHmap

Examples


data("USArrests", package = "datasets")
hvqOutput <- hvq(USArrests, n_cells = 5, depth = 2, quant.err = 0.2,
                 distance_metric = "L1_Norm", error_metric = "mean",
                 quant_method = "kmeans")


muHVT documentation built on March 7, 2023, 6:38 p.m.