HVT: Constructing Hierarchical Voronoi Tessellations

View source: R/HVT.R

HVT R Documentation

Constructing Hierarchical Voronoi Tessellations

Description

Main function to construct hierarchical Voronoi tessellations.

Usage

HVT(
  dataset,
  min_compression_perc = NA,
  n_cells = NA,
  depth = 1,
  quant.err = 0.2,
  projection.scale = 10,
  normalize = FALSE,
  distance_metric = c("L1_Norm", "L2_Norm"),
  error_metric = c("mean", "max"),
  quant_method = c("kmeans", "kmedoids"),
  scale_summary = NA,
  diagnose = FALSE,
  hvt_validation = FALSE,
  train_validation_split_ratio = 0.8
)

Arguments

dataset

Data frame. The input dataset, with one column per feature.

min_compression_perc

Numeric. An integer indicating the minimum compression percentage to be achieved for the dataset.

n_cells

Numeric. An integer indicating the number of cells per hierarchy (level).

depth

Numeric. An integer indicating the number of levels (1 = no hierarchy, 2 = 2 levels, etc.).

quant.err

Numeric. A number indicating the quantization error threshold.

projection.scale

Numeric. A number indicating the scale factor for the tessellations, chosen so that the sub-tessellations are large enough to be seen clearly.

normalize

Logical. A logical value indicating whether the columns in your dataset should be normalized. Default value is FALSE.

distance_metric

Character. The distance metric can be "L1_Norm" (Manhattan distance) or "L2_Norm" (Euclidean distance). "L1_Norm" is selected by default.
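As a small worked illustration of the two options named in Usage, the sketch below computes an L1 (Manhattan) and an L2 (Euclidean) distance between two made-up points using base R only. The vectors `x` and `y` are hypothetical, and this is not a claim about how HVT computes distances internally.

```r
# Two illustrative observations (made-up values, not from any dataset)
x <- c(1, 5, 2)
y <- c(4, 1, 2)

# L1 norm (Manhattan distance): sum of absolute coordinate differences
l1 <- sum(abs(x - y))

# L2 norm (Euclidean distance): square root of the sum of squared differences
l2 <- sqrt(sum((x - y)^2))

# Base R's dist() computes the same two quantities
l1_dist <- as.numeric(dist(rbind(x, y), method = "manhattan"))
l2_dist <- as.numeric(dist(rbind(x, y), method = "euclidean"))
```

Here the coordinate differences are 3, 4 and 0, so the L1 distance is 7 and the L2 distance is 5.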

error_metric

Character. The error metric can be "mean" or "max". "mean" is selected by default.
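The difference between the two error metrics can be illustrated on a single cell. The sketch below assumes that a cell's quantization error is a summary of the distances between its points and its centroid; this page does not define the error, so treat that definition, the made-up points, and the use of L1 distance as assumptions.

```r
# Hypothetical cell: four 2-d points that share one centroid
cell_points <- rbind(c(0, 0), c(1, 0), c(0, 1), c(3, 3))
centroid <- colMeans(cell_points)   # (1, 1)

# L1 distance of each point from the centroid: 2, 1, 1, 4
d <- rowSums(abs(sweep(cell_points, 2, centroid)))

mean_error <- mean(d)   # summary in the spirit of error_metric = "mean"
max_error  <- max(d)    # summary in the spirit of error_metric = "max"
```

With one outlying point, the maximum (4) is twice the mean (2), so under this assumed definition "max" is the stricter choice when compared against the same quant.err threshold.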

quant_method

Character. The quantization method can be "kmeans" or "kmedoids". "kmeans" is selected by default.

scale_summary

List. A list with the mean and standard deviation values for all the features in the dataset. Pass the scale summary when the input dataset is already scaled or when normalize is set to FALSE.
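A minimal sketch of how such a summary could be assembled with base R when the data is scaled outside of HVT. The element names `mean_data` and `std_data` are assumptions made for illustration; this page does not state the names HVT expects, so check the package source before passing the list.

```r
data(USArrests)

# Scale manually: centre each column by its mean, divide by its standard deviation
scaled <- scale(USArrests)

# Collect the statistics that were used for scaling.
# NOTE: the element names `mean_data` and `std_data` are assumptions.
scale_summary <- list(
  mean_data = attr(scaled, "scaled:center"),
  std_data  = attr(scaled, "scaled:scale")
)

# Undo the scaling to confirm the summary is enough to recover the raw data
restored <- sweep(sweep(scaled, 2, scale_summary$std_data, "*"),
                  2, scale_summary$mean_data, "+")
```

Keeping the means and standard deviations alongside the scaled data is what allows results to be mapped back to the original units.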

diagnose

Logical. A logical value indicating whether model diagnostics are required. Default value is FALSE.

hvt_validation

Logical. A logical value indicating whether the MAD values are to be tested for the validation set. Default value is FALSE.

train_validation_split_ratio

Numeric. A numeric value indicating the train and validation split ratio.
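To make the split ratio concrete, the sketch below performs an 80/20 train and validation split by hand in base R. It assumes a simple random split; how HVT itself selects the validation rows is not stated on this page, and the variable names are illustrative.

```r
data(USArrests)
set.seed(42)   # make the random split reproducible

ratio <- 0.8   # same value as the default shown in Usage
n <- nrow(USArrests)

# Randomly pick the row indices that go into the training set
train_idx <- sample(seq_len(n), size = floor(ratio * n))

train_set      <- USArrests[train_idx, ]
validation_set <- USArrests[-train_idx, ]
```

For the 50 rows of USArrests, a ratio of 0.8 leaves 40 rows for training and 10 for validation.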

Details

This is the main function to construct hierarchical Voronoi tessellations. It calls the hvq function, whose output is hierarchically clustered data that serves as the input for constructing the tessellations. The data is then represented in 2D coordinates, and the tessellations are plotted using these coordinates as centroids. For each subsequent level, the 2D coordinates are transformed so that all the points fall within their parent tile, and the tessellations are plotted using the transformed points as centroids. The lines in the tessellations are clipped so that they do not protrude outside the parent polygon. This is repeated for every subsequent level.
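The first two steps described above (quantize the data, then place the cell centroids in 2D) can be sketched with base R alone. This is an illustration under stated assumptions, not the package's implementation: `kmeans` stands in for the hvq step, classical multidimensional scaling (`cmdscale`) stands in for whatever projection HVT uses, and the choice of 5 cells is arbitrary. Drawing the Voronoi tiles themselves is left out because it needs a geometry package.

```r
data(USArrests)
set.seed(1)

# Step 1: vector quantization. kmeans is a stand-in for the hvq step.
scaled <- scale(USArrests)
km <- kmeans(scaled, centers = 5, nstart = 10)

# Step 2: project the cell centroids to 2-d coordinates.
# Classical MDS (cmdscale) is a stand-in for the package's projection.
centroid_dist <- dist(km$centers)
coords_2d <- cmdscale(centroid_dist, k = 2)

# coords_2d has one row per cell; these rows would act as the seed points
# (centroids) of the level-1 Voronoi tessellation.
```

For deeper levels, the same two steps would be repeated on the points inside each cell, with the resulting 2D coordinates rescaled to fit inside the parent tile.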

Value

A list that contains the hierarchical tessellation information. This list has to be given as the input argument when plotting the tessellations.

[[1]]

List. Information about the tessellation co-ordinates, level-wise

[[2]]

List. Information about the polygon co-ordinates, level-wise

[[3]]

List. Information about the hierarchical vector quantized data, level-wise

[[4]]

List. Information about the model diagnosis for the selected level

[[5]]

List. Information about the MAD values and percentage of anomalies for the validation dataset

Author(s)

Shubhra Prakash <shubhra.prakash@mu-sigma.com>, Sangeet Moy Das <sangeet.das@mu-sigma.com>, Shantanu Vaidya <shantanu.vaidya@mu-sigma.com>

See Also

plotHVT
hvtHmap

Examples

data(USArrests)

# Single-level tessellation: n_cells is left unset and a minimum
# compression of 70% is requested instead
hvt.results <- HVT(USArrests, min_compression_perc = 70, quant.err = 0.2,
                   distance_metric = "L1_Norm", error_metric = "mean",
                   projection.scale = 10, normalize = TRUE,
                   quant_method = "kmeans")
plotHVT(hvt.results, line.width = c(0.8), color.vec = c('#141B41'),
        maxDepth = 1)

# Three-level hierarchy with 15 cells per level; line.width and
# color.vec take one value per level
hvt.results <- HVT(USArrests, n_cells = 15, depth = 3, quant.err = 0.2,
                   distance_metric = "L1_Norm", error_metric = "mean",
                   projection.scale = 10, normalize = TRUE,
                   quant_method = "kmeans")
plotHVT(hvt.results, line.width = c(1.2, 0.8, 0.4),
        color.vec = c('#141B41', '#0582CA', '#8BA0B4'),
        maxDepth = 3)

muHVT documentation built on March 7, 2023, 6:38 p.m.