pca.KMeans: pca.KMeans

Description

Performs KMeans clustering on a given pca dataset.

Format

[R6::R6Class] object.

Details

Performs KMeans clustering on a given pca dataset.

Active bindings

seed

Returns the instance variable seed (integer)

setSeed

Sets the instance variable seed (integer)

nCenters

Returns the instance variable nCenters (integer)

setNCenters

Sets the instance variable nCenters (integer)

level

Returns the instance variable level (character)

predClass

Returns the instance variable predClass (factor)

df_centers

Returns the instance variable df_centers (tbl_df)

df_silhouette

Returns the instance variable df_silhouette (tbl_df)

av_sil_width

Returns the instance variable av_sil_width (numeric)

av_withinss

Returns the instance variable av_withinss (tbl_df)

tot_withinss

Returns the instance variable tot_withinss (numeric)

verbose

Returns the instance variable verbose (logical)
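
The getter/setter pairs above (for example seed and setSeed) are R6 active bindings, so they are read and written like fields rather than called like methods. The following is a minimal sketch of that mechanism, assuming pguXAI follows the usual R6 convention. The class `ToyModel` and its private field `.seed` are hypothetical and exist only for illustration; they are not part of pguXAI.

```r
library(R6)

# Hypothetical toy class (not part of pguXAI); it only mirrors the
# seed / setSeed naming pattern listed above.
ToyModel <- R6::R6Class(
  "ToyModel",
  private = list(.seed = 42L),
  active = list(
    # getter-style binding: returns the private field
    seed = function() {
      private$.seed
    },
    # setter-style binding: assigning to it updates the private field
    setSeed = function(value) {
      if (missing(value)) {
        stop("setSeed is write-only; read the value via $seed")
      }
      private$.seed <- as.integer(value)
    }
  )
)

toy <- ToyModel$new()
seed_before <- toy$seed   # read through the getter binding
toy$setSeed <- 7L         # set by assignment, not by a function call
seed_after <- toy$seed
```

With this pattern, writing `toy$setSeed(7L)` would fail, because an active binding is not a callable method.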

Methods

Public methods


Method new()

Creates and returns a new pca.KMeans object.

Usage
pca.KMeans$new(n = 2, seed = 42, verbose = FALSE)
Arguments
n

Initial number of clusters (integer)

seed

An initial seed. Default is 42 (integer)

verbose

Makes the class chatty. Default is FALSE. (logical)

Returns

A new R6 object of type pca.KMeans. (pguXAI::pca.KMeans)


Method finalize()

Clears the heap and indicates that the instance of pca.KMeans has been removed from the heap.

Usage
pca.KMeans$finalize()

Method print()

Prints instance variables of a pca.KMeans object.

Usage
pca.KMeans$print()
Returns

string


Method train()

Trains the model.

Usage
pca.KMeans$train(obj = "tbl_df")
Arguments
obj

The data to be analyzed. Needs to be the result of a pca analysis. (tibble::tibble)
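
To illustrate what training a k-means model on PCA scores involves, here is a minimal sketch that uses only base R (`stats::prcomp` and `stats::kmeans`) as stand-ins. It is an assumption about the general technique, not the pguXAI implementation, and the variable names are chosen for illustration only.

```r
# Assumption: base-R stand-ins replace the pguXAI internals.
df_data <- iris[, 1:4]

# PCA on centered and scaled data; keep the first two components
pca <- stats::prcomp(df_data, center = TRUE, scale. = TRUE)
df_scores <- as.data.frame(pca$x[, 1:2])

# fix the seed so the random initial centers are reproducible
set.seed(42)
fit <- stats::kmeans(df_scores, centers = 3)

centers <- fit$centers              # cluster centers in PCA space
pred_class <- factor(fit$cluster)   # one cluster label per observation
withinss <- fit$withinss            # within-cluster sum of squares, per cluster
tot_withinss <- fit$tot.withinss    # total within-cluster sum of squares
```

The quantities computed here correspond in spirit to the df_centers, predClass and tot_withinss bindings listed above, although the exact structure returned by pguXAI may differ.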


Method cluster_statistics()

Performs the cluster analysis step. Not intended to be called by the user.

Usage
pca.KMeans$cluster_statistics(obj = "tbl_df")
Arguments
obj

The data to be analyzed. Needs to be the result of a pca analysis. (tibble::tibble)


Method silhouette_analysis()

Performs a silhouette analysis. Not intended to be called by the user.

Usage
pca.KMeans$silhouette_analysis(obj = "tbl_df")
Arguments
obj

The data to be analyzed. Needs to be the result of a pca analysis. (tibble::tibble)
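
For readers unfamiliar with silhouette analysis, the sketch below computes the silhouette width of each observation in base R: s = (b - a) / max(a, b), where a is the mean distance to the other members of the observation's own cluster and b is the smallest mean distance to the members of any other cluster. The helper `silhouette_widths` is hypothetical and written for illustration; it is not how pguXAI necessarily computes its result. It assumes Euclidean distances, at least two clusters, and no duplicated points.

```r
# Hypothetical helper (not part of pguXAI): silhouette width per observation.
silhouette_widths <- function(x, cluster) {
  d <- as.matrix(stats::dist(x))        # pairwise Euclidean distances
  labels <- sort(unique(cluster))
  stopifnot(length(labels) >= 2)        # silhouettes need at least two clusters
  n <- nrow(d)
  s <- numeric(n)
  for (i in seq_len(n)) {
    own <- cluster == cluster[i]
    own[i] <- FALSE                     # exclude the observation itself
    if (!any(own)) {
      s[i] <- 0                         # singleton cluster: width set to 0
      next
    }
    a <- mean(d[i, own])                # mean distance within own cluster
    others <- labels[labels != cluster[i]]
    b <- min(vapply(others,
                    function(k) mean(d[i, cluster == k]),
                    numeric(1)))        # nearest neighbouring cluster
    s[i] <- (b - a) / max(a, b)
  }
  s
}

# Usage on PCA scores of the iris data
set.seed(42)
scores <- stats::prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)$x[, 1:2]
fit <- stats::kmeans(scores, centers = 3)
sil <- silhouette_widths(scores, fit$cluster)
av_sil_width <- mean(sil)                        # average silhouette width
sil_by_cluster <- tapply(sil, fit$cluster, mean) # average width per cluster
```

Values close to 1 indicate observations that sit well inside their cluster, while values near 0 or below suggest overlap with a neighbouring cluster.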


Method cluster_plot()

Plots the clustering result in all PCA dimensions.

Usage
pca.KMeans$cluster_plot(obj = "tbl_df")
Arguments
obj

The data to be analyzed. Needs to be the result of a pca analysis. (tibble::tibble)

Returns

(list)


Method silhouette_plot()

Plots the silhouette analysis.

Usage
pca.KMeans$silhouette_plot(obj = "tbl_df")
Arguments
obj

The data to be analyzed. Needs to be the result of a pca analysis. (tibble::tibble)

Returns

(list)


Method clone()

The objects of this class are cloneable with this method.

Usage
pca.KMeans$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Author(s)

Sebastian Malkusch, malkusch@med.uni-frankfurt.de

Examples

library(tidyverse)
library(pguXAI)
library(FactoMineR)
library(caret)

main <- function() {
  # load data set and remove class labels
  df_data <- iris %>%
    dplyr::select(-Species)

  # define true class labels
  classes_true <- iris$Species

  # define number of components for pca and number of clusters for kmeans
  nComponents <- 2
  nCluster <- 10

  # pre-scale the data for pca
  PreProcessor <- caret::preProcess(x=df_data, method=c("center", "scale"), pcaComp = nComponents)
  df_scaled <- predict(PreProcessor, df_data)

  # reduce dimensions of scaled dataset using pca
  rslt_pca <- df_scaled %>%
    FactoMineR::PCA(ncp = nComponents, scale.unit = FALSE, graph = FALSE)
  df_pred <- as.data.frame(predict(rslt_pca, df_scaled)$coord)

  # run kmeans analysis
  km <- pguXAI::pca.KMeans$new(n=nCluster, seed = 42, verbose = TRUE)
  km$train(obj = df_pred)

  km$cluster_plot(obj = df_pred)

  km$silhouette_plot(obj = df_pred) %>%
    plot()

  print("Result of silhouette analysis:")
  km$df_silhouette %>%
    print()

  print("Average silhouette width:")
  km$av_sil_width %>%
    print()

  print("Centers of clusters:")
  km$df_centers %>%
    print()

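  # NOTE: predProb is not listed among the active bindings documented above,
  # so this call may simply print NULL
  print("Probability of the class label assignment:")
  km$predProb %>%
    print()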

  print("Majority vote of the class label assignment:")
  km$predClass %>%
    print()

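  # av_withinss is the binding documented above for this result
  print("Within cluster sum of squares analysis:")
  km$av_withinss %>%
    print()

  print("Total within cluster sum of squares:")
  km$tot_withinss %>%
    print()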

  fin <- "done"
  fin
}

main()

SMLMS/pguXAI documentation built on Aug. 15, 2020, 7:09 a.m.