ds.nbclust: Determines optimal number of clusters for dataset


ds.nbclust    R Documentation

Determines optimal number of clusters for dataset

Description

This function is similar to the R function 'NbClust' from the NbClust package.

Usage

ds.nbclust(
  df.name = NULL,
  diss = NULL,
  distance = "euclidean",
  min.nc = 2,
  max.nc = 15,
  method = NULL,
  index = "all",
  alphaBeale = 0.1,
  seed = 123,
  datasources = NULL
)

Arguments

df.name

is a character string naming the data set, which can be either a matrix or a data frame

diss

is a dissimilarity structure, calculated according to the method given in 'distance'

distance

specifies the method for the distance matrix calculation and can be one of 'euclidean', 'maximum', 'manhattan', 'canberra', 'binary' or 'minkowski'

min.nc

specifies the minimum number of clusters

max.nc

specifies the maximum number of clusters

method

describes the clustering method and can be one of "ward.D2", "single", "complete", "average", "mcquitty", "median", "centroid", "kmeans" or "ward.D"

index

describes the clustering index and can be one of "kl", "ch", "hartigan", "ccc", "scott", "marriot", "trcovw", "tracew", "friedman", "rubin", "cindex", "db", "silhouette", "duda", "pseudot2", "beale", "ratkowsky", "ball", "ptbiserial", "gap", "frey", "mcclain", "gamma", "gplus", "tau", "dunn", "hubert", "sdindex", "dindex", "sdbw", "all" or "alllong"

alphaBeale

is the significance value for the "beale" clustering index

seed

is an integer used as the seed for the random start point

datasources

is a DSConnection object

Details

The function uses partitioning methods to find optimal numbers of clusters for a given dataset.
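
To make the idea concrete, the following is a minimal local sketch in base R, not the package's implementation. It runs on ordinary in-memory data rather than through DataSHIELD, and it evaluates only one criterion (the Calinski-Harabasz index, listed as "ch" above) over a range of cluster counts. The toy data, the helper ch_index and the variables min_nc, max_nc and best_k are assumptions made for illustration.

```r
# Illustrative base-R sketch (local data, no DataSHIELD involved): scan a
# range of cluster counts and keep the one with the highest
# Calinski-Harabasz index.
set.seed(123)

# Hypothetical toy data: three well-separated 2-D groups of 30 points each.
x <- rbind(
  matrix(rnorm(60, mean = 0,  sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 5,  sd = 0.3), ncol = 2),
  matrix(rnorm(60, mean = 10, sd = 0.3), ncol = 2)
)

# Calinski-Harabasz index: between-cluster dispersion over within-cluster
# dispersion, each scaled by its degrees of freedom.
ch_index <- function(data, cl) {
  n <- nrow(data)
  k <- length(unique(cl))
  total_ss <- sum(scale(data, scale = FALSE)^2)
  within_ss <- sum(sapply(split(as.data.frame(data), cl), function(g) {
    sum(scale(as.matrix(g), scale = FALSE)^2)
  }))
  between_ss <- total_ss - within_ss
  (between_ss / (k - 1)) / (within_ss / (n - k))
}

# Local counterparts of the min.nc and max.nc arguments.
min_nc <- 2
max_nc <- 6

# Distance matrix and hierarchical clustering, mirroring the
# 'distance' and 'method' arguments.
d <- dist(x, method = "euclidean")
tree <- hclust(d, method = "ward.D2")

# Score every candidate number of clusters and pick the best one.
ks <- min_nc:max_nc
scores <- sapply(ks, function(k) ch_index(x, cutree(tree, k = k)))
best_k <- ks[which.max(scores)]
print(data.frame(k = ks, ch = scores))
```

With index = "all", several such criteria are evaluated, so a single-index scan like this one only hints at how a suggestion for the number of clusters can be reached.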

Value

a summary suggesting the optimal number of clusters
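
A call might look like the sketch below. It assumes that a DataSHIELD session has already been opened (for instance with DSI::datashield.login()) and that the data set was assigned on the servers under the symbol "D". The wrapper name suggest_clusters, the symbol "D" and the chosen argument values are hypothetical; only the argument names come from the usage above.

```r
# Hypothetical wrapper around ds.nbclust(); `connections` is assumed to be
# the object returned by DSI::datashield.login(), and "D" is assumed to be
# the server-side symbol of the data set.
suggest_clusters <- function(connections, df.name = "D") {
  dsClusterAnalysisClient::ds.nbclust(
    df.name = df.name,
    distance = "euclidean",   # one of the documented distance methods
    min.nc = 2,               # smallest number of clusters to consider
    max.nc = 8,               # largest number of clusters to consider
    method = "ward.D2",       # one of the documented clustering methods
    index = "all",            # evaluate the full default set of indices
    datasources = connections
  )
}
```

Defining the wrapper does not contact any server; the call to ds.nbclust only happens when suggest_clusters(connections) is run against a live session.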

Author(s)

Florian Schwarz for the German Institute of Human Nutrition


FlorianSchw/dsClusterAnalysisClient documentation built on Feb. 8, 2025, 10:32 a.m.