Description Usage Arguments Value Examples
clustSimFunc takes as input a data.frame object or the name of a data.frame object (dfData), a number (nClust), a numeric vector (ylimPlot) of two numbers indicating the lower and upper limits of the y-axis of the plot, a character string indicating the name of the output subdirectory (subDir), a character string (main) indicating the title of the plot, and three positive numbers indicating the width (width), height (height) and resolution (res) of the output plot. It calculates the silhouette values for the number of clusters n in the range 2 to nClust (maximum 10) for the data in dfData, using the Gower dissimilarity computed by the function daisy from package cluster (cluster::daisy).
The silhouette values for 2 to 10 clusters are saved in a .txt file in the subdirectory subDir inside the "output" directory within the current working directory. A plot showing the average silhouette width against the number of clusters (2 to nClust) is saved as a .png file in the subdirectory subDir inside the "plot" directory within the current working directory.
The "output" and/or "plot" directories are created in the current working directory if they are not already present. Similarly, if subDir is specified, a subdirectory with the name subDir is created within both output/ and plot/ if not already present, and the outputs are saved in that subdirectory. If subDir is missing, the output .txt file is saved in output/ and the plot is saved in plot/.
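The silhouette calculation described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the toy data frame dfToy is hypothetical, and the use of cluster::pam as the partitioning step is an assumption, since the description names only cluster::daisy.

```r
library(cluster)

# Toy mixed-type data (hypothetical; stands in for dfData)
set.seed(1)
dfToy <- data.frame(
  age  = c(rnorm(10, 30, 2), rnorm(10, 60, 2)),
  area = factor(rep(c("north", "south"), each = 10))
)

nClust <- 4

# Gower dissimilarity handles numeric and factor columns together
gowerDist <- daisy(dfToy, metric = "gower")

# Average silhouette width for k = 2..nClust.
# Assumption: PAM is used to form the clusters; the real function may differ.
avgSil <- sapply(2:nClust, function(k) {
  fit <- pam(gowerDist, k = k, diss = TRUE)
  fit$silinfo$avg.width
})
names(avgSil) <- 2:nClust
avgSil
```

A larger average silhouette width suggests a better-separated partition, which is why the function plots this value against the number of clusters.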
clustSimFunc(
  dfData,
  nClust,
  envir = .GlobalEnv,
  ylimPlot = NULL,
  subDir,
  main = NULL,
  width = 1200,
  height = 600,
  res = 125,
  ...
)
dfData | a data.frame object or a character string indicating the name of the data.frame object.
nClust | a number (maximum 10) indicating the largest number of clusters to be tested; clustering is tested for every number of clusters from 2 up to nClust.
envir | the environment (default: .GlobalEnv) where the output data.frame object should be saved.
ylimPlot | a numeric vector (default: NULL) containing two values indicating the lower and upper limits of the y-axis.
subDir | a character string indicating the name of the subdirectory within the "output" and "plot" directories in which to save the output data.frame object (as a .txt file) and the plot (as a .png file) respectively. If a subdirectory with the given name does not exist within output/ and/or plot/, it is created. If not specified, the outputs are saved directly in output/ and plot/.
main | a character string (default: NULL) indicating an overall title for the plot.
width | a number (default: 1200) indicating the width of the output plot.
height | a number (default: 600) indicating the height of the output plot.
res | a number (default: 125) indicating the resolution of the output plot.
clustSimFunc calculates the silhouette values for the number of clusters n in the range 2 to nClust (maximum 10), both inclusive, obtained for the data in the data.frame object dfData using the Gower dissimilarity computed by the function daisy from package cluster (cluster::daisy). It saves the silhouette values for 2 to 10 clusters in a .txt file in the "output" directory within the current working directory. It also creates a plot showing the average silhouette width against the number of clusters (2 to nClust) considered for clustering and saves it as a .png file in the "plot" directory within the current working directory.
It creates the "output" and/or "plot" directories in the current working directory if they are not already present. Similarly, if subDir is specified, it creates a subdirectory with the name subDir within both output/ and plot/ if not already present, and saves the outputs in the respective subdirectories. If subDir is missing, it saves the output .txt file in output/ and the plot in plot/. It also saves the output data.frame object in the environment given by envir (.GlobalEnv by default).
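The directory handling described above might look something like the sketch below. The helper makeOutDir and its root argument are hypothetical names introduced only for illustration; a temporary directory is used here in place of the current working directory so that the example leaves no files behind.

```r
# Hypothetical helper, not part of the package: returns (and creates, if
# needed) the directory for a top-level folder such as "output" or "plot".
makeOutDir <- function(topDir, subDir, root = getwd()) {
  target <- if (missing(subDir)) {
    file.path(root, topDir)          # no subdirectory requested
  } else {
    file.path(root, topDir, subDir)  # nest subDir under the top-level folder
  }
  if (!dir.exists(target)) dir.create(target, recursive = TRUE)
  target
}

# Use a scratch directory instead of the real working directory
root <- tempfile("clustDemo")
dir.create(root)

outDir  <- makeOutDir("output", "run1", root = root)  # <root>/output/run1
plotDir <- makeOutDir("plot", root = root)            # <root>/plot
```

Using missing() mirrors the fact that subDir has no default in the Usage signature, so omitting it falls back to the top-level folders.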
tab1 <- xlsx::read.xlsx("./sample-data.xlsx", sheetName = "data")
tab1Vars <- c("i..id", "age", "area", "paddArea", "paddyFld", "date")
tab1Var <- selectExclude(tab1, tab1Vars)
# Test 2 to 4 clusters with default plot settings
clustSimFunc(tab1Var, 4)
# Same, but with the y-axis limited to 0-0.5 (envir keeps its default)
clustSimFunc(tab1Var, 4, ylimPlot = c(0, 0.5))