clustFunc: clustFunc


View source: R/clustFunc.R

Description

clustFunc takes as input two data.frame objects (dfSel and dfOriginal), an optional character string (addName) to be added to the output file name, a character string naming the output subdirectory (subDir), the number of clusters to form (n), and three positive numbers giving the width (width), height (height) and resolution (res) of the output plot. It adds two new columns to dfOriginal containing the cluster number assigned to each row: "clustN", of class numeric, and "clustF", of class factor.

The output data.frame object is saved as a .rda file under the directory "output", and a silhouette plot of the clusters is saved as a .png file under the directory "plot", both inside the current working directory. The "output" and "plot" directories are created in the current working directory if they do not already exist. If subDir is specified, a subdirectory of that name is created within both output/ and plot/ if not already present, and the outputs are saved there. If subDir is missing, the .rda file is saved directly in output/ and the plot directly in plot/. The output data.frame object is also assigned in the ".GlobalEnv" environment.
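The directory and saving behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the package's source: the helper name `resolveOutDir`, the object and file name `demoClust`, and the use of a temporary base directory instead of the working directory are all assumptions made for the example.

```r
# Hypothetical helper (not part of agriTrf): return <base>/<topDir> or
# <base>/<topDir>/<subDir>, creating the directory if it does not exist.
resolveOutDir <- function(topDir, subDir, base = getwd()) {
  outDir <- if (missing(subDir)) {
    file.path(base, topDir)
  } else {
    file.path(base, topDir, subDir)
  }
  if (!dir.exists(outDir)) {
    dir.create(outDir, recursive = TRUE)
  }
  outDir
}

# A temporary base keeps the sketch from writing into the working directory.
base    <- tempfile("clustDemo")
outDir  <- resolveOutDir("output", "run1", base = base)  # subDir given
plotDir <- resolveOutDir("plot", base = base)            # subDir missing

# Illustrative data.frame with the two cluster columns.
demoClust <- data.frame(id = 1:3, clustN = c(1, 2, 1))
demoClust$clustF <- factor(demoClust$clustN)

# Save as .rda and also assign the object in the global environment.
rdaFile <- file.path(outDir, "demoClust.rda")
save(demoClust, file = rdaFile)
assign("demoClust", demoClust, envir = .GlobalEnv)
```

Using `missing(subDir)` mirrors the "if a subdirectory is not specified" wording; how clustFunc itself detects the missing argument is not stated on this page.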

Usage

clustFunc(dfSel, dfOriginal, addName = "Clust", subDir, n, ...)

Arguments

dfSel

a data.frame object or a character string indicating the name of the data.frame object. This is a subset of the data.frame object dfOriginal containing only the non-numeric columns of dfOriginal

dfOriginal

a data.frame object or a character string indicating the name of the data.frame object. This is the full data.frame to which the cluster columns ("clustN" and "clustF") are added.

addName

a string (default: "Clust") to be added to the name of the output data.frame object and the output .rda file

subDir

a character string indicating the name of the subdirectory within "output" and "plot" directories to save the output data.frame object (as a .rda file) and plot (as a .png file) respectively. If a subdirectory with the given name does not exist within output and/or plot, then it is created. If not specified, the outputs are saved in output/ and plot/.

n

a number indicating the number of clusters to be formed on clustering.

...

further arguments. The Description mentions width, height and res (positive numbers setting the width, height and resolution of the output plot); since they are not named in the usage signature, they are presumably supplied here.

Value

clustFunc updates the data.frame object dfOriginal by adding two new columns holding the cluster number assigned to each row. Clustering is based on Gower dissimilarities computed with the function daisy from package cluster (cluster::daisy). One column (called "clustN") is of class numeric, the other (called "clustF") is of class factor.

The updated data.frame object is saved as a .rda file in the subDir subdirectory within the directory "output" inside the current working directory. A silhouette plot for n clusters is saved as a .png file in the subDir subdirectory within the directory "plot" inside the current working directory. The "output" and "plot" directories are created in the current working directory if not already present. If subDir is specified, a subdirectory of that name is created within both output/ and plot/ if not already present, and the outputs are saved in the respective subdirectories. If subDir is missing, the .rda file is saved in output/ and the plot in plot/. The output data.frame object is also assigned in the ".GlobalEnv" environment.
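The clustering steps above can be approximated with the cluster package as in the sketch below. The page names cluster::daisy for the Gower dissimilarities but does not say which routine partitions the rows, so the use of `cluster::pam` here is an assumption. The data.frame `toy`, its columns, the plot file name and the png dimensions are made up for illustration.

```r
library(cluster)

# Made-up data: an id column plus two non-numeric columns.
toy <- data.frame(
  id    = 1:12,
  crop  = factor(rep(c("rice", "wheat", "maize"), each = 4)),
  irrig = factor(rep(c("yes", "no"), times = 6))
)
toySel <- toy[, c("crop", "irrig")]   # plays the role of dfSel

n <- 3
d   <- daisy(toySel, metric = "gower")  # Gower dissimilarity matrix
fit <- pam(d, k = n, diss = TRUE)       # ASSUMPTION: PAM does the partitioning

# Add the cluster number as a numeric and as a factor column.
toy$clustN <- as.numeric(fit$clustering)
toy$clustF <- factor(fit$clustering)

# Save a silhouette plot as .png; the size and resolution are arbitrary.
plotFile <- file.path(tempdir(), "silhouette.png")
png(plotFile, width = 800, height = 600, res = 100)
plot(silhouette(fit$clustering, d), main = "Silhouette plot")
dev.off()
```

Any partitioning method that accepts a dissimilarity object (for example hierarchical clustering followed by `cutree`) could stand in for `pam` here; the silhouette step only needs the cluster assignments and the dissimilarities.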

Examples

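## Not run: 
# Read the sample data from the package's extdata directory
tab1 <- xlsx::read.xlsx("./inst/extdata/sample-data.xlsx", sheetName = "data")
# Column names passed to selectExclude()
tab1Vars <- c("i..id", "age", "area", "paddArea", "paddyFld", "date")
tab1Var <- selectExclude(tab1, tab1Vars)
# Form 3 clusters; addName keeps its default and subDir is left unspecified
clustFunc(tab1Var, tab1, n = 3)
## End(Not run)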

lwTools/agriTrf documentation built on March 26, 2020, 12:09 a.m.