clustermap: Classification of dataset using kmeans or hclust algorithm...


clustermap    R Documentation

Classification of dataset using kmeans or hclust algorithm and representation of clusters on a map.

Description

The function clustermap() classifies the sites using the variables listed in names.var and draws a bar plot of the resulting clusters. The classification methods are hclust() (hierarchical cluster analysis) and kmeans() (k-means clustering); the number of clusters is set with clustnum.
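As a rough, non-interactive sketch of the computation described above (not the package's actual code), the following runs k-means on a few variables and draws a bar plot of the cluster sizes. The built-in USArrests data frame is only a stand-in for the attribute table of an sf object, and the object names are illustrative.

```r
# Stand-in data: clustermap() itself takes an sf object; a plain data frame
# is used here so the sketch runs without spatial packages.
vars <- USArrests[, c("Murder", "Assault", "UrbanPop", "Rape")]

set.seed(1)                      # kmeans() starts from random centers
km <- kmeans(vars, centers = 3)  # 3 plays the role of clustnum

# One cluster label per observation (per "site")
cluster <- km$cluster

# Bar plot of cluster sizes, using the default labels shown in Usage
sizes <- table(cluster)
barplot(sizes, col = "lightblue3", xlab = "Cluster", ylab = "Number")
```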

Usage

clustermap(sf.obj, names.var, clustnum, method = c("kmeans", "hclust"), 
  type = NULL, centers = NULL, scale = FALSE, names.arg = "", 
  criteria = NULL, carte = NULL, identify = NULL, 
  cex.lab = 0.8, pch = 16, col = "lightblue3", xlab = "Cluster", 
  ylab = "Number", axes = FALSE, lablong = "", lablat = "")

Arguments

sf.obj

object of class sf

names.var

a vector of character; attribute names or column numbers in attribute table

clustnum

integer, number of clusters

method

one of two methods: "kmeans" (the default) or "hclust"

type

If method = "hclust", type = "complete" by default (the other options are listed in help(hclust), e.g. "ward.D", "single"). If method = "kmeans", type = "Hartigan-Wong" by default (the other options are listed in help(kmeans), e.g. "Forgy").

centers

If method = "kmeans", the user can supply a matrix of initial cluster centers.

scale

If scale = TRUE, the variables are scaled (standardized) before the classification.

names.arg

a character vector giving the names of the clusters

criteria

a boolean vector with one element per spatial unit; it marks preselected sites, which can be displayed with a cross from the tcltk window

carte

matrix with 2 columns for drawing spatial polygonal contours: the x and y coordinates of the vertices of the polygon

identify

if not NULL, the name of the variable for identifying observations on the map

cex.lab

character size of label

pch

a vector of plotting symbols whose length must equal the number of groups; otherwise all sites are drawn with pch[1]

col

a vector of colors whose length must equal the number of groups; otherwise all sites and all bars are drawn in col[1]

xlab

a title for the graphic x-axis

ylab

a title for the graphic y-axis

axes

a boolean; if TRUE, axes are drawn on the map

lablong

name of the x-axis that will be printed on the map

lablat

name of the y-axis that will be printed on the map
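The sketch below illustrates how the scale, type and centers arguments plausibly relate to the underlying stats functions. It assumes (this is not confirmed by the package source) that scale = TRUE corresponds to scale(), and that type is handed to the method argument of hclust() or the algorithm argument of kmeans(). USArrests is a stand-in dataset and the three starting rows are arbitrary.

```r
vars <- as.matrix(USArrests)

# Assumed effect of scale = TRUE: center each column and divide by its
# standard deviation.
x <- scale(vars)

# method = "hclust": 'type' would be the agglomeration method.
hc <- hclust(dist(x), method = "complete")
hc_class <- cutree(hc, k = 3)   # cut the tree into 3 clusters

# method = "kmeans": 'type' would be the algorithm, and 'centers' a matrix
# with one row per cluster and one column per variable.
init <- x[c(1, 20, 40), ]       # arbitrary rows used as starting centers
km <- kmeans(x, centers = init, algorithm = "Hartigan-Wong")
km_class <- km$cluster
```

Supplying the starting centers makes the k-means result reproducible without a call to set.seed().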

Details

The two windows are interactive: the sites selected by clicking a bar on the bar plot are shown in red on the map, and the sites selected on the map by ‘points’ or ‘polygon’ are highlighted in red on the bar plot. For the 'hclust' method, the dendrogram is also drawn. Optionally, the classification method can be chosen.
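The linking between the bar plot and the map can be imitated without the Tk window. The following sketch, again on the stand-in USArrests data, draws the dendrogram for the 'hclust' method and lists the sites that would be highlighted if the bar of one cluster were selected. The variable names are illustrative and do not come from the package.

```r
x <- scale(USArrests)
hc <- hclust(dist(x), method = "complete")

# Dendrogram, with the 3 clusters outlined in red
plot(hc, labels = FALSE, main = "Dendrogram", xlab = "", sub = "")
rect.hclust(hc, k = 3, border = "red")

# Cluster label of each site
groups <- cutree(hc, k = 3)

# "Selecting" the bar of cluster 2 amounts to picking these sites,
# which the map would then show in red
chosen_bar <- 2
selected <- which(groups == chosen_bar)
rownames(x)[selected]
```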

Value

If the user clicks the 'save results' button, a list is created as a global variable named last.select. It contains obs, an integer vector giving the indices of the spatial units selected just before leaving the Tk window, and vectclass, an integer vector giving the cluster number assigned to each spatial unit.
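A short sketch of how the saved list might be used afterwards. The last.select object below is a hand-made stand-in with invented values, built only to mirror the structure described above; it also assumes that obs holds row indices of the spatial units.

```r
# Hypothetical stand-in for the list saved by the Tk window (values invented)
last.select <- list(
  obs       = c(2L, 5L, 9L),
  vectclass = c(1L, 3L, 2L, 2L, 1L, 3L, 3L, 1L, 2L, 2L)
)

# Number of spatial units in each cluster
class_sizes <- table(last.select$vectclass)

# Cluster of each unit that was selected when the window was closed
selected_class <- last.select$vectclass[last.select$obs]
```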

Note

clustermap() calls hclust and kmeans with most of their arguments left at their defaults. Users who want to change these arguments should call hclust or kmeans directly and then use the function barmap to visualize the resulting clusters.
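The workflow in this note might look like the sketch below: the clustering is run directly with non-default arguments and the labels are stored as a factor column. USArrests stands in for the attribute table; with an sf object the column assignment works the same way. The call to barmap is left out because its arguments are not described on this page (see help(barmap)).

```r
dat <- USArrests           # stand-in for the attribute table of an sf object
x <- scale(dat)

# kmeans() with arguments that clustermap() does not expose
set.seed(42)
km <- kmeans(x, centers = 4, nstart = 25, iter.max = 50)

# Store the labels as a factor so they can be plotted as categories
dat$cluster <- factor(km$cluster)

# dat$cluster is now an ordinary categorical variable that can be
# passed to barmap(); check help(barmap) for the expected arguments.
table(dat$cluster)
```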

Author(s)

Thomas-Agnan C., Aragon Y., Ruiz-Gazen A., Laurent T., Robidou L.

References

Thibault Laurent, Anne Ruiz-Gazen, Christine Thomas-Agnan (2012), GeoXp: An R Package for Exploratory Spatial Data Analysis. Journal of Statistical Software, 47(2), 1-23.

Murtagh, F. (1985). Multidimensional Clustering Algorithms.

Hartigan, J. A. and Wong, M. A. (1979). A K-means clustering algorithm. Applied Statistics 28, 100-108.

Roger S. Bivand, Edzer J. Pebesma, Virgilio Gomez-Rubio (2009), Applied Spatial Data Analysis with R, Springer.

See Also

barmap, pcamap

Examples

#####
# data columbus
require("sf")
columbus <- st_read(system.file("shapes/columbus.shp", package="spData")[1])

# a basic example using the kmeans method
clustermap(columbus, c("HOVAL", "INC", "CRIME", "OPEN", "PLUMB", "DISCBD"), 3, 
  criteria = (columbus$CP == 1), identify = "POLYID", cex.lab = 0.7)

## Not run:  
# example using the hclust method
clustermap(columbus, c(7:12), 3, method = "hclust", 
  criteria = (columbus$CP == 1), col = colors()[20:22], identify = "POLYID", 
  cex.lab = 0.7, names.arg = c("Group 1", "Group 2", "Group 3"), xlab = "Cluster")

## End(Not run)

tibo31/GeoXp documentation built on April 8, 2023, 7:50 a.m.