hclustgeo: Ward clustering with soft contiguity constraints In ClustGeo: Hierarchical Clustering with Spatial Constraints

Description

Implements a Ward-like hierarchical clustering algorithm with soft contiguity constraints. The algorithm takes as input two dissimilarity matrices `D0` and `D1` and a mixing parameter `alpha` between 0 and 1. The dissimilarities can be non-Euclidean and the weights of the observations can be non-uniform. The first matrix gives the dissimilarities in the "feature" space. The second matrix gives the dissimilarities in the "constraint" space. For instance, `D1` can be a matrix of geographical distances or a matrix built from a contiguity matrix. The mixing parameter `alpha` sets the importance of the constraint in the clustering process.

Usage

```
hclustgeo(D0, D1 = NULL, alpha = 0, scale = TRUE, wt = NULL)
```

Arguments

`D0` an object of class `dist` with the dissimilarities between the n observations. The function `as.dist` can be used to convert an object of class `matrix` to an object of class `dist`.

`D1` an object of class `dist` with other dissimilarities between the same n observations.

`alpha` a real value between 0 and 1. This mixing parameter sets the weight given to `D1` relative to `D0`. By default it is equal to 0, and `D0` is used alone in the clustering process.

`scale` if `TRUE`, the two dissimilarity matrices `D0` and `D1` are scaled, i.e. each is divided by its maximum. If `D1 = NULL`, this parameter is not used and `D0` is not scaled.

`wt` a vector with the weights of the observations. The default, `wt = NULL`, corresponds to the case where all observations are weighted by 1/n.

Details

The criterion minimized at each stage is a convex combination of the homogeneity criterion calculated with `D0` and the homogeneity criterion calculated with `D1`. The parameter `alpha` (the weight of this convex combination) controls the importance of the constraint in the quality of the solutions. When `alpha` increases, the homogeneity calculated with `D0` decreases whereas the homogeneity calculated with `D1` increases.
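The convex combination described above can be illustrated with base R alone. The following is a minimal sketch under stated assumptions, not the package's own code: the data `X` and `xy`, the helper `ward_init`, and the choice to mix Ward-type initial dissimilarities before calling `hclust` are all illustrative.

```r
# Sketch only: base R, no ClustGeo required. All names are hypothetical.
set.seed(1)
n  <- 10
X  <- matrix(rnorm(n * 3), n, 3)   # made-up feature data
xy <- matrix(runif(n * 2), n, 2)   # made-up coordinates

D0 <- dist(X)    # dissimilarities in the "feature" space
D1 <- dist(xy)   # dissimilarities in the "constraint" space

# Divide each matrix by its maximum, as described for scale = TRUE.
D0 <- D0 / max(D0)
D1 <- D1 / max(D1)

wt    <- rep(1 / n, n)   # uniform weights, the documented default
alpha <- 0.2             # mixing parameter

# Assumed Ward-type starting dissimilarity:
# w_i * w_j / (w_i + w_j) * d_ij^2
ward_init <- function(D, wt) {
  d <- as.matrix(D)
  W <- outer(wt, wt) / outer(wt, wt, "+")
  as.dist(W * d^2)
}

# Convex combination of the two criteria, weighted by alpha.
delta <- (1 - alpha) * ward_init(D0, wt) + alpha * ward_init(D1, wt)
tree  <- hclust(delta, method = "ward.D", members = wt)
```

With `alpha <- 0` the second term vanishes and only `D0` drives the tree; with `alpha <- 1` only `D1` does. Intermediate values trade homogeneity in one space for homogeneity in the other.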

Value

Returns an object of class `hclust`.

References

M. Chavent, V. Kuentz-Simonet, A. Labenne, J. Saracco. ClustGeo: an R package for hierarchical clustering with spatial constraints. Comput Stat (2018) 33: 1799-1822.

See Also

`choicealpha`

Examples

```
data(estuary)

# with one dissimilarity matrix
w <- estuary$map@data$POPULATION  # non-uniform weights
D <- dist(estuary$dat)
tree <- hclustgeo(D, wt = w)
sum(tree$height)
inertdiss(D, wt = w)
inert(estuary$dat, w = w)
plot(tree, labels = FALSE)
part <- cutree(tree, k = 5)
sp::plot(estuary$map, border = "grey", col = part)

# with two dissimilarity matrices
D0 <- dist(estuary$dat)        # the socio-demographic distances
D1 <- as.dist(estuary$D.geo)   # the geographical distances
alpha <- 0.2                   # the mixing parameter
tree <- hclustgeo(D0, D1, alpha = alpha, wt = w)
plot(tree, labels = FALSE)
part <- cutree(tree, k = 5)
sp::plot(estuary$map, border = "grey", col = part)
```