clustering_partition: Obtain a partition of the spatial domain using the...

View source: R/clustering_partition.R

clustering_partition    R Documentation

Obtain a partition of the spatial domain using the density-based spatial clustering (DBSC) algorithm described in Santafé et al. (2021)

Description

The function takes an object of class SpatialPolygonsDataFrame or sf and defines a spatial partition using the DBSC algorithm described in Santafé et al. (2021).

Usage

clustering_partition(
  carto,
  ID.area = NULL,
  var = NULL,
  n.cluster = 10,
  min.size = NULL,
  W = NULL,
  l = 1,
  Wk = NULL,
  distance = "euclidean",
  verbose = TRUE
)

Arguments

carto

object of class SpatialPolygonsDataFrame or sf.

ID.area

character; name of the variable that contains the IDs of spatial areal units.

var

character; name of the variable that contains the data of interest to compute spatial clusters, usually the vector of log-SMR.

n.cluster

numeric; number of cluster centers to use in the DBSC algorithm. Defaults to 10.

min.size

numeric (default NULL); minimum number of areas that each cluster of the spatial partition must contain.

W

optional argument with the binary adjacency matrix of the spatial areal units. If NULL (default), this object is computed from the carto argument (two areas are considered as neighbours if they share a common border).

l

numeric value with the neighbourhood order used to assign areas to each cluster. If l=1 (default), only areas that share a common border are considered.

Wk

previously computed binary adjacency matrix of l-order neighbours. If this argument is included (default NULL), the parameter l is ignored.

distance

character; the distance measure to be used (default "euclidean"). See the method argument of the dist function for other options.

verbose

logical value (default TRUE); indicates if the function runs in verbose mode.
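
The relationship between W, l and Wk can be illustrated with a small base R sketch. It assumes that "l-order neighbours" means areas reachable in at most l steps along shared borders; the helper korder_adjacency is hypothetical and is not part of bigDM, whose internal computation may differ.

```r
# Hypothetical helper (not part of bigDM): binary adjacency matrix of
# neighbours up to order l, built from a first-order binary matrix W.
# Assumption: "l-order neighbours" are areas reachable in at most l steps.
korder_adjacency <- function(W, l = 1) {
  stopifnot(is.matrix(W), nrow(W) == ncol(W), l >= 1)
  reach <- W                    # walks of length 1
  Wl <- W
  for (step in seq_len(l - 1)) {
    reach <- reach %*% W        # walks of length step + 1
    Wl <- Wl + reach            # accumulate every area reached so far
  }
  Wl <- (Wl > 0) * 1            # back to a binary matrix
  diag(Wl) <- 0                 # an area is not its own neighbour
  Wl
}

# Toy map: five areas in a row, 1 - 2 - 3 - 4 - 5
W <- matrix(0, 5, 5)
for (i in 1:4) {
  W[i, i + 1] <- 1
  W[i + 1, i] <- 1
}

# Second-order neighbourhood: area 1 is now linked to areas 2 and 3
W2 <- korder_adjacency(W, l = 2)
W2[1, ]
```

A matrix built this way could be supplied through the Wk argument, in which case l is ignored.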

Details

The DBSC algorithm implemented in this function is a spatial clustering algorithm based on the density clustering algorithm introduced by Rodriguez and Laio (2014) and the later modification presented by Wang and Song (2016). It obtains a single clustering partition of the data by automatically detecting cluster centers and assigning each area to its nearest cluster centroid. The algorithm is based on the assumption that cluster centers are points with high local density that lie relatively far from other points with higher local density. See Santafé et al. (2021) for more details.
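
The assumption about cluster centers can be illustrated on toy data with a minimal base R sketch. It computes, for each point, a local density (rho) and the distance to the nearest point of higher density (delta), then picks the points where both are large. This is not the bigDM implementation and it ignores the spatial constraints of DBSC; the Gaussian kernel, the bandwidth dc and the score rho * delta are assumptions made only for illustration.

```r
set.seed(1)

# Toy one-dimensional data: two well-separated groups of 20 values each
x <- c(rnorm(20, mean = 0, sd = 0.3), rnorm(20, mean = 5, sd = 0.3))
D <- as.matrix(dist(x, method = "euclidean"))

# Local density: Gaussian kernel with an assumed bandwidth dc
# (subtracting 1 removes each point's contribution to its own density)
dc <- 0.5
rho <- rowSums(exp(-(D / dc)^2)) - 1

# delta: distance to the nearest point with higher density; for the
# densest point, the largest distance to any other point
n <- length(x)
delta <- numeric(n)
for (i in seq_len(n)) {
  higher <- which(rho > rho[i])
  delta[i] <- if (length(higher) > 0) min(D[i, higher]) else max(D[i, ])
}

# Candidate centers: points with both high density and large delta
score <- rho * delta
centers <- order(score, decreasing = TRUE)[1:2]
x[centers]
```

With two well-separated groups, the two highest scores fall on one point from each group, since every other point has a denser neighbour close by and therefore a small delta.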

Value

sf object with the original data and a grouping variable named 'ID.group'.

References

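Rodriguez A, Laio A (2014). "Clustering by fast search and find of density peaks." Science.

Santafé G, Adin A, Lee D, Ugarte MD (2021). "Dealing with risk discontinuities to estimate cancer mortality risks when the number of small areas is large." Statistical Methods in Medical Research.

Wang G, Song Q (2016). "Automatic clustering via outward statistical testing on density metrics." IEEE Transactions on Knowledge and Data Engineering.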

Examples

## Not run: 
library(sf)
library(tmap)

## Load the Spain colorectal cancer mortality data ##
data(Carto_SpainMUN)

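## Define a spatial partition using the DBSC algorithm ##
## (a small constant is added to the SMR so that areas with zero observed cases do not give log(0)) ##
Carto_SpainMUN$logSMR <- log(Carto_SpainMUN$obs/Carto_SpainMUN$exp+0.0001)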

carto.new <- clustering_partition(carto=Carto_SpainMUN, ID.area="ID", var="logSMR",
                                  n.cluster=20, l=2, min.size=100, verbose=TRUE)
table(carto.new$ID.group)

## Plot of the grouping variable 'ID.group' ##
carto.data <- st_set_geometry(carto.new, NULL)
carto.partition <- aggregate(carto.new[,"geometry"], list(ID.group=carto.data[,"ID.group"]), head)

tmap4 <- packageVersion("tmap") >= "3.99"

if(tmap4){
        tm_shape(carto.new) +
                tm_polygons(fill="ID.group", fill.scale=tm_scale(values="brewer.set3")) +
                tm_shape(carto.partition) +
                tm_borders(col="black", lwd=2) +
                tm_layout(legend.outside=TRUE, legend.frame=FALSE)
}else{
        tm_shape(carto.new) +
                tm_polygons(col="ID.group") +
                tm_shape(carto.partition) +
                tm_borders(col="black", lwd=2) +
                tm_layout(legend.outside=TRUE)
}

## End(Not run)


bigDM documentation built on Sept. 11, 2024, 9:05 p.m.