R/get_clusters.R

#' Clustering groups returned as dataframe
#'
#' Takes a distance matrix and returns a data frame with two columns, area and grouping (the cluster assignment). The clustering method is chosen by the caller.
#'
#' @param dist_mat A distance matrix.
#' @param cluster_num Number of clusters.
#' @param method The agglomeration method that is passed to \code{\link[stats]{hclust}}. This can be chosen from the following: "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).
#' @return A data frame with two columns: area and grouping (the cluster assignment).
#' @export
#'
#' @examples
#' # Example 1:
#' data(distDutch)
#' get_clusters(distDutch, 5, "ward.D2")
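get_clusters <- function(dist_mat, cluster_num, method) {
  # Blank the upper triangle; as.dist() reads only the lower triangle.
  dist_mat[upper.tri(dist_mat)] <- NA
  dist_mat <- stats::as.dist(dist_mat)
  # Build the hierarchical clustering tree with the chosen agglomeration method.
  clustered_dist <- stats::hclust(dist_mat, method = method)
  # Cut the tree into cluster_num groups and move the row names into a column.
  cluster_groups <- tibble::rownames_to_column(
    as.data.frame(stats::cutree(clustered_dist, k = cluster_num)))
  colnames(cluster_groups) <- c("area", "grouping")
  cluster_groups
}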
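
The steps in the function above can be tried without the package data. The following is a minimal sketch, not part of dialectR: the site names and coordinates are made up for illustration, and the data frame is assembled with base R instead of `tibble::rownames_to_column`, so it runs with the `stats` package alone.

```r
# Toy data: five hypothetical sites with made-up 2-D coordinates.
coords <- matrix(
  c(0, 0,
    0, 1,
    1, 0,
    10, 10,
    10, 11),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("site_a", "site_b", "site_c", "site_d", "site_e"), NULL)
)

# Full symmetric distance matrix with site names as row names.
dist_mat <- as.matrix(stats::dist(coords))

# Same sequence as get_clusters(): as.dist -> hclust -> cutree.
tree <- stats::hclust(stats::as.dist(dist_mat), method = "ward.D2")
groups <- stats::cutree(tree, k = 2)

# Assemble the two-column result (base R stand-in for rownames_to_column).
cluster_groups <- data.frame(
  area = names(groups),
  grouping = unname(groups),
  stringsAsFactors = FALSE
)
print(cluster_groups)
```

With these coordinates the three sites near the origin should share one group and the two distant sites the other. The `area` column comes from the row names of the distance matrix, so a matrix without row names would leave that column without meaningful labels.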

dialectR documentation built on May 20, 2021, 9:06 a.m.