R/clustering_design.R

Defines functions clustering.design cluster.error

Documented in cluster.error clustering.design

#' @name
#' cluster.error
#'
#' @title
#' Clustering error
#'
#' @description
#' This function computes the clustering error.
#'
#' @details
#' \code{cluster.error} computes the clustering error. The clustering error for a design \eqn{D=[\bm x_1, \dots, \bm x_n]^T} is defined as \eqn{\frac{1}{N}\sum_{i=1}^n\sum_{\bm x\in{V_i}}\|\bm x - \bm x_i\|^\alpha}, where \eqn{V_i} is the Voronoi cell of each design point \eqn{\bm x_i} for \eqn{i=1,\dots,n}, N is the size of X. When \eqn{\alpha=2}, we obtain K-means and when \eqn{\alpha=1}, we obtain K-medians.
#'
#' @param design a design matrix.
#' @param X candidate points in \eqn{[0,1]^p}. If X is not provided, Sobol points are generated as candidate points.
#' @param alpha power of the Euclidean distance.
#'
#' @return
#' clustering error of the design.
#'
#' @examples
#' n = 20
#' p = 3
#' D = randomLHD(n, p)
#' cluster.error(D)
#'
cluster.error = function(design, X=NULL, alpha=1){
  p = ncol(design)
  if (is.null(X)){
    X = spacefillr::generate_sobol_set(1e5*p, p, sample(1e5, 1))
  }
  return (clusterError(design, X, alpha))
}
#'
#' @name
#' clustering.design
#'
#' @title
#' Designs generated by clustering algorithms
#'
#' @description
#' This function is for producing designs by minimizing the clustering error.
#'
#' @details
#' \code{clustering.design} produces a design by clustering algorithms. It minimize the clustering error (see \code{\link{cluster.error}}) by Lloyd's algorithm. When \eqn{\alpha > 2}, accelerated gradient descent is used to find the center for each cluster (Mak, S. and Joseph, V. R. 2018). When \eqn{\alpha \leq 2}: Weizfeld algorithm is used to find the center for each cluster. Let \eqn{\bm{x}^{(0)}_i=\bm{x}_i} denote the initial position of the \eqn{i}th center and and let \eqn{\bm{S}_i} represent the points within its Voronoi cell. The center is then updated as: \eqn{\bm{x}^{(k+1)}_i = \left.\left(\sum_{\bm{s}\in\bm{S}_i}\frac{\bm{s}}{\|\bm{s}-\bm{x}^{(k)}_i\|_2^{2-\alpha}}\right)\middle/ \left(\sum_{\bm{s}\in\bm{S}_i}\frac{1}{\|\bm{s}-\bm{x}^{(k)}_i\|_2^{2-\alpha}}\right)\right. \quad \text{for } k=0, 1, \dots.}
#'
#' @param n design size.
#' @param p design dimension.
#' @param X candidate points in \eqn{[0,1]^p}. If X is not provided, Sobol points is generated as cluster points.
#' @param D.ini initial design points. If D.ini is not provided, Sobol points are generated as initial design.
#' @param multi.start number of starting designs (cluster centers).
#' @param alpha power of the Euclidean distance.
#' @param Lloyd.iter.max maximum number of iterations for the Lloyd algorithm.
#' @param cen.iter.max maximum number of iterations for the center calculation for each cluster.
#' @param Lloyd.tol minimum relative change for the Lloyd algorithm to continue.
#' @param cen.tol minimum relative change for the center calculation algorithm to continue.
#'
#' @return
#' \item{design}{final design points.}
#' \item{cluster}{cluster assignment for each cluster points.}
#' \item{cluster.error}{final cluster error.}
#'
#' @references
#' Mak, S. and Joseph, V. R. (2018), “Minimax and minimax projection designs using clustering,” Journal of Computational and Graphical Statistics, 27, 166–178.
#'
#' @examples
#' n = 20
#' p = 3
#' X = spacefillr::generate_sobol_set(1e5*p, p)
#' D = clustering.design(n, p, X)
#'
clustering.design = function(n, p, X=NULL, D.ini=NULL, multi.start=1, alpha=1.0,
                       Lloyd.iter.max=100, cen.iter.max=10,
                       Lloyd.tol=1e-4, cen.tol=1e-4){
  if (is.null(X)){
    X = spacefillr::generate_sobol_set(1e5*p, p, sample(1e5, 1))
  }

  if (multi.start==1){
    if (is.null(D.ini)){
      D.ini = pmax(pmin(jitter(spacefillr::generate_sobol_set(n, p, sample(1e5, 1))), 1), 0)
    }
    result = cluster_based_design_cpp(X, D.ini, alpha=alpha,
                                      Lloyd_iter_max=Lloyd.iter.max, Lloyd_tol=Lloyd.tol,
                                      cen_iter_max=cen.iter.max, cen_tol=cen.tol)

    return (list(design=result$design, cluster=result$cluster,
                 cluster.error=result$cluster_error, total.iter=result$total_iter))
  }else{
    obj_best = .Machine$double.xmax
    random_seed = sample(1e5, multi.start)
    for (i in 1:multi.start){
      D.ini = pmax(pmin(jitter(spacefillr::generate_sobol_set(n, p, random_seed[i])), 1), 0)
      result = cluster_based_design_cpp(X, D.ini, alpha=alpha,
                                        Lloyd_iter_max=Lloyd.iter.max, Lloyd_tol=Lloyd.tol,
                                        cen_iter_max=cen.iter.max, cen_tol=cen.tol)

      obj = result$cluster_error
      if (obj < obj_best){
        obj_best = obj
        design = result$design
        cluster = result$cluster
        total.iter = result$total_iter
      }
    }
    return (list(design=design, cluster=cluster,
                 cluster.error=obj, total.iter=total.iter))
  }
}

Try the SFDesign package in your browser

Any scripts or data that you put into this service are public.

SFDesign documentation built on June 22, 2025, 1:06 a.m.