make_clustering_templates: Make Clustering Templates

View source: R/ClusterMakeTemplates.R

make_clustering_templates R Documentation

Make Clustering Templates

Description

make_clustering_templates() applies a K-means clustering algorithm to handwriting samples that were pre-processed with process_batch_dir() and saved in the folder template_dir > data > template_graphs. The algorithm sorts the graphs from those samples into K groups, or clusters, of similar graphs.
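The loop that a K-means algorithm runs can be illustrated with a minimal sketch. This is not handwriter's implementation: plain 2-D points stand in for handwriting graphs, Euclidean distance stands in for whatever graph distance the package uses, and the starting centers are fixed by hand rather than drawn with centers_seed. The names points, centers, and cluster are hypothetical.

```r
# Illustrative K-means loop on numeric points (a stand-in for graphs).
points <- rbind(c(0, 0), c(0, 1), c(1, 0), c(9, 9), c(9, 10), c(10, 9))

# Assumption: starting centers are picked by hand here; the real function
# selects them at random, controlled by centers_seed.
centers <- points[c(1, 4), , drop = FALSE]
K <- nrow(centers)
max_iters <- 3

for (iter in seq_len(max_iters)) {
  # Distance between each point and each center: an n x K matrix.
  dists <- sapply(seq_len(K), function(k) {
    sqrt(rowSums(sweep(points, 2, centers[k, ])^2))
  })

  # Assign each point to its nearest center.
  cluster <- max.col(-dists, ties.method = "first")

  # Move each center to the mean of the points assigned to it.
  new_centers <- t(sapply(seq_len(K), function(k) {
    colMeans(points[cluster == k, , drop = FALSE])
  }))

  # Stop early once the centers no longer move.
  if (all(abs(new_centers - centers) < 1e-12)) break
  centers <- new_centers
}

cluster
```

In this toy data the first three points end up in one cluster and the last three in the other. The distance step inside the loop is the part that num_dist_cores would spread across cores in the real function.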

Usage

make_clustering_templates(
  template_dir,
  template_images_dir,
  writer_indices,
  max_edges = 30,
  centers_seed,
  graphs_seed,
  K = 40,
  num_dist_cores = 1,
  num_path_cuts = 8,
  max_iters = 1,
  gamma = 3,
  num_graphs = "All"
)

Arguments

template_dir

Main directory that will store template files

template_images_dir

A directory containing template training images

writer_indices

A vector of the starting and ending positions of the writer ID in the file name, for example c(2, 5).

max_edges

Maximum number of edges allowed in input graphs. Graphs with more than the maximum number will be ignored.

centers_seed

Integer seed for the random number generator when selecting starting cluster centers.

graphs_seed

Integer seed for the random number generator when selecting graphs. If num_graphs = "All", graphs_seed is not used.

K

Integer number of clusters

num_dist_cores

Integer number of cores to use for the distance calculations in the K-means algorithm. Each iteration of the K-means algorithm calculates the distance between each input graph and each cluster center.

num_path_cuts

Integer number of sections to cut each graph into for shape comparison

max_iters

Maximum number of iterations to allow the K-means algorithm to run

gamma

Parameter for outliers

num_graphs

Number of graphs to use to create the cluster template. "All" uses all available graphs. An integer uses a random sample of that many graphs.
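As a rough illustration of how writer_indices, max_edges, num_graphs, and graphs_seed might act on the input, the sketch below reproduces each step with base R. It rests on assumptions rather than the package's internals: the file names and edge counts are made up, and the writer ID is assumed to be read as a substring of the file name.

```r
# Hypothetical file names; only the position of the writer ID matters here.
files <- c("w0001_s01_pWOZ_r01.png", "w0002_s01_pWOZ_r01.png")

# writer_indices: start and end position of the writer ID in the file name.
writer_indices <- c(2, 5)
writers <- substr(files, writer_indices[1], writer_indices[2])
writers  # "0001" "0002"

# max_edges: graphs with more edges than the maximum are ignored.
edge_counts <- c(3, 12, 45, 30, 31)  # hypothetical edge count per graph
max_edges <- 30
keep <- edge_counts <= max_edges
graph_ids <- which(keep)  # graphs 1, 2 and 4 remain

# num_graphs / graphs_seed: use every graph, or a seeded random sample.
num_graphs <- 2
graphs_seed <- 200
if (identical(num_graphs, "All")) {
  selected <- graph_ids  # graphs_seed is not used in this branch
} else {
  set.seed(graphs_seed)  # makes the random sample reproducible
  selected <- sample(graph_ids, size = num_graphs)
}
selected
```

Fixing graphs_seed means repeated runs draw the same sample of graphs, which is what makes a template reproducible when num_graphs is an integer.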

Value

List containing the cluster template

Examples

## Not run: 
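# Directory where the template files will be written (placeholder path)
main_dir <- "path/to/folder"
# Example template training images included with the handwriter package
template_images_dir <- system.file("extdata/example_images/template_training_images",
  package = "handwriter"
)
template_list <- make_clustering_templates(
  template_dir = main_dir,
  template_images_dir = template_images_dir,
  writer_indices = c(2, 5), # start and end of the writer ID in each file name
  K = 10, # number of clusters
  num_dist_cores = 2, # cores for the distance calculations
  max_iters = 3, # maximum number of K-means iterations
  num_graphs = 1000, # use a random sample of 1000 graphs
  centers_seed = 100, # seed for selecting starting cluster centers
  graphs_seed = 200 # seed for selecting the sample of graphs
)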

## End(Not run)


handwriter documentation built on Oct. 13, 2023, 5:10 p.m.