create_clusterball_mapper_object: ClusterBall Mapper

View source: R/baskin_robbins.R

create_clusterball_mapper_object    R Documentation

ClusterBall Mapper

Description

Run Ball Mapper, but cluster non-trivially within the balls. You can use two different distance matrices: one for the balling and one for the clustering.
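
To make the two stages concrete, here is a minimal sketch of how balling with one distance matrix and then restricting a second distance matrix to each ball could look. The helper greedy_ball_cover is hypothetical and is not part of mappeR; the greedy center selection and the closed balls (distance <= eps) are assumptions, and the package may build its cover differently.

```r
# Sketch only: a greedy ball cover, not necessarily what mappeR does internally.
greedy_ball_cover <- function(dists, eps) {
  m <- as.matrix(dists)              # accept a dist object or a plain matrix
  n <- nrow(m)
  covered <- rep(FALSE, n)
  centers <- integer(0)
  for (i in seq_len(n)) {
    if (!covered[i]) {
      centers <- c(centers, i)       # the first uncovered point becomes a center
      covered <- covered | (m[i, ] <= eps)
    }
  }
  # Each ball holds the indices of all points within eps of its center
  lapply(centers, function(center) unname(which(m[center, ] <= eps)))
}

pts <- data.frame(x = cos(1:100), y = sin(1:100))

# Stage 1: ball the data with the first distance matrix
balls <- greedy_ball_cover(dist(pts), eps = 1)

# Stage 2: restrict the second distance matrix to each ball, ready for clustering
dist2 <- as.matrix(dist(pts, method = "manhattan"))
ball_dists <- lapply(balls, function(idx) dist2[idx, idx, drop = FALSE])
```

Balls may overlap, and those overlaps are what produce edges in the resulting graph.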

Usage

create_clusterball_mapper_object(
  data,
  dist1,
  dist2,
  eps,
  clusterer = local_hierarchical_clusterer("single")
)

Arguments

data

A data frame.

dist1

A distance matrix for the data frame; this will be used to ball the data. It can be a dist object or a matrix.

dist2

Another distance matrix for the data frame; this will be used to cluster the data after balling. It can be a dist object or a matrix.

eps

A positive real number for the desired ball radius.

clusterer

A function which accepts a list of distance matrices as input, and returns the results of clustering done on each distance matrix; that is, it should return a list of named vectors, whose names are the names of data points and whose values are cluster assignments (integers). If this value is omitted, then single-linkage clustering will be done (and a cutting height will be decided for you).
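
As an illustration of the clusterer contract described above, the following is a minimal sketch of a custom clusterer. The name fixed_height_clusterer is hypothetical and not part of mappeR. It assumes each list element can be coerced with as.matrix (a dist object or a square matrix), and it assumes cluster IDs only need to be meaningful within a single distance matrix; check the package source if IDs must be unique across balls.

```r
# Hypothetical clusterer factory: complete-linkage clustering cut at a fixed height.
fixed_height_clusterer <- function(height) {
  function(dist_mats) {
    lapply(dist_mats, function(d) {
      m <- as.matrix(d)
      if (is.null(rownames(m))) {
        # Fall back to positional names when the matrix carries none
        dimnames(m) <- list(as.character(seq_len(nrow(m))), as.character(seq_len(nrow(m))))
      }
      # hclust needs at least two points, so small inputs form a single cluster
      if (nrow(m) < 2) {
        return(setNames(rep(1L, nrow(m)), rownames(m)))
      }
      tree <- hclust(as.dist(m), method = "complete")
      cutree(tree, h = height)       # named integer vector of cluster assignments
    })
  }
}

# Three points on a line: "a" and "b" are close together, "c" is far away
toy <- as.matrix(dist(c(a = 0, b = 0.1, c = 5)))
assignments <- fixed_height_clusterer(1)(list(toy))
```

A function built this way could be passed as the clusterer argument in place of the default.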

Value

A list of two data frames, nodes and edges, which contain information about the Mapper graph constructed from the given parameters.

The node data frame consists of:

  • id: vertex ID

  • cluster_size: number of data points in vertex

  • mean_dist_to_medoid: mean distance to medoid of vertex

  • data: names of data points in cluster

  • patch: level set ID

The edge data frame consists of:

  • source: vertex ID of edge source

  • target: vertex ID of edge target

  • weight: Jaccard index of edge; this is the size of the intersection of the two vertices divided by the size of their union

  • overlap_data: names of data points in overlap

  • overlap_size: number of data points in overlap
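
The edge weight can be illustrated with a small worked example. The helper jaccard_index and the two vertex contents below are made up for illustration; they are not taken from the package.

```r
# Jaccard index of two sets of data point names: |A intersect B| / |A union B|
jaccard_index <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

vertex_a <- c("1", "2", "3", "4")
vertex_b <- c("3", "4", "5")

overlap_data <- intersect(vertex_a, vertex_b)   # names of shared data points
overlap_size <- length(overlap_data)            # 2 shared points
weight <- jaccard_index(vertex_a, vertex_b)     # 2 shared out of 5 distinct points
```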

Examples

# Create circle data set (points on the unit circle; no noise is added here)
data = data.frame(x = sapply(1:100, function(x) cos(x)), y = sapply(1:100, function(x) sin(x)))
data.dists = dist(data)

# Set ball radius
eps = 1

# Do single-linkage clustering in the balls to produce Mapper graph
create_clusterball_mapper_object(data, data.dists, data.dists, eps)
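
The example above passes the same distance matrix twice. A possible variation, sketched here, balls with Euclidean distances and clusters with Manhattan distances. The choice of metrics is arbitrary and only for illustration, and the call is guarded so that the snippet still runs when mappeR is not installed.

```r
# Same circle data as above
data <- data.frame(x = cos(1:100), y = sin(1:100))

# Euclidean distances for balling, Manhattan distances for clustering
ball.dists <- dist(data, method = "euclidean")
cluster.dists <- dist(data, method = "manhattan")

# Only call the package function when mappeR is available
if (requireNamespace("mappeR", quietly = TRUE)) {
  mapper <- mappeR::create_clusterball_mapper_object(data, ball.dists, cluster.dists, eps = 1)
  # Show the top-level structure of the returned list
  str(mapper, max.level = 1)
}
```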

mappeR documentation built on June 29, 2025, 1:07 a.m.