mapper.sta | R Documentation |
This function is adapted from the mapper function of TDAmapper, with different clustering methods (mainly k-means).
mapper.sta(
  dat,
  filter_values,
  num_intervals,
  percent_overlap,
  dist_method = "euclidean",
  cluster_method = "kmeans",
  NbClust_cluster_method = "kmeans",
  num_bins_when_clustering = 10,
  cluster_index = "all",
  n_class = 0,
  eps = 0.15,
  minPts = 5,
  permute_interval_level = FALSE,
  ...
)
dat |
Matrix or dataset where rows are data points and columns are predictive variables. |
filter_values |
An n x m data frame of real numbers returned by the filter functions. |
dist_method |
The distance measure used to compute the dissimilarity matrix. By default, dist_method = "euclidean". This must be one of: "euclidean", "maximum", "manhattan", "canberra", "binary", "minkowski" or "NULL". Details can be found in NbClust. |
cluster_method |
Clustering method. This should be one of: "hierarchical", "kmeans", "dbscan", "hdbscan". |
NbClust_cluster_method |
The cluster analysis method to be used. This should be one of: "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median", "centroid", "kmeans". Details can be found in NbClust. |
num_bins_when_clustering |
For hierarchical clustering. A positive integer that controls whether points in the same level set end up in the same cluster. |
cluster_index |
The index to be calculated. This should be one of:
"kl", "ch", "hartigan", "ccc", "scott", "marriot", "trcovw", "tracew",
"friedman", "rubin", "cindex", "db", "silhouette", "duda", "pseudot2",
"beale", "ratkowsky", "ball", "ptbiserial", "gap", "frey", "mcclain",
"gamma", "gplus", "tau", "dunn", "hubert", "sdindex", "dindex", "sdbw",
"all" (all indices except Gap, Gamma, Gplus and Tau), "alllong" (all
indices, with Gap, Gamma, Gplus and Tau included). Details can be found in NbClust. |
n_class |
Number of clusters for k-means. By default, n_class = 0. If
n_class > 0, this function will instead call |
eps |
For DBSCAN, the size of the epsilon neighborhood. |
minPts |
For DBSCAN and HDBSCAN, the minimum number of points in the eps region for core points. Default is 5. |
permute_interval_level |
Logical. TRUE if samples within each interval are to be permuted. |
... |
Further arguments for |
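The arguments num_intervals and percent_overlap are not described in the table above. As a rough illustration of what they control, the following base R sketch builds an overlapping cover of the filter range. It assumes the cover is built as in the 1-D construction of TDAmapper, which has not been checked against the source of mapper.sta; the helper name cover_intervals and the simulated filter values are hypothetical.

```r
# Hypothetical helper (not part of the package): overlapping cover of the
# filter range, assuming the 1-D construction used by TDAmapper.
cover_intervals <- function(filter_values, num_intervals, percent_overlap) {
  f_min <- min(filter_values)
  f_max <- max(filter_values)
  overlap <- percent_overlap / 100
  # Interval length chosen so that num_intervals overlapping intervals
  # span the whole range [f_min, f_max].
  len <- (f_max - f_min) / (num_intervals - (num_intervals - 1) * overlap)
  # Distance between the left endpoints of consecutive intervals.
  step <- len * (1 - overlap)
  lower <- f_min + (seq_len(num_intervals) - 1) * step
  data.frame(lower = lower, upper = lower + len)
}

set.seed(1)
f <- runif(100)  # simulated filter values, for illustration only
cover <- cover_intervals(f, num_intervals = 10, percent_overlap = 70)

# Indices of the points falling in each interval (the level sets).
level_sets <- lapply(seq_len(nrow(cover)), function(i) {
  which(f >= cover$lower[i] & f <= cover$upper[i])
})
```

With a 70 percent overlap, most points fall into several level sets, which is what lets clusters from neighbouring intervals share points and become connected vertices.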
This function is adapted from the mapper function of TDAmapper by replacing its clustering method with the NbClust function from the R package NbClust.
The advantage of NbClust is that it provides users with 8 different cluster methods, 6 different distance measures and 30 indices for determining the number of clusters. This allows users to select the best clustering scheme from the different results obtained by varying all combinations of number of clusters, distance measures, and clustering methods. Details of the distance measures, clustering methods and cluster indices can be found in NbClust.
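To make the idea of index-based selection concrete, here is a minimal base R sketch that picks the number of k-means clusters by maximising the Calinski-Harabasz index (the "ch" option above). It is a hand-rolled stand-in for illustration, not the NbClust implementation and not what mapper.sta does internally; the simulated data and the names ch_index and best_k are assumptions.

```r
# Simulated data: three well-separated groups in two dimensions.
set.seed(42)
centers <- rbind(c(0, 0), c(10, 0), c(0, 10))
x <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(50, centers[i, 1], 0.5), rnorm(50, centers[i, 2], 0.5))
}))

# Calinski-Harabasz index for a k-means fit:
# (between-cluster SS / (k - 1)) / (within-cluster SS / (n - k)).
ch_index <- function(fit, n) {
  k <- length(fit$size)
  (fit$betweenss / (k - 1)) / (fit$tot.withinss / (n - k))
}

# Fit k-means for each candidate k and keep the k with the largest index.
candidate_k <- 2:6
scores <- sapply(candidate_k, function(k) {
  ch_index(kmeans(x, centers = k, nstart = 20), nrow(x))
})
best_k <- candidate_k[which.max(scores)]
```

NbClust automates this kind of comparison across many indices, distance measures and clustering methods at once, which is the reason given above for using it.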
An object of class TDAmapper, which is a list of items named adjacency (adjacency matrix for the edges), num_vertices (integer number of vertices), level_of_vertex (vector with level_of_vertex[i] = index of the level set for vertex i), points_in_vertex (list with points_in_vertex[[i]] = vector of indices of points in vertex i), points_in_level (list with points_in_level[[i]] = vector of indices of points in level set i), and vertices_in_level (list with vertices_in_level[[i]] = vector of indices of vertices in level set i).
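The sketch below shows one way to work with a list of this shape in base R. The object m is built by hand with made-up values, purely to illustrate the documented field names; a real object would come from a call to mapper.sta.

```r
# Toy object with the documented fields (values invented for illustration).
m <- list(
  adjacency = matrix(c(0, 1, 0,
                       1, 0, 1,
                       0, 1, 0), nrow = 3, byrow = TRUE),
  num_vertices = 3L,
  level_of_vertex = c(1L, 2L, 3L),
  points_in_vertex = list(c(1, 2, 3), c(3, 4), c(4, 5, 6)),
  points_in_level = list(c(1, 2, 3), c(3, 4), c(4, 5, 6)),
  vertices_in_level = list(1L, 2L, 3L)
)
class(m) <- "TDAmapper"

# Number of data points in each vertex (useful as a node size when plotting).
vertex_size <- lengths(m$points_in_vertex)

# Edge list: one row per pair of connected vertices (upper triangle only,
# so each undirected edge appears once).
edges <- which(m$adjacency == 1 & upper.tri(m$adjacency), arr.ind = TRUE)

# Number of data points shared by the two vertices of each edge.
shared <- apply(edges, 1, function(e) {
  length(intersect(m$points_in_vertex[[e[1]]], m$points_in_vertex[[e[2]]]))
})
```

The adjacency matrix can also be handed to a graph package for plotting; that step is left out here to keep the example free of extra dependencies.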
Malika Charrad, Nadia Ghazzali, Veronique Boiteau, Azam Niknafs (2014). NbClust: An R Package for Determining the Relevant Number of Clusters in a Data Set. Journal of Statistical Software, 61(6), 1-36. URL http://www.jstatsoft.org/v61/i06/.
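# Generate an example data set
tp_data <- chicken_generator(1)

# Run mapper with Y as the filter and columns 2 to 4 as the data
tp_data_mapper <- mapper.sta(
  dat = tp_data[, 2:4],
  filter_values = tp_data$Y,
  num_intervals = 10,
  percent_overlap = 70
)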