In mosclust: Model Order Selection for Clustering

compute.integral

R Documentation

Functions to compute the integral of the ecdf of the similarity values

Description

The function compute.integral computes the integral of the ecdf from a function of class ecdf, which stores the discrete values of the empirical cumulative distribution. The function compute.integral.from.similarity computes the integral of the ecdf from the empirical mean of the similarity values (see the paper cited in the References section for details).
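The relation between the two approaches can be illustrated with base R alone. The sketch below is not the package's implementation: it assumes that the similarity values lie in [0, 1] and that the integral is taken over that interval, in which case the integral of the ecdf equals one minus the empirical mean. The vector `sim` and all variable names are hypothetical.

```r
# Hypothetical similarity values in [0, 1]; not produced by mosclust.
set.seed(1)
sim <- runif(20)

# Step function of class "ecdf" (stats package, base R).
Fn <- ecdf(sim)

# Exact integral of the step function on [0, 1]:
# the ecdf equals i/n between the i-th and (i+1)-th sorted value.
n <- length(sim)
grid <- c(0, sort(sim), 1)
exact.integral <- sum(diff(grid) * (0:n) / n)

# Shortcut through the empirical mean: integral = 1 - mean.
mean.integral <- 1 - mean(sim)

# Numerical quadrature capped at 1000 subintervals; this is only an
# approximation and may differ slightly from the exact value.
numeric.integral <- integrate(Fn, lower = 0, upper = 1,
                              subdivisions = 1000,
                              stop.on.error = FALSE)$value

# The exact integral and the mean-based value agree.
all.equal(exact.integral, mean.integral)
```

The mean-based form needs no quadrature, which is presumably why a separate function working directly on the similarity values is provided.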

Usage

compute.integral(Fun, subdivisions = 1000)

compute.integral.from.similarity(sim.matrix)

Arguments

`Fun`	Function of class ecdf that stores the discrete values of the empirical cumulative distribution
`subdivisions`	maximum number of subintervals used by the integration process
`sim.matrix`	a matrix that stores the similarity between pairs of clusterings across multiple numbers of clusters and multiple clusterings performed on subsamples of the original data. The number of rows equals the number of different numbers of clusters considered; the number of columns equals the number of subsamples considered for each number of clusters.
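To make the expected layout of `sim.matrix` concrete, here is a minimal sketch with made-up data. It assumes similarities in [0, 1] and uses the identity "integral of the ecdf on [0, 1] = 1 - mean" to obtain one value per row. The matrix contents, the row labels and the variable names are hypothetical; this is not the package's code.

```r
# Hypothetical layout: one row per number of clusters (k = 2..10),
# one column per pair of subsampled clusterings (20 here).
set.seed(2)
k.values <- 2:10
n.sub <- 20
sim.matrix <- matrix(runif(length(k.values) * n.sub),
                     nrow = length(k.values), ncol = n.sub,
                     dimnames = list(paste0("k=", k.values), NULL))

# One integral per row, assuming integral = 1 - mean(similarities).
row.integrals <- 1 - rowMeans(sim.matrix)

# The result is a vector with one entry per number of clusters.
length(row.integrals) == nrow(sim.matrix)
```

Under this assumption a smaller value corresponds to a higher mean similarity for that number of clusters.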

Value

The function compute.integral returns the value of the estimated integral.

The function compute.integral.from.similarity returns a vector with the values of the estimated integrals (one for each row of sim.matrix).

Author(s)

Giorgio Valentini valentini@di.unimi.it

References

A. Bertoni, G. Valentini, Discovering significant structures in clustered data through Bernstein inequality, CISI '06, Conferenza Italiana Sistemi Intelligenti, Ancona, Italia, 2006.

Examples

library("clusterv")
library("mosclust")
# Data set generation
M <- generate.sample6(n=20, m=10, dim=1000, d=3, s=0.2)
# Generation of multiple similarity measures by resampling
Sr.kmeans.sample6 <- do.similarity.resampling(M, c=10, nsub=20, f=0.8, s=sFM,
                                      alg.clust.sim=Kmeans.sim.resampling)
# Computation of multiple ecdfs (from 2 to 10 clusters)
list.F <- compute.cumulative.multiple(Sr.kmeans.sample6)
# Integral of the ecdf with 2 clusters (first element of the list)
compute.integral(list.F[[1]])
# Integral of the ecdf with 8 clusters (seventh element of the list)
compute.integral(list.F[[7]])
# Integrals of the ecdfs from 2 to 10 clusters, one per row
compute.integral.from.similarity(Sr.kmeans.sample6)

mosclust documentation built on June 8, 2025, 11:23 a.m.