silhouette_plot: Return the silhouette index for clustered peaks


View source: R/silhouette_plot.R

Description

Computes the silhouette index for peaks stored in a GRanges object and classified with the cluster_peak method. If both classifications (with and without alignment) are provided, the index is computed for each of them.

Usage

silhouette_plot(object, p = 1, 
        weight = NULL, alpha = 1, 
        rescale = FALSE, t.max = 0.5)

Arguments

object

GRanges object. It must contain the metadata columns associated with the classification to be analyzed. Specifically, it must contain the cluster_NOshift metadata column to compute the silhouette index for the non-aligned peaks, and/or the cluster_shift metadata column to compute the index for the classification with alignment.

p

integer value in {0, 1, 2}. Order of the L^p distance used: p = 0 stands for the L^{∞} distance, p = 1 for L^1 and p = 2 for L^2. Default is 1.

weight

real. Weight w of the distance function (see Details for the definition of the distance function), needed to make the distance between splines and the distance between derivatives comparable. It has no meaningful default (the usage default is NULL), since it must be the same weight used to define the distance for the classification.

alpha

real value between 0 and 1. Convex weight α of the distance, which balances the distance between data and the distance between derivatives. See Details for the definition. Default is 1.

t.max

real value. It tunes the maximum shift allowed. In particular the maximum shift at each iteration is computed as

max_shift = t.max * range(object)

and the optimal registration coefficient is searched for between -max_shift and +max_shift. range(object) is the maximum amplitude of the peaks. Default is 0.5.

rescale

logical. If TRUE clustering is performed on scaled peaks. For the definition of scaled peaks see smooth_peak.
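To illustrate how the value of p selects the distance, here is a minimal sketch of a discretized L^p distance between two curves sampled on the same evenly spaced grid. The helper lp_distance and its step argument are hypothetical and not part of FunChIP. The sketch covers only the data term, ignores the derivative term weighted by alpha and weight, and may differ from the normalization the package actually uses.

```r
# Hypothetical helper (not part of FunChIP): discretized L^p distance
# between two curves f and g sampled on the same evenly spaced grid.
# p follows the convention of this page: 0 = L^infinity, 1 = L^1, 2 = L^2.
lp_distance <- function(f, g, p = 1, step = 1) {
  stopifnot(length(f) == length(g), p %in% c(0, 1, 2))
  d <- abs(f - g)
  if (p == 0) {
    max(d)                  # L^infinity: largest pointwise gap
  } else if (p == 1) {
    sum(d) * step           # L^1: Riemann sum of |f - g|
  } else {
    sqrt(sum(d^2) * step)   # L^2: square root of the Riemann sum of (f - g)^2
  }
}

# Two toy curves on a grid with unit spacing
f <- c(0, 1, 3, 1, 0)
g <- c(0, 2, 2, 2, 0)

d_inf <- lp_distance(f, g, p = 0)
d_1   <- lp_distance(f, g, p = 1)
d_2   <- lp_distance(f, g, p = 2)
```

The three calls give different values for the same pair of curves, which is why the p passed to silhouette_plot should presumably match the one used when the peaks were clustered.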

Details

See [Rousseeuw, 1987] for the detailed definition of the index. Specifically, for peak i it is computed as

s(i) = \frac{b(i)-a(i)}{\max(a(i), b(i))}

with a(i) the average dissimilarity of peak i with all other data within the same cluster and b(i) the lowest average dissimilarity of i to any other cluster, of which i is not a member.
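The definition can be made concrete with a short base R sketch that computes the index from a precomputed dissimilarity matrix. It follows Rousseeuw's convention, s(i) = (b(i) - a(i)) / max(a(i), b(i)), in which values close to 1 indicate a well-clustered observation. The helper silhouette_index is hypothetical and is not the FunChIP implementation; it uses plain Euclidean distances between numbers rather than distances between peaks, and it assumes at least two clusters.

```r
# Hypothetical helper (not part of FunChIP): silhouette index computed from
# a symmetric dissimilarity matrix D and a vector of cluster labels cl.
silhouette_index <- function(D, cl) {
  n <- length(cl)
  stopifnot(nrow(D) == n, ncol(D) == n, length(unique(cl)) >= 2)
  vapply(seq_len(n), function(i) {
    same <- setdiff(which(cl == cl[i]), i)
    if (length(same) == 0) return(0)  # singleton cluster: s(i) = 0 by convention
    # a(i): average dissimilarity to the other members of i's own cluster
    a <- mean(D[i, same])
    # b(i): lowest average dissimilarity to any cluster i does not belong to
    others <- setdiff(unique(cl), cl[i])
    b <- min(vapply(others, function(k) mean(D[i, cl == k]), numeric(1)))
    (b - a) / max(a, b)
  }, numeric(1))
}

# Toy data: two well-separated groups on the real line
x  <- c(0, 1, 10, 11)
D  <- as.matrix(dist(x))
cl <- c(1, 1, 2, 2)
s  <- silhouette_index(D, cl)
```

For the first point, a(1) = 1 and b(1) = 10.5, so s(1) = 9.5 / 10.5, close to 1 as expected for well-separated clusters.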

Value

The function returns

Author(s)

Alice Parodi, Marco J. Morelli, Laura M. Sangalli, Piercesare Secchi, Simone Vantini

References

Peter J. Rousseeuw (1987). Silhouettes: a Graphical Aid to the Interpretation and Validation of Cluster Analysis. Journal of Computational and Applied Mathematics. 20: 53-65.

Examples

# load the data
data(peaks)

# computes the silhouette index and 
# shows the graph
sil <- silhouette_plot(peaks.data.cluster, p=2, weight = 1, alpha = 1,
                         rescale = FALSE, t.max = 2)

FunChIP documentation built on Nov. 8, 2020, 4:50 p.m.