bn: Computes Bn Statistic.


Description

Returns the value of the Bn statistic, which measures the degree of separation between two groups. The statistic is based on the difference between the average between-group distances and the average within-group distances. Large values of Bn indicate strong group separation. Under overall sample homogeneity, E(Bn) = 0.
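The contrast that the description refers to can be sketched in base R. This is only an illustration under assumptions: the simulated data, the variable names, and the simple unweighted difference are not taken from the package, and the exact weighting used by bn is given in the papers cited under Details.

```r
# Illustrative only: contrasts average between-group and within-group
# squared Euclidean distances. This is NOT the exact Bn formula.
set.seed(1)
x <- rbind(matrix(rnorm(5 * 10), ncol = 10),            # group 0
           matrix(rnorm(5 * 10, mean = 3), ncol = 10))  # group 1, shifted mean
group_id <- rep(c(0, 1), each = 5)

md <- as.matrix(dist(x))^2   # squared Euclidean distances between all rows

g0 <- group_id == 0
g1 <- group_id == 1

# Average distance over distinct pairs within each group
md0 <- md[g0, g0]
md1 <- md[g1, g1]
within0 <- mean(md0[upper.tri(md0)])
within1 <- mean(md1[upper.tri(md1)])

# Average distance over all pairs with one point in each group
between <- mean(md[g0, g1])

# Between-group average minus the mean of the two within-group averages
separation <- between - (within0 + within1) / 2
print(separation)   # expected to be positive for well-separated groups
```

If both groups were drawn from the same distribution, the between-group and within-group averages would be close, and the difference would fluctuate around zero.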

Usage

bn(group_id, md = NULL, data = NULL)

Arguments

group_id

A vector of 0s and 1s indicating to which group the samples belong. Must be in the same order as data or md.

md

Matrix of distances between all data points.

data

Data matrix. Each row represents an observation.

Details

Provide either data or md, not both. If data is supplied, Bn is computed using squared Euclidean distances, which is consistent with is_homo, uclust and uhclust.
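As a rough sketch of what supplying md involves, the squared Euclidean distance matrix can be built from a data matrix with base R. The data, the variable names, and the group assignment below are made up for illustration, and the call to bn is skipped unless the function is already available in the session.

```r
set.seed(2)
x <- matrix(rnorm(6 * 4), ncol = 4)   # 6 observations, 4 variables

# Squared Euclidean distance matrix between all rows of x
md <- as.matrix(dist(x))^2

# The same quantity computed by hand for one pair, as a sanity check
d12 <- sum((x[1, ] - x[2, ])^2)
print(isTRUE(all.equal(md[1, 2], d12)))

group_id <- c(0, 0, 0, 1, 1, 1)       # must follow the row order of x and md

# Runs only if bn is available, e.g. after loading the package
if (exists("bn", mode = "function")) {
  print(bn(group_id, data = x))       # distances computed internally
  print(bn(group_id, md = md))        # distances supplied by the caller
}
```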

For more details, see Cybis, Gabriela B., Marcio Valk, and Sílvia RC Lopes. "Clustering and classification problems in genetics through U-statistics." Journal of Statistical Computation and Simulation 88.10 (2018); and Valk, Marcio, and Gabriela Bettella Cybis. "U-statistical inference for hierarchical clustering." arXiv preprint arXiv:1805.12179 (2018).

Value

Value of the Bn statistic.

Examples

n <- 5
x <- matrix(rnorm(n * 10), ncol = 10)   # 5 observations, 10 variables
bn(c(1, 0, 0, 0, 0), data = x)          # option (a): enter the data matrix directly
md <- as.matrix(dist(x))^2              # squared Euclidean distance matrix
bn(c(0, 1, 1, 1, 1), md)                # option (b): enter the distance matrix

gcybis/Uclust documentation built on May 8, 2019, 1:20 p.m.