This function runs hierarchical clustering using one of five linkage methods: single linkage, complete linkage, average linkage, centroid linkage and minimax linkage. For a data set with $n$ items, the resulting hierarchy yields clusterings with 1 through $n$ clusters. For each number of clusters, the function computes four evaluation metrics: 1. maximum minimax radius (see Bien et al. 2011), 2. misclassification rate, 3. precision, 4. recall.
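The per-clustering evaluation described above can be sketched in base R. This is only an illustration, not the package's implementation: it assumes pairwise definitions of the metrics (a pair counts as a predicted match when both items land in the same cluster), uses hypothetical toy data and column names, substitutes complete linkage because base `hclust` does not offer minimax linkage, and omits the maximum minimax radius.

```r
# Hypothetical toy data: 4 items; pairs (1,2) and (3,4) are true matches.
pairs <- data.frame(
  item1  = c(1, 1, 1, 2, 2, 3),
  item2  = c(2, 3, 4, 3, 4, 4),
  match  = c(1, 0, 0, 0, 0, 1),
  l2dist = c(0.1, 0.9, 0.8, 0.7, 0.85, 0.2)
)
n <- 4

# Fill a symmetric distance matrix from the pairwise rows.
d <- matrix(0, n, n)
d[cbind(pairs$item1, pairs$item2)] <- pairs$l2dist
d[cbind(pairs$item2, pairs$item1)] <- pairs$l2dist

# Complete linkage stands in for the linkage of interest.
hc <- hclust(as.dist(d), method = "complete")

# Assumed pairwise metrics for the clustering with k clusters.
pairMetrics <- function(k) {
  labels <- cutree(hc, k = k)
  pred   <- labels[pairs$item1] == labels[pairs$item2]
  truth  <- pairs$match == 1
  tp     <- sum(pred & truth)
  data.frame(
    k         = k,
    misClass  = mean(pred != truth),
    # Precision is undefined when no pair is predicted as a match.
    precision = if (sum(pred) > 0) tp / sum(pred) else NA_real_,
    recall    = tp / sum(truth)
  )
}

# One row per clustering, for k = 1 through n clusters.
sketchMetrics <- do.call(rbind, lapply(1:n, pairMetrics))
print(sketchMetrics)
```

Under these assumptions the result has one row per number of clusters, which mirrors the shape of the output described for this function.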
getMetrics(
  allPairwise,
  pairColNums,
  matchColNum,
  distSimCol,
  linkage,
  myDist = TRUE
)
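A call following the usage above might look like the sketch below. The toy data frame and its column names are hypothetical, and passing the data frame object itself (rather than its name as a string) is an assumption. The call is guarded so the snippet still runs when the package providing `getMetrics` is not loaded.

```r
# Hypothetical toy input: all 6 pairs among 4 items.
items <- combn(4, 2)  # 2 x 6 matrix, one column per pair
allPairwise <- data.frame(
  item1  = items[1, ],
  item2  = items[2, ],
  match  = c(1, 0, 0, 0, 0, 1),           # true match/non-match status
  l2dist = c(0.1, 0.9, 0.8, 0.7, 0.85, 0.2)  # illustrative distances
)

# Only call getMetrics if the package providing it has been loaded.
if (exists("getMetrics")) {
  outMetrics <- getMetrics(
    allPairwise = allPairwise,
    pairColNums = c(1, 2),    # columns holding item 1 and item 2
    matchColNum = 3,          # column holding match status
    distSimCol  = "l2dist",   # name of the distance column
    linkage     = "minimax",
    myDist      = TRUE        # 'l2dist' is a distance, not a similarity
  )
}
```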
allPairwise
name of data frame containing all pairwise comparisons. This needs to have at least four columns: one for the first item in the comparison, one for the second item, one for the true match/non-match status, and one for a distance or similarity metric. These columns are identified by the next three parameters.
pairColNums
vector of length 2 giving the column numbers in 'allPairwise' of 1. item 1 in the comparison, 2. item 2 in the comparison
matchColNum
column number of the column in 'allPairwise' indicating true match/non-match status
distSimCol
name of the column in 'allPairwise' containing distances or similarities, given as a character string, e.g. "l2dist". If this is a similarity and not a distance, set the 'myDist' parameter to FALSE. If a similarity measure is used, distance is calculated as 1 - similarity.
linkage
one of "single", "complete", "average", "centroid", "minimax"
myDist
is 'distSimCol' a distance or a similarity measure? Default TRUE, i.e. a distance measure
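The conversion applied when 'myDist' is FALSE can be illustrated directly. The similarity values and the column name `cosSim` below are hypothetical; the only fact taken from the documentation is the rule distance = 1 - similarity.

```r
# Hypothetical similarity scores in [0, 1] for three pairs.
simPairs <- data.frame(
  item1  = c(1, 1, 2),
  item2  = c(2, 3, 3),
  cosSim = c(0.9, 0.1, 0.25)
)

# Distance as described for myDist = FALSE: 1 - similarity.
simPairs$asDistance <- 1 - simPairs$cosSim

# Highly similar pairs end up with small distances.
print(simPairs)
```

Note that this rule presumes similarities bounded above by 1; a similarity on another scale would give negative or otherwise unsuitable distances.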
outMetrics, a data frame with each row representing a clustering. For a data set with $n$ items, there will be $n$ rows. Columns are the four evaluation metrics, 'maxMinimax', 'misClass', 'precision' and 'recall'.