| interval_distance | R Documentation |
Functions to compute various distance measures between interval-valued observations.
int_dist_all computes all available distance measures at once.
int_dist(x, method = "euclidean", gamma = 0.5, q = 1, p = 2, ...)
int_dist_matrix(x, method = "euclidean", gamma = 0.5, q = 1, p = 2, ...)
int_pairwise_dist(x, var_name1, var_name2, method = "euclidean", ...)
int_dist_all(x, gamma = 0.5, q = 1)
| x | interval-valued data with symbolic_tbl class, or an array of dimension [n, p, 2] |
| method | distance method: "GD", "IY", "L1", "L2", "CB", "HD", "EHD", "nEHD", "snEHD", "TD", "WD", "euclidean", "hausdorff", "manhattan", "city_block", "minkowski", "wasserstein", "ichino", "de_carvalho" |
| gamma | parameter for the Ichino-Yaguchi distance, 0 <= gamma <= 0.5 (default: 0.5) |
| q | parameter for the Ichino-Yaguchi distance (Minkowski exponent) (default: 1) |
| p | power parameter for Minkowski distance (default: 2) |
| ... | additional parameters |
| var_name1 | first variable name or column location |
| var_name2 | second variable name or column location |
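The array form of x can be pictured with a small base-R sketch. It assumes, based on the data used in the examples on this page, that slice [, , 1] holds the lower bounds and slice [, , 2] the upper bounds; the names centers and half_ranges are illustrative and not part of the package.

```r
# 4 concepts (rows), 3 interval-valued variables (columns)
x <- array(NA_real_, dim = c(4, 3, 2))

# Assumption: slice 1 = lower bounds, slice 2 = upper bounds
x[, , 1] <- matrix(c(1, 2, 3, 4,  5, 6, 7, 8,  9, 10, 11, 12), nrow = 4)
x[, , 2] <- matrix(c(3, 5, 6, 7,  8, 9, 10, 12,  13, 15, 16, 18), nrow = 4)

# Sanity check: no lower bound may exceed its upper bound
stopifnot(all(x[, , 1] <= x[, , 2]))

# Midpoint and half-range of every interval, each a 4 x 3 matrix
centers     <- (x[, , 1] + x[, , 2]) / 2
half_ranges <- (x[, , 2] - x[, , 1]) / 2
```

Many of the distances listed on this page can be written in terms of these two matrices or of the bounds themselves.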
Available distance methods:
GD: Gowda-Diday distance (Gowda & Diday, 1991)
IY: Ichino-Yaguchi distance (Ichino, 1988)
L1: L1 (Manhattan) distance on interval midpoints
L2: L2 (Euclidean) distance on interval midpoints
CB: City-Block distance (Souza & de Carvalho, 2004)
HD: Hausdorff distance (Chavent & Lechevallier, 2002)
EHD: Euclidean Hausdorff distance
nEHD: Normalized Euclidean Hausdorff distance
snEHD: Span Normalized Euclidean Hausdorff distance
TD: Tran-Duckstein distance (Tran & Duckstein, 2002)
WD: L2-Wasserstein distance (Verde & Irpino, 2008)
euclidean: Euclidean distance on interval centers (same as L2)
hausdorff: Hausdorff distance (same as HD)
manhattan: Manhattan distance (same as L1)
city_block: City-block distance (same as CB)
minkowski: Minkowski distance with parameter p
wasserstein: Wasserstein distance (same as WD)
ichino: Ichino-Yaguchi distance (simplified version)
de_carvalho: De Carvalho distance
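To make a few of the measures above concrete, here is a minimal base-R sketch of their textbook forms for two concepts. It is not the package's implementation: the helper names (center_dist, hausdorff_dist, euclid_hausdorff_dist, wasserstein_dist) and the data are hypothetical, and int_dist may normalize or aggregate differently, so its results need not match these values.

```r
# Two concepts (rows) described by two interval variables (columns).
# Concept 1: [1, 3] and [2, 5]; concept 2: [4, 8] and [6, 7].
lower <- matrix(c(1, 2,
                  4, 6), nrow = 2, byrow = TRUE)
upper <- matrix(c(3, 5,
                  8, 7), nrow = 2, byrow = TRUE)

# Minkowski distance on interval midpoints (p = 1: Manhattan, p = 2: Euclidean)
center_dist <- function(lower, upper, i, k, p = 2) {
  ci <- (lower[i, ] + upper[i, ]) / 2
  ck <- (lower[k, ] + upper[k, ]) / 2
  sum(abs(ci - ck)^p)^(1 / p)
}

# Hausdorff distance: per variable, the larger of the two bound differences,
# summed over variables
hausdorff_dist <- function(lower, upper, i, k) {
  sum(pmax(abs(lower[i, ] - lower[k, ]), abs(upper[i, ] - upper[k, ])))
}

# Euclidean Hausdorff distance: same per-variable terms, combined in L2 fashion
euclid_hausdorff_dist <- function(lower, upper, i, k) {
  sqrt(sum(pmax(abs(lower[i, ] - lower[k, ]), abs(upper[i, ] - upper[k, ]))^2))
}

# L2-Wasserstein distance, assuming a uniform distribution inside each interval:
# squared midpoint difference plus one third of the squared half-range difference
wasserstein_dist <- function(lower, upper, i, k) {
  c_diff <- ((lower[i, ] + upper[i, ]) - (lower[k, ] + upper[k, ])) / 2
  r_diff <- ((upper[i, ] - lower[i, ]) - (upper[k, ] - lower[k, ])) / 2
  sqrt(sum(c_diff^2 + r_diff^2 / 3))
}

center_dist(lower, upper, 1, 2, p = 2)     # 5
center_dist(lower, upper, 1, 2, p = 1)     # 7
hausdorff_dist(lower, upper, 1, 2)         # 9
euclid_hausdorff_dist(lower, upper, 1, 2)  # sqrt(41)
wasserstein_dist(lower, upper, 1, 2)       # sqrt(77 / 3)
```

The midpoint-based measures ignore interval width entirely, whereas the Hausdorff and Wasserstein forms also react to differences in the spread of the intervals.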
A distance matrix (class 'dist') or numeric vector
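An object of class 'dist' stores only the lower triangle of the distance matrix. The sketch below shows how such an object can be inspected; it builds one with stats::dist on made-up interval midpoints rather than with int_dist, so the data and labels are purely illustrative.

```r
# Hypothetical midpoints of three concepts on two variables
centers <- matrix(c(2, 3.5,
                    6, 6.5,
                    2, 6.5), nrow = 3, byrow = TRUE,
                  dimnames = list(c("A", "B", "C"), c("v1", "v2")))

d <- dist(centers, method = "euclidean")  # object of class 'dist'
length(d)                                 # 3 pairwise distances for 3 concepts

# Expand to a full symmetric matrix to look up individual pairs by label
m <- as.matrix(d)
m["A", "B"]                               # 5
m["A", "C"]                               # 3
m["B", "C"]                               # 4
```

A 'dist' object can also be passed directly to functions such as hclust() or cmdscale().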
Han-Ming Wu
Gowda, K. C., & Diday, E. (1991). Symbolic clustering using a new dissimilarity measure. Pattern Recognition, 24(6), 567-578.
Ichino, M. (1988). General metrics for mixed features. Systems and Computers in Japan, 19(2), 37-50.
Chavent, M., & Lechevallier, Y. (2002). Dynamical clustering of interval data. In Classification, Clustering and Data Analysis (pp. 53-60). Springer.
Tran, L., & Duckstein, L. (2002). Comparison of fuzzy numbers using a fuzzy distance measure. Fuzzy Sets and Systems, 130, 331-341.
Verde, R., & Irpino, A. (2008). A new interval data distance based on the Wasserstein metric.
Kao, C.-H. et al. (2014). Exploratory data analysis of interval-valued symbolic data with matrix visualization. Computational Statistics & Data Analysis, 79, 14-29.
int_dist_matrix, int_dist_all, int_pairwise_dist
# Using symbolic_tbl format
data(mushroom.int)
d1 <- int_dist(mushroom.int[, 3:4], method = "euclidean")
d2 <- int_dist(mushroom.int[, 3:4], method = "hausdorff")
d3 <- int_dist(mushroom.int[, 3:4], method = "GD")
# Using array format: 4 concepts, 3 variables
x <- array(NA, dim = c(4, 3, 2))
x[,,1] <- matrix(c(1,2,3,4, 5,6,7,8, 9,10,11,12), nrow=4)
x[,,2] <- matrix(c(3,5,6,7, 8,9,10,12, 13,15,16,18), nrow=4)
d4 <- int_dist(x, method = "snEHD")
d5 <- int_dist(x, method = "IY", gamma = 0.3)