GdmDiag: Global Distance Metric Learning

Description

Performs Global Distance Metric Learning (GDM) on the given data, learning a diagonal matrix.

Usage

GdmDiag(data, simi, dism, C0 = 1, S1 = NULL, D1 = NULL,
  threshold = 0.001)

Arguments

data

n * d data matrix. n is the number of data points, d is the dimension of the data. Each data point is a row in the matrix.

simi

Two-column matrix describing the similar constraints, one row per similar pair. Each row holds the row numbers of a similar pair in the original data. For example, the pair (1, 3) means that the first observation is similar to the third observation in the original data.

dism

Two-column matrix describing the dissimilar constraints, in the same format as simi. Each row holds the row numbers of a dissimilar pair in the original data.

C0

numeric, the bound of the similar constraints.

threshold

numeric, the threshold for stopping the learning iteration.
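
As a minimal sketch of the constraint format described above, the two pair matrices can be derived from class labels in base R. The `labels` vector and the helper name `pairs_from_labels` are illustrative assumptions and are not part of the package.

```r
# Hypothetical helper (not part of the package): build the similar and
# dissimilar pair matrices from a vector of class labels.
pairs_from_labels <- function(labels) {
  # every unordered pair of row numbers (i < j), one pair per row
  all_pairs <- t(combn(seq_along(labels), 2))
  same <- labels[all_pairs[, 1]] == labels[all_pairs[, 2]]
  list(simi = all_pairs[same, , drop = FALSE],
       dism = all_pairs[!same, , drop = FALSE])
}

labels <- c("a", "a", "b", "b", "b")
cons <- pairs_from_labels(labels)
cons$simi   # pairs whose members share a label
cons$dism   # pairs whose members have different labels
```

Each row of `cons$simi` and `cons$dism` is a pair of row numbers into the data matrix, which is the layout the simi and dism arguments expect.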

Value

list of the GdmDiag results:

newData

GdmDiag transformed data

diagonalA

suggested Mahalanobis matrix

dmlA

matrix to transform data, square root of diagonalA

error

the precision of the distance metric obtained by Newton-Raphson optimization

For every two original data points (x1, x2) and their transformed counterparts (y1, y2) in newData, with A = diagonalA and B = dmlA:

(x2 - x1)' * A * (x2 - x1) = || (x2 - x1) * B ||^2 = || y2 - y1 ||^2
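
The identity above can be checked numerically. The sketch below uses a made-up diagonal matrix in place of diagonalA and assumes that the transformed data is the original data multiplied by its square root; it does not call GdmDiag.

```r
# Illustrative check of the identity with a made-up diagonal metric.
set.seed(1)
X <- matrix(rnorm(12), nrow = 4)   # 4 points in 3 dimensions
A <- diag(c(2, 0.5, 1))            # hypothetical stand-in for diagonalA
B <- sqrt(A)                       # for a diagonal matrix, elementwise sqrt is the matrix square root
Y <- X %*% B                       # assumed form of newData

d <- X[2, ] - X[1, ]
lhs <- as.numeric(t(d) %*% A %*% d)   # (x2 - x1)' * A * (x2 - x1)
rhs <- sum((Y[2, ] - Y[1, ])^2)       # || y2 - y1 ||^2
all.equal(lhs, rhs)
```

In other words, Euclidean distances in the transformed data reproduce the learned Mahalanobis distances in the original data.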

Note

Be sure to check that the dimensions of the original data and the format of the constraints are valid for the function.
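
One way to perform such a check before calling the function is sketched below. The helper `check_constraints` is hypothetical and not part of the package; it only tests properties stated in the Arguments section (a numeric data matrix, and a two-column constraint matrix of valid row numbers).

```r
# Hypothetical pre-flight check (not part of the package).
check_constraints <- function(data, pairs) {
  is.matrix(data) && is.numeric(data) &&
    is.matrix(pairs) && is.numeric(pairs) && ncol(pairs) == 2 &&
    !anyNA(pairs) &&
    all(pairs == round(pairs)) &&              # whole-number row indices
    all(pairs >= 1 & pairs <= nrow(data))      # indices within the data
}

data <- matrix(rnorm(20), nrow = 10)
ok  <- check_constraints(data, rbind(c(1, 3), c(2, 5)))   # indices within 1..10
bad <- check_constraints(data, rbind(c(1, 11)))           # 11 exceeds nrow(data)
```

The same check can be applied to both the simi and the dism matrix.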

Author(s)

Tao Gao <http://www.gaotao.name>

References

Eric P. Xing, Andrew Y. Ng, Michael I. Jordan and Stuart Russell (2003). Distance metric learning, with application to clustering with side-information. Advances in Neural Information Processing Systems 15.

Examples

set.seed(602)
library(MASS)
library(scatterplot3d)

# generate simulated Gaussian data
k <- 100
m <- matrix(c(1, 0.5, 1, 0.5, 2, -1, 1, -1, 3), nrow = 3, byrow = TRUE)
x1 <- mvrnorm(k, mu = c(1, 1, 1), Sigma = m)
x2 <- mvrnorm(k, mu = c(-1, 0, 0), Sigma = m)
data <- rbind(x1, x2)

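# define similar constraints (all pairs within the same group)
simi <- rbind(t(combn(1:k, 2)), t(combn((k+1):(2*k), 2)))

# one column per pair, so that pairs can be matched as whole units
temp <- as.data.frame(t(simi))
tol <- as.data.frame(combn(1:(2*k), 2))

# define dissimilar constraints (all pairs that are not similar)
dism <- t(as.matrix(tol[!tol %in% temp]))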

# transform data using GdmDiag
result <- GdmDiag(data, simi, dism)
newData <- result$newData
# plot original data
color <- gl(2, k, labels = c("red", "blue"))
par(mfrow = c(2, 1), mar = rep(0, 4) + 0.1)
scatterplot3d(data, color = color, cex.symbols = 0.6,
		  xlim = range(data[, 1], newData[, 1]),
		  ylim = range(data[, 2], newData[, 2]),
		  zlim = range(data[, 3], newData[, 3]),
		  main = "Original Data")
# plot GdmDiag transformed data
scatterplot3d(newData, color = color, cex.symbols = 0.6,
		  xlim = range(data[, 1], newData[, 1]),
		  ylim = range(data[, 2], newData[, 2]),
		  zlim = range(data[, 3], newData[, 3]),
		  main = "Transformed Data")

road2stat/sdml documentation built on May 27, 2019, 10:31 a.m.