dca: Discriminative Component Analysis


Description

Performs discriminative component analysis on the given data.

Usage

dca(data, chunks, neglinks, useD = NULL)

Arguments

data

n * d data matrix. n is the number of data points, d is the dimension of the data. Each data point is a row in the matrix.

chunks

length n vector describing the chunklets: -1 in the i-th place means that point i does not belong to any chunklet; integer j in place i means that point i belongs to chunklet j. The chunklet indexes should be 1:(number of chunklets).
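
For example, a chunks vector for six points where points 1 and 2 form chunklet 1, points 4 and 5 form chunklet 2, and points 3 and 6 are unconstrained (a hypothetical toy setup):

chunks = c(1, 1, -1, 2, 2, -1)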

neglinks

s * s symmetric matrix describing the negative relationships between the s chunklets. For element neglinks_{ij}: neglinks_{ij} = 1 means chunklets i and j have negative constraint(s); neglinks_{ij} = 0 means chunklets i and j have no negative constraints, or no information about them is available.
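
For example, a neglinks matrix for s = 3 chunklets in which chunklet 1 is dissimilar from chunklets 2 and 3, while nothing is known about the pair (2, 3) (a hypothetical setup; note the required symmetry):

neglinks = matrix(c(
  0, 1, 1,
  1, 0, 0,
  1, 0, 0),
  ncol = 3, byrow = TRUE)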

useD

Integer. Optional. When not given, DCA is performed in the original dimension and B is full rank. When useD is given, DCA is preceded by constraint-based LDA, which reduces the dimension to useD; B in this case has rank useD.
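
For example, a call that first reduces the data to two dimensions (a sketch; data, chunks, and neglinks are assumed to be prepared as described above):

res = dca(data, chunks, neglinks, useD = 2)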

Details

dca learns a Mahalanobis distance metric from contextual constraints: chunklets group points that are known to belong together (positive equivalence constraints), and negative links mark pairs of chunklets that are known to be dissimilar. Following Hoi et al. (2006), DCA seeks a transformation A that maximizes the ratio |A' * Cb * A| / |A' * Cw * A|, where Cw is the total covariance within chunklets and Cb is the covariance between chunklets connected by negative links. The Mahalanobis matrix is then B = A * A'.

Value

list of the DCA results:

B

DCA suggested Mahalanobis matrix

DCA

DCA suggested transformation of the data. The dimension is (original data dimension) * (useD)

newData

DCA transformed data

For every two original data points (x1, x2) and their transformed counterparts (y1, y2) in newData:

(x2 - x1)' * B * (x2 - x1) = || (x2 - x1) * A ||^2 = || y2 - y1 ||^2

where A is the returned DCA transformation matrix.
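
This identity can be checked numerically on a fitted result; a minimal sketch, assuming data, chunks, and neglinks are defined as in the Examples below and that the returned components behave as documented:

res = dca(data, chunks, neglinks)
d = as.numeric(data[2, ] - data[1, ])         # difference of two original points
t(d) %*% res$B %*% d                          # quadratic form under B
sum((d %*% res$DCA)^2)                        # squared norm after the transformation
sum((res$newData[2, ] - res$newData[1, ])^2)  # squared distance in newData; all three agree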


Author(s)

Nan Xiao <https://nanx.me>

References

Steven C. H. Hoi, W. Liu, M. R. Lyu, and W. Y. Ma (2006). Learning Distance Metrics with Contextual Constraints for Image Retrieval. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2006).

See Also

See kdca for the kernelized version of DCA.

Examples

set.seed(123)
require(MASS)  # generate synthetic Gaussian data
k = 100        # sample size of each class
n = 3          # number of classes
N = k * n      # total sample number
x1 = mvrnorm(k, mu = c(-10, 6), matrix(c(10, 4, 4, 10), ncol = 2))
x2 = mvrnorm(k, mu = c(0, 0), matrix(c(10, 4, 4, 10), ncol = 2))
x3 = mvrnorm(k, mu = c(10, -6), matrix(c(10, 4, 4, 10), ncol = 2))
data = as.data.frame(rbind(x1, x2, x3))
# The fully labeled data set with 3 classes
plot(data$V1, data$V2, bg = c("#E41A1C", "#377EB8", "#4DAF4A")[gl(n, k)],
     pch = c(rep(22, k), rep(21, k), rep(25, k)))
Sys.sleep(3)
# The same data unlabeled; the class structure is clearly less evident
plot(data$V1, data$V2)
Sys.sleep(3)

# sample two chunklets from class 1, two from class 2, and one from class 3
chunk1 = sample(1:100, 5)
chunk2 = sample(setdiff(1:100, chunk1), 5)
chunk3 = sample(101:200, 5)
chunk4 = sample(setdiff(101:200, chunk3), 5)
chunk5 = sample(201:300, 5)
chks = list(chunk1, chunk2, chunk3, chunk4, chunk5)
chunks = rep(-1, 300)  # -1 marks points that belong to no chunklet
# assign chunklet labels to the sampled points
for (i in 1:5) {
  for (j in chks[[i]]) {
    chunks[j] = i
  }
}

# define the negative constraints between chunks:
# chunks 1-2 come from class 1, chunks 3-4 from class 2, chunk 5 from class 3
# (the matrix must be symmetric)
neglinks = matrix(c(
  0, 0, 1, 1, 1,
  0, 0, 1, 1, 1,
  1, 1, 0, 0, 1,
  1, 1, 0, 0, 1,
  1, 1, 1, 1, 0),
  ncol = 5, byrow = TRUE)

dcaData = dca(data = data, chunks = chunks, neglinks = neglinks)$newData
# plot DCA transformed data
plot(dcaData[, 1], dcaData[, 2], bg = c("#E41A1C", "#377EB8", "#4DAF4A")[gl(n, k)],
     pch = c(rep(22, k), rep(21, k), rep(25, k)),
     xlim = c(-15, 15), ylim = c(-15, 15))
