# dca: Discriminative Component Analysis in dml: Distance Metric Learning in R

## Description

Performs discriminative component analysis on the given data.

## Usage

```
dca(data, chunks, neglinks, useD = NULL)
```

## Arguments

- `data`: `n * d` data matrix. `n` is the number of data points, `d` is the dimension of the data. Each data point is a row in the matrix.
- `chunks`: length-`n` vector describing the chunklets: `-1` in the `i`-th place means point `i` doesn't belong to any chunklet; integer `j` in place `i` means point `i` belongs to chunklet `j`. The chunklet indices should be `1:(number of chunklets)`.
- `neglinks`: `s * s` symmetric matrix describing the negative relationships between the `s` chunklets. `neglinks[i, j] = 1` means chunklets `i` and `j` have negative constraint(s); `neglinks[i, j] = 0` means chunklets `i` and `j` either have no negative constraints or we have no information about them.
- `useD`: integer, optional. When not given, DCA is done in the original dimension and `B` is full rank. When `useD` is given, DCA is preceded by constraint-based LDA, which reduces the dimension to `useD`; `B` in this case is of rank `useD`.
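To make the encoding concrete, here is a minimal sketch (with made-up values, not taken from the package documentation) of `chunks` and `neglinks` for seven points and two chunklets:

```
# Toy encoding: points 1-2 form chunklet 1, points 4-6 form chunklet 2,
# and points 3 and 7 are unconstrained (hypothetical values for illustration)
chunks <- c(1, 1, -1, 2, 2, 2, -1)

# Chunklets 1 and 2 are known to contain points from different classes
neglinks <- matrix(c(0, 1,
                     1, 0), ncol = 2, byrow = TRUE)
```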

## Details

DCA (Hoi et al., 2006) learns a linear transformation of the data, and the corresponding Mahalanobis distance metric, from weak supervision given as chunklets and negative links. A chunklet is a group of points known to belong to the same (unknown) class; a negative link declares that two chunklets come from different classes. Analogous to LDA, but driven by constraints rather than class labels, DCA seeks the transformation that maximizes the total variance between negatively linked chunklets while minimizing the total variance within each chunklet.
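As a rough illustration of the quantities involved, the following sketch (not the package's internal implementation) computes the within-chunklet and between-chunklet covariance matrices from `data`, `chunks`, and `neglinks` as described under Arguments:

```
s <- max(chunks)                  # number of chunklets
X <- as.matrix(data)
means <- t(sapply(1:s, function(j) colMeans(X[chunks == j, , drop = FALSE])))

# within-chunklet covariance: scatter of points around their chunklet mean
Cw <- matrix(0, ncol(X), ncol(X))
for (j in 1:s) {
  D  <- sweep(X[chunks == j, , drop = FALSE], 2, means[j, ])
  Cw <- Cw + crossprod(D)
}
Cw <- Cw / sum(chunks != -1)

# between-chunklet covariance: scatter between the means of
# negatively linked chunklets
Cb <- matrix(0, ncol(X), ncol(X))
nb <- 0
for (j in 1:s) {
  for (i in 1:s) {
    if (neglinks[j, i] == 1) {
      Cb <- Cb + tcrossprod(means[j, ] - means[i, ])
      nb <- nb + 1
    }
  }
}
Cb <- Cb / nb
# DCA looks for a transformation that is large on the Cb scatter
# and small on the Cw scatter, in the spirit of Fisher's criterion
```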

## Value

A list of the DCA results:

- `B`: DCA-suggested Mahalanobis matrix
- `DCA`: DCA-suggested transformation of the data; its dimension is (original data dimension) * `useD`
- `newData`: DCA-transformed data

For every two original data points (x1, x2) and their transformed counterparts (y1, y2) in `newData`:

(x2 - x1)' * B * (x2 - x1) = || A' * (x2 - x1) ||^2 = || y2 - y1 ||^2

where `A` is the transformation matrix returned as `DCA` (so that B = A * A').
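A quick numerical check of this identity, as a sketch assuming the `data`, `chunks`, and `neglinks` objects from the Examples section below:

```
res <- dca(data = data, chunks = chunks, neglinks = neglinks)
x1 <- as.numeric(data[1, ]); x2 <- as.numeric(data[2, ])
y1 <- res$newData[1, ];      y2 <- res$newData[2, ]

drop(t(x2 - x1) %*% res$B %*% (x2 - x1))  # quadratic form under B
sum((y2 - y1)^2)                          # squared Euclidean distance in newData
# the two printed values should agree up to numerical error
```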


## Examples

```
## Not run: 
set.seed(123)
require(MASS)

# generate synthetic Gaussian data
k = 100     # sample size of each class
n = 3       # number of classes
N = k * n   # total sample size

x1 = mvrnorm(k, mu = c(-10, 6), matrix(c(10, 4, 4, 10), ncol = 2))
x2 = mvrnorm(k, mu = c(0, 0), matrix(c(10, 4, 4, 10), ncol = 2))
x3 = mvrnorm(k, mu = c(10, -6), matrix(c(10, 4, 4, 10), ncol = 2))
data = as.data.frame(rbind(x1, x2, x3))

# the fully labeled data set with 3 classes
plot(data$V1, data$V2,
     bg = c("#E41A1C", "#377EB8", "#4DAF4A")[gl(n, k)],
     pch = c(rep(22, k), rep(21, k), rep(25, k)))
Sys.sleep(3)

# same data unlabeled; clearly the classes' structure is less evident
plot(data$V1, data$V2)
Sys.sleep(3)

# sample chunklets: two from class 1, two from class 2, one from class 3
chunk1 = sample(1:100, 5)
chunk2 = sample(setdiff(1:100, chunk1), 5)
chunk3 = sample(101:200, 5)
chunk4 = sample(setdiff(101:200, chunk3), 5)
chunk5 = sample(201:300, 5)
chks = list(chunk1, chunk2, chunk3, chunk4, chunk5)

# assign each sampled point to its chunklet
chunks = rep(-1, 300)
for (i in 1:5) {
  for (j in chks[[i]]) {
    chunks[j] = i
  }
}

# define the negative constraints between chunklets
# (chunklets 1-2 are from class 1, 3-4 from class 2, 5 from class 3)
neglinks = matrix(c(0, 0, 1, 1, 1,
                    0, 0, 1, 1, 1,
                    1, 1, 0, 0, 1,
                    1, 1, 0, 0, 1,
                    1, 1, 1, 1, 0),
                  ncol = 5, byrow = TRUE)

dcaData = dca(data = data, chunks = chunks, neglinks = neglinks)$newData

# plot DCA-transformed data
plot(dcaData[, 1], dcaData[, 2],
     bg = c("#E41A1C", "#377EB8", "#4DAF4A")[gl(n, k)],
     pch = c(rep(22, k), rep(21, k), rep(25, k)),
     xlim = c(-15, 15), ylim = c(-15, 15))

## End(Not run)
```
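For dimensionality reduction, `useD` can be supplied as well; a brief sketch, assuming the objects from the example above:

```
## Not run: 
# project to a single discriminative dimension via constraint-based LDA
dca1 <- dca(data = data, chunks = chunks, neglinks = neglinks, useD = 1)
dim(dca1$newData)   # expected: 300 rows, 1 column

## End(Not run)
```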