# dimRed: Dimensionality Reduction for sparse matrices, based on... In cysouw/qlcMatrix: Utility Sparse Matrix Functions for Quantitative Language Comparison

## Description

To inspect the structure of a large sparse matrix, it is often useful to reduce the matrix to a few major dimensions (cf. multidimensional scaling). This function implements a rough approach to extracting such dimensions. It is a simple wrapper around `Cholesky` and `sparsesvd`.

## Usage

```
dimRed(sim, k = 2, method = "svd")
```

## Arguments

`sim`: Sparse, symmetric, positive-definite matrix (typically a similarity matrix produced by the `sim` or `assoc` functions).

`k`: Number of dimensions to be returned, defaults to two.

`method`: Method used for the decomposition. Currently implemented are `svd` and `cholesky`.
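The requirements on `sim` can be checked on a small test matrix before calling the function. The following is a minimal sketch in base R; `is_sym_pos_def` is a hypothetical helper introduced here for illustration and is not part of qlcMatrix. It converts its input to a dense matrix, so it is only suitable for small examples, not for large sparse matrices.

```r
# Hypothetical helper (not part of qlcMatrix): test whether a matrix is
# symmetric and positive-definite, as the `sim` argument is documented to be.
is_sym_pos_def <- function(m, tol = 1e-8) {
  m <- as.matrix(m)                       # densify; small matrices only
  if (!isSymmetric(unname(m), tol = tol)) return(FALSE)
  ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)                           # all eigenvalues strictly positive
}

# Symmetric with positive eigenvalues
a <- matrix(c(2, 1, 0,
              1, 2, 1,
              0, 1, 2), nrow = 3, byrow = TRUE)

# Symmetric, but with one negative eigenvalue
b <- matrix(c(1, 2,
              2, 1), nrow = 2, byrow = TRUE)

is_sym_pos_def(a)
is_sym_pos_def(b)
```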

## Details

Based on the Cholesky decomposition, the Matrix `sim` is decomposed into:

L D L'

The D matrix is a diagonal matrix, the values of which are returned here as `$D`. Only the first `k` columns of the L matrix are returned (possibly after permutation, see the details at `Cholesky`).
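To make the L D L' form concrete, here is a small dense illustration in base R. It is only a sketch of the decomposition itself, not of how `dimRed` is implemented: it uses base `chol()` without pivoting, whereas the sparse `Cholesky` routine may permute rows and columns. The matrix `a` is made-up example data.

```r
# Small symmetric positive-definite example matrix (made-up data)
a <- matrix(c(4, 2, 2,
              2, 3, 1,
              2, 1, 3), nrow = 3, byrow = TRUE)

r <- chol(a)                        # upper triangular, a == t(r) %*% r
d <- diag(r)^2                      # diagonal entries of D
l <- t(r) %*% diag(1 / diag(r))     # unit lower-triangular L

# Reconstruction error of L D L' should be close to zero
max(abs(l %*% diag(d) %*% t(l) - a))

# Keeping only the first k columns of L, analogous to the `k` argument
k <- 2
l_k <- l[, 1:k]
```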

Based on the singular value decomposition (SVD), the matrix `sim` is decomposed into:

U D V'

The U matrix and the values from D are returned.
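The same idea can be shown for the SVD with a small dense matrix in base R. This is a sketch under the assumption that a dense `svd()` call is an acceptable stand-in for `sparsesvd` on toy data; it does not reproduce the internals of `dimRed`. For a symmetric positive-definite matrix, the singular values coincide with the eigenvalues, which the sketch also checks.

```r
# Small symmetric positive-definite example matrix (made-up data)
a <- matrix(c(4, 2, 2,
              2, 3, 1,
              2, 1, 3), nrow = 3, byrow = TRUE)

s <- svd(a)                                   # a == u %*% diag(d) %*% t(v)
full_err <- max(abs(s$u %*% diag(s$d) %*% t(s$v) - a))

# Singular values versus eigenvalues of the symmetric matrix
ev <- eigen(a, symmetric = TRUE, only.values = TRUE)$values
ev_err <- max(abs(s$d - ev))

# Keep the first k dimensions, analogous to the `k` argument
k <- 2
u_k <- s$u[, 1:k]
d_k <- s$d[1:k]
```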

## Value

A list of two elements is returned:

`L`: a sparse matrix of type `dgCMatrix` with `k` columns.

`D`: the diagonal values from the Cholesky decomposition, or the eigenvalues from the svd decomposition.

## Author(s)

Michael Cysouw

## See Also

`Cholesky` and `sparsesvd`

## Examples

```r
library(qlcMatrix)

# some random points in two dimensions
coor <- cbind(sample(1:30), sample(1:30))

# using cmdscale() to reconstruct the coordinates from a distance matrix
d <- dist(coor)
mds <- cmdscale(d)

# using dimRed() on a similarity matrix.
# Note that normL works much better than other norms in this 2-dimensional case
s <- cosSparse(t(coor), norm = normL)
red <- as.matrix(dimRed(s)$L)

# show the different point clouds
par(mfrow = c(1,3))

  plot(coor, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(coor, labels = 1:30)
  title("Original coordinates")

  plot(mds, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(mds, labels = 1:30)
  title("MDS from euclidean distances")

  plot(red, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(red, labels = 1:30)
  title("dimRed from cosSparse similarity")

par(mfrow = c(1,1))

# ======

# example, using the iris data
data(iris)
X <- t(as.matrix(iris[,1:4]))
cols <- rainbow(3)[iris$Species]

s <- cosSparse(X, norm = norm1)
d <- dist(t(X), method = "manhattan")

svd <- as.matrix(dimRed(s, method = "svd")$L)
chol <- as.matrix(dimRed(s, method = "cholesky")$L)
mds <- cmdscale(d)

par(mfrow = c(1,3))
  # the distance matrix d above is computed with method = "manhattan"
  plot(mds, col = cols, main = "cmdscale\nfrom manhattan distances")
  plot(svd, col = cols, main = "dimRed with svd\nfrom cosSparse with norm1")
  plot(chol, col = cols, main = "dimRed with cholesky\nfrom cosSparse with norm1")
par(mfrow = c(1,1))
```

cysouw/qlcMatrix documentation built on April 22, 2018, 4:59 a.m.