dimRed: Dimensionality Reduction for sparse matrices, based on...
In cysouw/qlcMatrix: Utility Sparse Matrix Functions for Quantitative Language Comparison

dimRed

R Documentation

Dimensionality Reduction for sparse matrices, based on Cholesky decomposition

Description

To inspect the structure of a large sparse matrix, it is often useful to reduce the matrix to a few major dimensions (cf. multidimensional scaling). This function implements a rough approach to extracting such dimensions. It is a simple wrapper around Cholesky and sparsesvd.

Usage

dimRed(sim, k = 2, method = "svd")

Arguments

`sim`	Sparse, symmetric, positive-definite matrix (typically a similarity matrix produced by the `sim` or `assoc` functions)
`k`	Number of dimensions to be returned, defaults to two.
`method`	Method used for the decomposition. Currently implemented are `svd` and `cholesky`.

Details

Based on the Cholesky decomposition, the Matrix sim is decomposed into:

L D L'

The D Matrix is a diagonal matrix, the values of which are returned here as $D. Only the first few columns of the L Matrix are returned (possibly after permutation, see the details at Cholesky).
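The relation between a Cholesky factor and the L D L' form can be sketched in base R. This is only an illustration on a small dense matrix: it uses base `chol()` (no permutation, no sparsity) as a stand-in for the sparse Cholesky routine, and the matrix `sim` below is made-up example data, not output of any qlcMatrix function.

```r
# Illustrative symmetric positive-definite matrix standing in for `sim`
set.seed(1)
X <- matrix(rnorm(20), nrow = 5)
sim <- tcrossprod(X) + diag(5)     # X X' + I is symmetric positive-definite

R  <- chol(sim)                    # upper triangular factor, sim = R'R
Lc <- t(R)                         # lower triangular factor, sim = Lc Lc'
s  <- diag(Lc)                     # diagonal entries of the Cholesky factor

L <- Lc %*% diag(1 / s)            # divide column j by s[j]: unit lower triangular
D <- s^2                           # diagonal values of the L D L' form

recon <- L %*% diag(D) %*% t(L)    # rebuilds sim up to rounding error

k <- 2
L_k <- L[, 1:k]                    # the first k columns, analogous to $L
```

Here `D` plays the role of the values returned as $D. With the sparse routine the rows may additionally be permuted, which this dense sketch ignores.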

Based on the singular value decomposition (svd), the Matrix sim is decomposed into:

U D V'

The U Matrix and the values from D are returned.
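A minimal sketch of the svd variant, again on a small dense matrix. It assumes base `svd()` as a stand-in for the sparse routine, and the matrix `sim` is made-up example data. It also checks that, for a symmetric positive-definite matrix, the singular values coincide with the eigenvalues.

```r
# Illustrative symmetric positive-definite matrix standing in for `sim`
set.seed(1)
X <- matrix(rnorm(20), nrow = 5)
sim <- tcrossprod(X) + diag(5)     # X X' + I is symmetric positive-definite

dec <- svd(sim)                    # sim = U D V', with D = diag(dec$d)
recon <- dec$u %*% diag(dec$d) %*% t(dec$v)

k <- 2
U_k <- dec$u[, 1:k]                # first k columns of U (the reduced coordinates)
D_k <- dec$d[1:k]                  # the k largest singular values

# Eigenvalues, sorted in decreasing order like the singular values
ev <- eigen(sim, symmetric = TRUE)$values
```

Plotting `U_k` gives a two-dimensional view of the rows of `sim`, which is the kind of picture the examples on this page produce from the $L element.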

Value

A list of two elements is returned:

`L`	: a sparse matrix of type `dgCMatrix` with `k` columns
`D`	: the diagonal values from the Cholesky decomposition, or the singular values from the svd decomposition (for a symmetric positive-definite matrix these equal the eigenvalues)

Author(s)

Michael Cysouw <cysouw@mac.com>

Examples

library(qlcMatrix)

# some random points in two dimensions
coor <- cbind(sample(1:30), sample(1:30))

# using cmdscale() to reconstruct the coordinates from a distance matrix
d <- dist(coor)
mds <- cmdscale(d)

# using dimRed() on a similarity matrix.
# Note that normL works much better than other norms in this 2-dimensional case
s <- cosSparse(t(coor), norm = normL)
red <- as.matrix(dimRed(s)$L)

# show the different point clouds

oldpar <- par("mfrow")
par(mfrow = c(1,3))

  plot(coor, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(coor, labels = 1:30)
  title("Original coordinates")
  
  plot(mds, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(mds, labels = 1:30)
  title("MDS from euclidean distances")
  
  plot(red, type = "n", axes = FALSE, xlab = "", ylab = "")
  text(red, labels = 1:30)
  title("dimRed from cosSparse similarity")

par(mfrow = oldpar)

# ======

# example, using the iris data
data(iris)
X <- t(as.matrix(iris[,1:4]))
cols <- rainbow(3)[iris$Species]

s <- cosSparse(X, norm = norm1)
d <- dist(t(X), method = "manhattan")

# named red_svd to avoid masking the base function svd()
red_svd <- as.matrix(dimRed(s, method = "svd")$L)
mds <- cmdscale(d)

oldpar <- par("mfrow")
par(mfrow = c(1,2))
  plot(mds, col = cols, main = "cmdscale\nfrom manhattan distances")
  plot(red_svd, col = cols, main = "dimRed with svd\nfrom cosSparse with norm1")
par(mfrow = oldpar)

cysouw/qlcMatrix documentation built on July 3, 2024, 8:44 p.m.