dsm.projection (R Documentation)
Reduce the dimensionality of a DSM by linear projection of its row vectors into a lower-dimensional subspace. Various projection methods with different properties are available.
dsm.projection(model, n, method = c("svd", "rsvd", "asvd", "ri", "ri+svd"), oversampling = NA, q = 2, rate = 0.01, power = 1, with.basis = FALSE, verbose = FALSE)
model
either an object of class dsm, or a (dense or sparse) numeric matrix of row vectors.
method
projection method to use for dimensionality reduction (see "DETAILS" below).
n
an integer specifying the number of target dimensions.
oversampling
oversampling factor for the stochastic dimensionality reduction algorithms (rsvd, asvd and ri+svd).
q
number of power iterations in the randomized SVD algorithm (Halko et al. 2009 recommend q = 1 or q = 2).
rate
fill rate of random projection vectors. Each random dimension has on average rate * d nonzero components (set to +1 or -1), where d is the number of original dimensions.
power
apply power scaling after SVD-based projection, i.e. multiply each latent dimension with a suitable power of the corresponding singular value. The default power = 1 corresponds to an ordinary orthogonal projection.
with.basis
if TRUE, the orthogonal basis of the latent subspace is returned in the attribute "basis" (only available for orthogonal projections).
verbose
if TRUE, print progress messages during the dimensionality reduction.
The following dimensionality reduction algorithms can be selected with the method argument:
svd
singular value decomposition (SVD), using the efficient SVDLIBC algorithm (Berry 1992) from package sparsesvd if the input is a sparse matrix. If the DSM has been scored with scale="center", this method is equivalent to principal component analysis (PCA).
rsvd
randomized SVD (Halko et al. 2009, p. 9) based on a factorization of rank oversampling * n with q power iterations.
asvd
approximate SVD, which determines latent dimensions from a random sample of matrix rows comprising oversampling * n data points. This heuristic algorithm is highly inaccurate and has been deprecated.
ri
random indexing (RI), i.e. a projection onto random basis vectors that are approximately orthogonal. Basis vectors are generated by setting a proportion of rate elements randomly to +1 or -1. Note that this does not correspond to a proper orthogonal projection, so the resulting coordinates in the reduced space should be used with caution.
ri+svd
RI to oversampling * n dimensions, followed by SVD of the pre-reduced matrix to the final n dimensions. This is not a proper orthogonal projection because the RI basis vectors in the first step are only approximately orthogonal.
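The randomized SVD scheme selected by method="rsvd" can be sketched in a few lines of base R. The function rand_svd below and the toy matrix are illustrative stand-ins, not the package internals: sample the range of the matrix with a Gaussian test matrix of rank oversampling * n, sharpen the spectrum with q power iterations, then compute an exact SVD of the small projected matrix.

```r
set.seed(42)

# illustrative sketch of randomized SVD (Halko et al. 2009); not the
# actual wordspace/rsvd implementation
rand_svd <- function(A, n, oversampling = 2, q = 2) {
  k <- min(n * oversampling, ncol(A))     # rank of intermediate factorization
  Omega <- matrix(rnorm(ncol(A) * k), ncol = k)
  Y <- A %*% Omega                        # sample the range of A
  for (i in seq_len(q)) {
    Y <- A %*% crossprod(A, Y)            # power iteration: (A A^T)^q A Omega
  }
  Q <- qr.Q(qr(Y))                        # orthonormal basis of the sampled range
  B <- crossprod(Q, A)                    # small k x ncol(A) matrix
  sv <- svd(B, nu = n, nv = 0)
  list(u = Q %*% sv$u,                    # lift left singular vectors back up
       d = sv$d[1:n])                     # latent coordinates are u %*% diag(d)
}

A <- matrix(rnorm(200 * 20), nrow = 200)
res <- rand_svd(A, n = 5)
```

The latent coordinates of the rows of A are then res$u %*% diag(res$d), analogous to the matrix returned by dsm.projection.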
A numeric matrix with n columns (latent dimensions) and the same number of rows as the original DSM. Some SVD-based algorithms may discard poorly conditioned singular values, returning fewer than n columns.
If with.basis=TRUE and an orthogonal projection is used, the corresponding orthogonal basis B of the latent subspace is returned in the attribute "basis". B is column-orthogonal, hence B^T projects into latent coordinates and B B^T is an orthogonal subspace projection in the original coordinate system.
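These properties of B can be verified with base R's svd(), used here as a stand-in for the package's SVD backends; the matrix M below is a made-up toy DSM.

```r
set.seed(1)
M <- matrix(rnorm(100 * 10), nrow = 100)   # toy "DSM" with 10 original dimensions
n <- 3
dec <- svd(M, nu = n, nv = n)
B <- dec$v                                  # orthogonal basis of the latent subspace
S <- M %*% B                                # latent coordinates of the row vectors

# B is column-orthogonal: t(B) %*% B is the n x n identity matrix
stopifnot(max(abs(crossprod(B) - diag(n))) < 1e-10)

# B %*% t(B) projects onto the subspace in the original coordinate system;
# as an orthogonal projection it is idempotent (applying it twice = once)
P <- tcrossprod(B)
stopifnot(max(abs(P %*% P - P)) < 1e-10)
```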
For orthogonal projections, the attribute "R2" contains a numeric vector specifying the proportion of the squared Frobenius norm of the original matrix captured by each of the latent dimensions. If the original matrix has been centered (so that a SVD projection is equivalent to PCA), this corresponds to the proportion of variance "explained" by each dimension.
For SVD-based projections, the attribute "sigma" contains the singular values corresponding to the latent dimensions. It can be used to adjust the power scaling exponent at a later time.
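Since the latent coordinates of an SVD projection are the left singular vectors scaled by powers of the singular values, a projection computed with the default power = 1 can be rescaled afterwards from the stored singular values. A sketch with base svd(), where the variables sigma and S mimic the "sigma" attribute and the projection matrix (this is not the package implementation):

```r
set.seed(7)
M <- matrix(rnorm(50 * 8), nrow = 50)
dec <- svd(M, nu = 4, nv = 4)
S <- M %*% dec$v          # power = 1 projection: columns are U_j * sigma_j
sigma <- dec$d[1:4]       # corresponds to attr(S, "sigma")

# rescale to a different exponent p: multiply column j by sigma_j^(p - 1)
p <- 0.5
S_p <- sweep(S, 2, sigma^(p - 1), "*")

# equivalent to computing U %*% diag(sigma^p) directly
stopifnot(max(abs(S_p - dec$u %*% diag(sigma^p))) < 1e-8)
```

With p = 0 all latent dimensions are scaled to equal weight, which discards the information in the singular values entirely.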
Stephanie Evert (https://purl.org/stephanie.evert)
Berry, Michael W. (1992). Large scale sparse singular value computations. International Journal of Supercomputer Applications, 6, 13–49.
Halko, N., Martinsson, P. G., and Tropp, J. A. (2009). Finding structure with randomness: Stochastic algorithms for constructing approximate matrix decompositions. Technical Report 2009-05, ACM, California Institute of Technology.
rsvd for the implementation of randomized SVD, and sparsesvd for the SVDLIBC wrapper.
# 240 English nouns in space with correlated dimensions "own", "buy" and "sell"
M <- DSM_GoodsMatrix[, 1:3]

# SVD projection into 2 latent dimensions
S <- dsm.projection(M, 2, with.basis=TRUE)

100 * attr(S, "R2")             # dim 1 captures 86.4% of distances
round(attr(S, "basis"), 3)      # dim 1 = commodity, dim 2 = owning vs. buying/selling

S[c("time", "goods", "house"), ]  # some latent coordinates

## Not run:
idx <- DSM_GoodsMatrix[, 4] > .85  # only show nouns on "fringe"
plot(S[idx, ], pch=20, col="red", xlab="commodity", ylab="own vs. buy/sell")
text(S[idx, ], rownames(S)[idx], pos=3)
## End(Not run)