LSAfun-package: Computations based on Latent Semantic Analysis


Description

Offers methods and functions for working with Vector Space Models of semantics, such as Latent Semantic Analysis (LSA). Such models are created by algorithms that operate on a corpus of text documents and produce a high-dimensional vector representation of word (and document) meanings. The exact LSA algorithm is described in Martin & Berry (2007).
Such a representation allows for the computation of word (and document) similarities, for example by computing the cosine of the angle between two vectors.
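
As a minimal sketch of the cosine computation mentioned above, the following uses base R only. The function name cosine_similarity and the three toy vectors are assumptions made up for illustration; they are not part of this package or of any real semantic space.

```r
# Cosine of the angle between two numeric vectors (base R only).
# cosine_similarity is a hypothetical helper, not a function of this package.
cosine_similarity <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Toy 3-dimensional "word vectors"; the values are invented for illustration.
v_dog <- c(0.8, 0.1, 0.3)
v_cat <- c(0.7, 0.2, 0.4)
v_car <- c(-0.1, 0.9, 0.0)

cosine_similarity(v_dog, v_cat)  # high: the vectors point in similar directions
cosine_similarity(v_dog, v_car)  # near zero: the vectors are almost orthogonal
```

Because the cosine depends only on the direction of the vectors, it is unaffected by their length.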

The focus of this package

This package is not designed to create LSA semantic spaces. In R, this functionality is provided by the package lsa. The focus of the package LSAfun2 is to provide functions to be applied on existing LSA (or other) semantic spaces, such as

  1. Similarity Computations

  2. Neighborhood Computations

  3. Applied Functions

  4. Composition Methods
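
To give a rough idea of what a neighborhood computation and a simple composition method involve, here is a base R sketch on a toy space. The matrix tvectors, its values, and the helper nearest_neighbors are hypothetical; the package's own functions may work differently, and vector addition is only one possible composition method.

```r
# Toy semantic space: one row vector per word; the values are invented.
tvectors <- rbind(
  dog = c(0.8, 0.1, 0.3),
  cat = c(0.7, 0.2, 0.4),
  car = c(-0.1, 0.9, 0.0)
)

# Hypothetical helper: the n words whose vectors have the highest cosine
# with the vector of `word` (the word itself is included in the result).
nearest_neighbors <- function(word, space, n = 2) {
  v <- space[word, ]
  cosines <- drop(space %*% v) / (sqrt(rowSums(space^2)) * sqrt(sum(v^2)))
  names(cosines) <- rownames(space)
  head(sort(cosines, decreasing = TRUE), n)
}

nearest_neighbors("dog", tvectors)

# Additive composition: represent a two-word expression by the vector sum.
dog_cat <- tvectors["dog", ] + tvectors["cat", ]
```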

How to obtain a semantic space

LSAfun2 comes with one example LSA space, the wonderland space.
This package can also directly use LSA semantic spaces created with the lsa package, which allows users to work with their own LSA spaces. (Note that the function lsa returns a list of three matrices. Of those, the term matrix U should be used.)
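
The following sketch shows one way to obtain the term matrix from the lsa package. It assumes that lsa is installed and that the term matrix is stored in the component named tk of the returned object. The toy term-by-document matrix and the choice of two dimensions are invented for illustration.

```r
library(lsa)  # assumes the lsa package is installed

# Toy term-by-document count matrix: terms in rows, documents in columns.
tdm <- matrix(
  c(2, 1, 0, 0,
    1, 2, 0, 0,
    0, 0, 3, 1,
    0, 0, 1, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("alice", "rabbit", "queen", "king"),
                  c("doc1", "doc2", "doc3", "doc4"))
)

# Decompose the matrix and keep 2 dimensions (an arbitrary choice here).
space <- lsa(tdm, dims = 2)

# The term matrix: one row vector per term, to be used as the semantic space.
tvectors <- space$tk
rownames(tvectors)
```

With real data, the input matrix would typically come from lsa's textmatrix() applied to a directory of text files rather than being typed in by hand.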

The lsa package works with (very) small corpora, but has difficulty scaling up to larger corpora. In this case, it is recommended to use specialized software for creating semantic spaces.

Another possibility is to use one of the LSA spaces provided at http://www.lingexp.uni-tuebingen.de/z2/LSAspaces. These are stored in the .rda format. To load one of these spaces into the R workspace, save it to a directory, set the working directory to that directory, and load the space using load().
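
A minimal sketch of the loading step, assuming the .rda file contains the semantic space as its first (or only) saved object. The helper load_space and the file path are hypothetical. Passing a full path to load() is an alternative to changing the working directory first.

```r
# Hypothetical helper: load an .rda file into a private environment and
# return the first object stored in it.
load_space <- function(path) {
  env <- new.env()
  # load() restores the saved objects and returns their names.
  object_names <- load(path, envir = env)
  get(object_names[1], envir = env)
}

# Hypothetical path; replace it with the location of the downloaded file.
space_file <- file.path("path", "to", "my_lsa_space.rda")

if (file.exists(space_file)) {
  tvectors <- load_space(space_file)
  dim(tvectors)  # number of rows and number of dimensions
}
```

Loading into a separate environment avoids overwriting objects of the same name that already exist in the workspace.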

Author(s)

Fritz Günther


codymarquart/LSAfun2 documentation built on May 12, 2017, 8:43 a.m.