LSA Space: Alice's Adventures in Wonderland

Description

This data set is a 50-dimensional LSA space derived from Lewis Carroll's book "Alice's Adventures in Wonderland". The book was split into 791 paragraphs, which served as documents for the LSA algorithm (Landauer, Foltz & Laham, 1998). Only words that appeared in at least two documents were used to build the LSA space.
The LSA space contains 1123 distinct terms, all in lowercase, and was created using the lsa package. It can be used as the tvectors argument for all the functions in the LSAfun package.

Usage

data(wonderland)

Format

A 1123x50 matrix with terms as rownames.
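
Because the terms are stored as rownames, a term's vector can be looked up by name and compared with another term's vector. The following is a minimal base-R sketch of that idea, not the LSAfun implementation: toy_space is a made-up 3x3 stand-in for the 1123x50 matrix, and term_cosine is a hypothetical helper introduced only for illustration.

```r
# Toy stand-in for the wonderland matrix: terms as rownames, one column per
# LSA dimension. The values are invented for illustration only.
toy_space <- matrix(
  c(1, 0, 1,
    2, 0, 2,
    0, 3, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("alice", "rabbit", "queen"), NULL)
)

# Hypothetical helper: cosine similarity between two term vectors,
# each looked up by its rowname.
term_cosine <- function(space, a, b) {
  va <- space[a, ]
  vb <- space[b, ]
  sum(va * vb) / (sqrt(sum(va^2)) * sqrt(sum(vb^2)))
}

sim_close <- term_cosine(toy_space, "alice", "rabbit")  # parallel vectors
sim_far   <- term_cosine(toy_space, "alice", "queen")   # orthogonal vectors
```

With LSAfun installed and the data loaded via data(wonderland), the same lookup should work on the real matrix, for example term_cosine(wonderland, "alice", "rabbit"), assuming both terms are among the rownames.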

Source

Alice in Wonderland from Project Gutenberg

References

Landauer, T., Foltz, P., and Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25, 259-284.

Carroll, L. (1865). Alice's Adventures in Wonderland. New York: MacMillan.