This data set is a 50-dimensional LSA space derived from Lewis Carroll's book "Alice's Adventures in Wonderland". The book was split into 791 paragraphs, which served as documents for the LSA algorithm (Landauer, Foltz & Laham, 1998). Only words that appeared in at least two documents were used for building the LSA space.
This LSA space contains 1123 different terms, all in lower case letters, and was created using the lsa-package. It can be used as tvectors for all the functions in the
A 1123x50 matrix with terms as rownames.
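To illustrate how a matrix in this format is typically used, the sketch below looks up two terms by row name and compares their vectors with cosine similarity, the usual similarity measure in LSA. It is written in Python with the standard library only and uses an invented 3x2 stand-in rather than the actual 1123x50 data; the names `space` and `cosine` and all numeric values are assumptions made for illustration.

```python
import math

# Hypothetical miniature stand-in for the real 1123x50 matrix:
# 3 terms x 2 dimensions, with values invented for illustration only.
terms = ["alice", "rabbit", "queen"]
vectors = [
    [1.0, 0.0],
    [1.0, 1.0],
    [0.0, 2.0],
]

# Map each term (the row name) to its row vector.
space = dict(zip(terms, vectors))


def cosine(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Compare two terms by looking up their rows by name.
similarity = cosine(space["alice"], space["rabbit"])
print(round(similarity, 4))
```

With the real data, the same lookup would apply to any of the 1123 lower-case terms, each with a 50-element vector.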
Landauer, T., Foltz, P., & Laham, D. (1998). An Introduction to Latent Semantic Analysis. Discourse Processes, 25, 259-284.
Carroll, L. (1865). Alice's Adventures in Wonderland. New York: MacMillan.