LSA Space: Alice's Adventures in Wonderland


This data set is a 50-dimensional LSA space derived from Lewis Carroll's book "Alice's Adventures in Wonderland". The book was split into 791 paragraphs, which served as documents for the LSA algorithm (Landauer, Foltz & Laham, 1998). Only words that appeared in at least two documents were used to build the LSA space.
This LSA space contains 1123 distinct terms, all in lowercase, and was created using the lsa package. It can be passed as the tvectors argument to all functions in the LSAfun package.




A 1123x50 matrix with terms as rownames.
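
The following is a minimal sketch of what "terms as rownames" means in practice: each row is a term vector that can be looked up by name and compared with another. The matrix `toy_space`, its values, and the helper `cosine_sim` are hypothetical stand-ins for illustration, not part of the package. The final block assumes that LSAfun is installed and that the data set is exported under the name `wonderland`; it is skipped otherwise.

```r
# Toy stand-in for the real 1123 x 50 matrix: one row per term, terms as rownames.
# The values are made up for illustration only.
toy_space <- matrix(
  c(1, 0, 2,
    2, 0, 4,
    0, 3, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("alice", "rabbit", "queen"), NULL)
)

# Hypothetical helper: cosine similarity between two term vectors,
# each looked up by its rowname.
cosine_sim <- function(x, y, tvectors) {
  a <- tvectors[x, ]
  b <- tvectors[y, ]
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

sim_parallel   <- cosine_sim("alice", "rabbit", toy_space)  # vectors point the same way
sim_orthogonal <- cosine_sim("alice", "queen", toy_space)   # vectors share no dimension

# With LSAfun installed, the real space can be used in the same way.
# Assumption: the data set is exported under the name `wonderland`.
if (requireNamespace("LSAfun", quietly = TRUE)) {
  data("wonderland", package = "LSAfun")
  if (exists("wonderland")) {
    print(dim(wonderland))
    print(LSAfun::Cosine("alice", "rabbit", tvectors = wonderland))
  }
}
```

In the toy matrix, rows that are scalar multiples of each other have cosine similarity 1, and rows with no shared non-zero dimension have similarity 0. In the real space, similarities depend on the 50 LSA dimensions.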


Alice in Wonderland from Project Gutenberg


Landauer, T., Foltz, P., & Laham, D. (1998). Introduction to Latent Semantic Analysis. Discourse Processes, 25, 259-284.

Carroll, L. (1865). Alice's Adventures in Wonderland. New York: MacMillan.
