as.textmatrix | R Documentation
Returns a latent semantic space (created by createLSAspace) in textmatrix format: rows are terms, columns are documents.
as.textmatrix( LSAspace )
LSAspace: a latent semantic space generated by createLSAspace.
To allow comparisons between terms and documents, the internal format of the latent semantic space needs to be converted to a classical document-term matrix (just like the ones generated by textmatrix(), which are of class ‘textmatrix’).
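The conversion can be pictured as multiplying the partial matrices of a truncated singular value decomposition back together. The following is a minimal sketch of that idea using only base R's svd() on a hand-built term-by-document matrix. The names tk, sk, and dk follow the components documented for lsa(), but rebuild_textmatrix() is a hypothetical helper for illustration and is not part of the package; the package's own implementation may differ.

```r
# Hand-built term-by-document counts for a toy corpus of four documents.
M <- matrix(
  c(1, 0, 1, 2,   # dog
    1, 0, 0, 0,   # cat
    1, 1, 0, 1,   # mouse
    0, 1, 0, 0,   # hamster
    0, 1, 0, 0,   # sushi
    0, 0, 2, 0),  # monster
  nrow = 6, byrow = TRUE,
  dimnames = list(c("dog", "cat", "mouse", "hamster", "sushi", "monster"),
                  c("D1", "D2", "D3", "D4"))
)

# Hypothetical helper: truncate the SVD to k dimensions, then multiply
# the three partial matrices back into a term-by-document matrix.
rebuild_textmatrix <- function(M, k) {
  s  <- svd(M)
  tk <- s$u[, 1:k, drop = FALSE]   # term vectors
  sk <- s$d[1:k]                   # singular values
  dk <- s$v[, 1:k, drop = FALSE]   # document vectors
  Y  <- tk %*% diag(sk, nrow = k) %*% t(dk)
  dimnames(Y) <- dimnames(M)
  Y
}

full    <- rebuild_textmatrix(M, k = 4)  # all four dimensions kept
reduced <- rebuild_textmatrix(M, k = 2)  # only two dimensions kept
round(reduced, 2)
```

With all dimensions kept, the rebuilt matrix matches the input up to floating-point error; with fewer dimensions, the cell values change because the weaker singular directions have been dropped.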
Remark: There are other ways to compare documents and terms that use the partial matrices of an LSA space directly. See Berry et al. (1995) for more information.
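The remark can be illustrated with a small base R sketch. It assumes the partial matrices come from a truncated svd() (named tk, sk, and dk here by analogy with the components of an LSA space) and compares two documents in two ways: via the rebuilt matrix and via the documents' coordinates in the reduced space. The helper cos_sim() is hypothetical and defined only for this example.

```r
# Toy term-by-document counts; any numeric matrix would do.
M <- matrix(c(1, 0, 1, 2,
              1, 0, 0, 0,
              1, 1, 0, 1,
              0, 1, 0, 0,
              0, 1, 0, 0,
              0, 0, 2, 0), nrow = 6, byrow = TRUE)

k  <- 2
s  <- svd(M)
tk <- s$u[, 1:k, drop = FALSE]   # term vectors
sk <- s$d[1:k]                   # singular values
dk <- s$v[, 1:k, drop = FALSE]   # document vectors

# Plain cosine similarity between two numeric vectors.
cos_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Route 1: rebuild the full matrix, then compare two document columns.
Y <- tk %*% diag(sk, nrow = k) %*% t(dk)
via_matrix <- cos_sim(Y[, 1], Y[, 4])

# Route 2: compare the documents' k-dimensional coordinates directly.
doc_coords  <- dk %*% diag(sk, nrow = k)   # one row per document
via_partial <- cos_sim(doc_coords[1, ], doc_coords[4, ])
```

Both routes give the same cosine because the columns of tk are orthonormal, so the inner products between rebuilt document columns equal those between the rows of doc_coords. The direct route works with k values per document instead of one value per term.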
textmatrix: a textmatrix representation of the latent semantic space.
Fridolin Wild f.wild@open.ac.uk
Berry, M., Dumais, S., and O'Brien, G. (1995) Using Linear Algebra for Intelligent Information Retrieval. In: SIAM Review, Vol. 37(4), pp. 573–595.
textmatrix, lsa, fold_in
# create some files
td = tempfile()
dir.create(td)
write( c("dog", "cat", "mouse"), file=paste(td, "D1", sep="/"))
write( c("hamster", "mouse", "sushi"), file=paste(td, "D2", sep="/"))
write( c("dog", "monster", "monster"), file=paste(td, "D3", sep="/"))
write( c("dog", "mouse", "dog"), file=paste(td, "D4", sep="/"))

# read files into a document-term matrix
myMatrix = textmatrix(td, minWordLength=1)

# create the latent semantic space
myLSAspace = lsa(myMatrix, dims=dimcalc_raw())

# display it as a textmatrix again
round(as.textmatrix(myLSAspace),2) # should give the original

# create the latent semantic space
myLSAspace = lsa(myMatrix, dims=dimcalc_share())

# display it as a textmatrix again
myNewMatrix = as.textmatrix(myLSAspace)
myNewMatrix # should look different!

# compare two terms with the cosine measure
cosine(myNewMatrix["dog",], myNewMatrix["cat",])

# compare two documents with pearson
cor(myNewMatrix[,1], myNewMatrix[,2], method="pearson")

# clean up
unlink(td, recursive=TRUE)