as_dsm_tm: Create DSM Object From 'tm' Package (wordspace)

as.dsm.tm    R Documentation

Create DSM Object From tm Package (wordspace)

Description

Convert a tm term-document or document-term matrix into a wordspace DSM object.

Usage


## S3 method for class 'TermDocumentMatrix'
as.dsm(obj, ..., verbose=FALSE)
## S3 method for class 'DocumentTermMatrix'
as.dsm(obj, ..., verbose=FALSE)

Arguments

obj

a term-document or document-term matrix from the tm package, i.e. an object of class TermDocumentMatrix or DocumentTermMatrix.

...

additional arguments are ignored

verbose

if TRUE, a few progress and information messages are shown
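
As a rough illustration of these arguments, the sketch below converts both supported input classes. It assumes that the tm and wordspace packages are installed and uses the crude corpus shipped with tm; the variable names are illustrative only. S3 dispatch on the class of obj selects the matching method, so the same as.dsm() call covers either orientation.

```r
library(tm)         # TermDocumentMatrix(), DocumentTermMatrix(), crude corpus
library(wordspace)  # as.dsm()

data(crude)  # sample corpus shipped with tm

tdm <- TermDocumentMatrix(crude)  # rows = terms, columns = documents
dtm <- DocumentTermMatrix(crude)  # rows = documents, columns = terms

# as.dsm() dispatches on the class of its first argument
dsm_from_tdm <- as.dsm(tdm)
dsm_from_dtm <- as.dsm(dtm, verbose=TRUE)  # additionally show progress messages

class(dsm_from_tdm)
class(dsm_from_dtm)
```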

Value

An object of class dsm.
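
One way to inspect the returned object is sketched below. It assumes that the conversion keeps the orientation of the input matrix (as the example on this page suggests) and relies on the dim() method that wordspace provides for dsm objects; the name model is only a placeholder.

```r
library(tm)
library(wordspace)

data(crude)
dtm <- DocumentTermMatrix(crude)  # rows = documents, columns = terms
model <- as.dsm(dtm)

inherits(model, "dsm")  # check the class of the result
dim(model)              # rows and columns of the DSM
dim(dtm)                # dimensions of the tm matrix, for comparison
head(model$rows)        # row information table of the DSM
head(model$cols)        # column information table of the DSM
```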

Author(s)

Stephanie Evert (https://purl.org/stephanie.evert)

See Also

as.dsm and the documentation of the tm package

Examples


## Not run: 
library(tm) # tm package needs to be installed
data(crude) # news messages on crude oil from Reuters corpus

cat(as.character(crude[[1]]), "\n") # a text example

corpus <- tm_map(crude, stripWhitespace) # some pre-processing
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

cat(as.character(corpus[[1]]), "\n") # pre-processed text

dtm <- DocumentTermMatrix(corpus) # document-term matrix
inspect(dtm[1:5, 90:99])   # rows = documents

wordspace_dtm <- as.dsm(dtm, verbose=TRUE) # convert to DSM
print(wordspace_dtm$M[1:5, 90:99]) # same part of dtm as above (raw counts are stored in $M)

wordspace_tdm <- t(wordspace_dtm) # convert to term-document matrix
print(wordspace_tdm)

## End(Not run)


wordspace documentation built on Aug. 23, 2022, 1:06 a.m.