corpus2dtm: From ISC corpus to a Document-Term Matrix
In paolofantini/Supreme: Make it easier applying LDA topic models to a corpus of Italian Supreme Court decisions

Description Usage Arguments Value Note Examples

corpus2dtm transforms a corpus of decisions from the Italian Supreme Court into a document-term matrix.

corpus2dtm(corpus, stopwords)

`corpus`	a corpus of decisions from the Italian Supreme Court.
`stopwords`	a character vector of stopwords.

dtm: a base document-term matrix restricted to terms that are at least 3 characters long and appear in at least 5 documents.
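
As a rough illustration of how these two thresholds can be read, the base R sketch below filters a small hand-made count matrix. The matrix `counts` and its terms are invented for the example, and treating "appear in at least 5 documents" as a document-frequency cutoff is an assumption; the package's own filtering may differ in detail.

```r
# Hypothetical count matrix: rows are documents, columns are terms.
counts <- cbind(
  corte   = c(2, 1, 1, 3, 1, 1),
  di      = c(4, 2, 3, 1, 2, 5),
  ricorso = c(1, 1, 0, 2, 1, 1),
  cassa   = c(0, 0, 1, 0, 0, 1)
)
rownames(counts) <- paste0("doc", seq_len(nrow(counts)))

# Term length filter: keep terms with at least 3 characters.
long_enough <- nchar(colnames(counts)) >= 3

# Document frequency: number of documents in which each term occurs.
doc_freq <- colSums(counts > 0)
frequent_enough <- doc_freq >= 5

# Keep only the columns that pass both filters.
base_counts <- counts[, long_enough & frequent_enough, drop = FALSE]
colnames(base_counts)
```

Here `di` fails the length filter and `cassa` fails the document-frequency filter, so only the remaining two columns are kept.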

Basic text-cleaning steps build the base document-term matrix by keeping only the terms (columns) that belong to a suitable vocabulary. Typically, this involves converting tokens to lower case, removing punctuation and numbers, stemming, removing stop words, and keeping only terms that reach a minimum length and occur in at least a minimum number of documents. Requires package tm, version 0.6 or later.
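
The steps listed above can be approximated directly with tm. The following is a minimal sketch on a toy corpus, not the package's actual implementation: the texts, `toy_stopwords`, and the lowered document-frequency bound of 2 are all assumptions chosen to keep the example small. Stemming is left out because in tm it relies on the additional SnowballC package.

```r
library(tm)

# Toy documents standing in for court decisions (invented for illustration).
texts <- c(
  "La Corte rigetta il ricorso n. 123.",
  "Il ricorso risulta inammissibile; la Corte condanna il ricorrente.",
  "La Corte accoglie il ricorso e cassa la sentenza."
)
toy_corpus <- VCorpus(VectorSource(texts))

# Hypothetical stop word list; the package ships its own italianStopWords.
toy_stopwords <- c("la", "il", "e", "n")

toy_dtm <- DocumentTermMatrix(
  toy_corpus,
  control = list(
    tolower           = TRUE,            # convert tokens to lower case
    removePunctuation = TRUE,            # drop punctuation characters
    removeNumbers     = TRUE,            # drop digits
    stopwords         = toy_stopwords,   # remove the listed stop words
    wordLengths       = c(3, Inf),       # minimum term length of 3
    bounds            = list(global = c(2, Inf))  # term in >= 2 documents
  )
)

# Terms that survive all filters.
Terms(toy_dtm)
```

For a real corpus, the lower value in `bounds$global` would be raised to the document-frequency threshold of interest, such as the 5 documents mentioned above.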

## Not run: 
library(Supreme)
data("corpus")
data("italianStopWords")  # Italian stop words to remove
dtm <- corpus2dtm(corpus, italianStopWords)

## End(Not run)

paolofantini/Supreme documentation built on May 24, 2019, 6:14 p.m.
