Make a table of where phrases appear in a corpus

Description

Generate an n by p phrase count matrix, where n is the number of documents and p is the number of phrases:

0  0  0  0  0
1  6  2  0  0
8  0  0  0  0

This is the phrase equivalent of a document-term matrix.

Usage

make.phrase.matrix(phrase_list, corpus)

Arguments

phrase_list

List of strings

corpus

A corpus object from the tm package.

Value

An n x p matrix of phrase counts, where n is the number of documents and p is the number of phrases.

See Also

Other textregCounting: make.count.table; phrase.count

Examples

library( tm )
data( bathtub )
lbl = meta( bathtub )$meth.chl
head( make.phrase.matrix( c("bathtub","strip+", "vapor *"), bathtub ) )
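
To make the shape of the result concrete, the following base R sketch builds a small documents-by-phrases count matrix by hand. It is only an illustration under stated assumptions: `count_phrases` is a hypothetical helper that is not part of this package, it works on a plain character vector rather than a tm corpus, and it counts literal matches only, so it does not reproduce whatever wildcard handling make.phrase.matrix applies to phrases such as "strip+" or "vapor *".

```r
# Hypothetical helper (not part of the package): counts literal,
# non-overlapping occurrences of each phrase in each document.
count_phrases <- function(phrases, docs) {
  counts <- sapply(phrases, function(p) {
    # gregexpr returns one vector of match positions per document,
    # or -1 when the phrase does not occur in that document
    hits <- gregexpr(p, docs, fixed = TRUE)
    sapply(hits, function(h) sum(h > 0))
  })
  # Force an n x p matrix even when there is only one document or phrase
  matrix(counts, nrow = length(docs), ncol = length(phrases),
         dimnames = list(NULL, phrases))
}

# Toy documents standing in for a corpus (illustrative data only)
docs <- c("the bathtub was full",
          "vapor rose from the bathtub near the other bathtub",
          "no match here")

m <- count_phrases(c("bathtub", "vapor"), docs)
print(m)
```

Rows correspond to documents and columns to phrases, matching the n by p layout described above.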
