A method designed for the PolMine corpora of plenary protocols. It splits a partition into speeches.
## S4 method for signature 'partition'
as.speeches(.Object, sAttributeDates, sAttributeNames,
  gap = 500, mc = FALSE, verbose = TRUE, progress = TRUE)
.Object | a partition object
sAttributeDates | the s-attribute that provides the dates of sessions
sAttributeNames | the s-attribute that provides the names of speakers
gap | number of tokens between strucs used to identify speeches, defaults to 500
mc | logical, whether to use multicore, defaults to FALSE
verbose | logical, defaults to TRUE
progress | logical, defaults to TRUE
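The role of gap can be illustrated with a small base R sketch. It is not the polmineR implementation. It assumes that consecutive regions of one speaker on one date belong to the same speech as long as the distance between them does not exceed gap, and that a larger distance starts a new speech. The data frame regions and its columns cpos_left and cpos_right are hypothetical.

```r
# Toy data (hypothetical): corpus positions of the regions of ONE speaker
# on ONE date, ordered by position.
regions <- data.frame(
  cpos_left  = c(0, 120, 900, 1000, 5000),
  cpos_right = c(100, 300, 950, 1400, 5200)
)
gap <- 500

# Distance (in corpus positions) between the end of each region
# and the start of the following one.
distance <- regions$cpos_left[-1] - regions$cpos_right[-nrow(regions)]

# Assumption: the first region always opens a speech; later regions open
# a new speech only if the distance to the previous region exceeds gap.
new_speech <- c(TRUE, distance > gap)

# Running count of speech beginnings gives a speech number per region.
regions$speech_no <- cumsum(new_speech)

print(regions)
```

With a larger gap, more regions are merged into the same speech; with a smaller one, interruptions split a speech into several.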
a partitionBundle object
## Not run: 
library(polmineR)
use("polmineR.sampleCorpus")
bt <- partition("PLPRBTTXT", text_year = "2009")
speeches <- as.speeches(bt, sAttributeDates = "text_date", sAttributeNames = "text_name")

# step-by-step, not the fastest way
speeches <- enrich(speeches, pAttribute = "word")
tdm <- as.TermDocumentMatrix(speeches, col = "count")

# fast option (counts performed when assembling the sparse matrix)
tdm <- as.TermDocumentMatrix(speeches, pAttribute = "word")

# collect the terms of the selected noise categories and drop them
termsToDropList <- noise(tdm)
whatToDrop <- c("stopwords", "specialChars", "numbers", "minNchar")
termsToDrop <- unlist(lapply(whatToDrop, function(x) termsToDropList[[x]]))
tdm <- trim(tdm, termsToDrop = termsToDrop)

## End(Not run)
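The step that assembles termsToDrop can be tried without a corpus. The sketch below uses a made-up named list in place of the return value of noise(); it assumes, as the example suggests, that the list holds one character vector per noise category. All values, and the category "other", are hypothetical.

```r
# Hypothetical stand-in for the list returned by noise(); values are made up.
termsToDropList <- list(
  stopwords    = c("der", "die", "und"),
  specialChars = c("--", "("),
  numbers      = c("2009", "17"),
  minNchar     = c("a", "ab"),
  other        = c("kept")  # category that is NOT selected below
)

# Pick the categories to drop and flatten them into one character vector.
whatToDrop <- c("stopwords", "specialChars", "numbers", "minNchar")
termsToDrop <- unlist(lapply(whatToDrop, function(x) termsToDropList[[x]]))

# Shorter base R equivalent: subset the list by name, then flatten.
termsToDrop2 <- unlist(termsToDropList[whatToDrop], use.names = FALSE)

print(termsToDrop)
```

Only the categories named in whatToDrop end up in the vector, so terms from any other category remain in the matrix after trimming.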