PreProcess: Pre-process texts to create a corpus suitable for...
In christophergandrud/ExpAgenda: Bayesian hierarchical expressed agenda estimation using Grimmer (2010)

Description Usage Arguments Value Source

PreProcess prepares texts and author information for use with ExpAgendaVonmon.

  PreProcess(textsDF = NULL, TextsCol, AuthorCol, IDCol,
    textsPattern = NULL, authorsDF = NULL,
    removeNumbers = TRUE, StopWords = NULL,
    removeAuthors = NULL, sparse = 0.4)

`textsDF`	a data frame containing a column with texts and a column with author names. Unnecessary if `textsPattern` and `authorsDF` are set.
`TextsCol`	character string identifying the column in `textsDF` with the texts.
`AuthorCol`	character string identifying the column in either `textsDF` or `authorsDF` identifying the authors.
`IDCol`	a character string with the column uniquely identifying each text either in `textsDF` or `authorsDF`.
`textsPattern`	character string. Regular expression pattern identifying the texts in `textsDF`. Unnecessary if `textsDF` is set.
`authorsDF`	a data frame with author information for each text in `textsDF`. Its rows must be in the same order as the texts. Unnecessary if `textsDF` is set.
`removeNumbers`	logical. Whether or not to remove numbers from the texts.
`StopWords`	character vector of stop words to remove. If `StopWords = NULL` (the default) then `tm`'s default English stop word list will be used. See `stopwords`.
`removeAuthors`	character vector. The names of authors to remove.
`sparse`	numeric for the maximal allowed sparsity of a term, i.e. the share of documents in which it does not appear. See `removeSparseTerms`.
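
The effect of `sparse` is easiest to see on a toy corpus. The sketch below uses base R only and assumes that `PreProcess` passes `sparse` unchanged to `tm`'s `removeSparseTerms`, which keeps a term only if it occurs in more than `n_docs * (1 - sparse)` documents. The corpus and the helper `keep_terms` are hypothetical and are not part of ExpAgenda; stop word removal is ignored here.

```r
# Toy corpus: one string per document (hypothetical data).
docs <- c(
  "tax reform and budget",
  "budget vote on health",
  "health care and budget",
  "tax cuts",
  "budget and tax plan",
  "budget debate"
)

# Tokenise on whitespace and count each term at most once per document.
tokens <- lapply(strsplit(docs, "\\s+"), unique)

# Number of documents in which each term appears.
doc_freq <- table(unlist(tokens))

# Hypothetical helper mirroring the removeSparseTerms rule: keep a term
# only if it occurs in more than n_docs * (1 - sparse) documents.
keep_terms <- function(doc_freq, n_docs, sparse) {
  names(doc_freq)[as.vector(doc_freq) > n_docs * (1 - sparse)]
}

# Default used by PreProcess: terms must appear in more than 60% of documents.
kept_default <- keep_terms(doc_freq, length(docs), sparse = 0.4)

# A looser threshold: terms must appear in more than 40% of documents.
kept_loose <- keep_terms(doc_freq, length(docs), sparse = 0.6)

print(kept_default)
print(kept_loose)
```

With the default of 0.4 the filter is strict, so small or heterogeneous corpora may lose most of their vocabulary. Raising `sparse` towards 1 retains rarer terms.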

Returns an object of class `ExpAgendaDTMatrix` that can be used with `ExpAgendaVonmon` to estimate authors' expressed agendas in documents. The object contains three matrices: `doc.term` is a document-term matrix, `authors` locates the authors of the texts in `doc.term`, and `authorID` is used by `DocTopics` to return the documents in their original order.
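
A minimal usage sketch follows, assuming the texts are already held in a data frame. The data frame `speeches` and its column names are hypothetical. The call is guarded so that it runs only when ExpAgenda is installed, and it has not been checked against a particular version of the package.

```r
# Hypothetical input: one row per text, with an ID, an author, and the text.
speeches <- data.frame(
  id = c("s1", "s2", "s3", "s4"),
  author = c("Smith", "Smith", "Jones", "Jones"),
  text = c(
    "We must pass the budget this year.",
    "The budget protects health spending.",
    "Tax reform comes before the budget.",
    "Health care needs a stable budget."
  ),
  stringsAsFactors = FALSE
)

# Run the pre-processing step only if the ExpAgenda package is available.
if (requireNamespace("ExpAgenda", quietly = TRUE)) {
  dtm <- ExpAgenda::PreProcess(
    textsDF = speeches,
    TextsCol = "text",     # column holding the texts
    AuthorCol = "author",  # column holding the author names
    IDCol = "id",          # column uniquely identifying each text
    removeNumbers = TRUE,
    sparse = 0.4
  )
  # The documentation describes the result as class ExpAgendaDTMatrix.
  print(class(dtm))
}
```

Because `textsDF` is supplied, `textsPattern` and `authorsDF` are left at their `NULL` defaults.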

I. Feinerer, K. Hornik, and D. Meyer. Text mining infrastructure in R. Journal of Statistical Software, 25(5):1-54, March 2008. http://www.jstatsoft.org/v25/i05.

christophergandrud/ExpAgenda documentation built on May 13, 2019, 7:01 p.m.
