phraseDoc: phraseDoc Creation

View source: R/phraseDoc.R

phraseDoc R Documentation

phraseDoc Creation

Description

Create an object of class phraseDoc. The object holds all principal phrases that occur at least a minimum number of times in a collection of texts, together with the texts they occur in and their positions within those texts.

Usage

phraseDoc(
  co,
  mn = 2,
  mx = 8,
  ssw = stopStartWords(),
  sew = stopEndWords(),
  sp = stopPhrases(),
  min.freq = 2,
  principal = function(phrase, freq) {
    freq >= min.freq
  },
  max.phrases = 1500,
  shiny = FALSE,
  silent = FALSE
)

Arguments

co

A corpus, or a character vector in which each element is the text of one document.

mn

Minimum number of words in a phrase.

mx

Maximum number of words in a phrase.

ssw

A set of words no phrase should start with.

sew

A set of words no phrase should end with.

sp

A set of phrases to be excluded.

min.freq

The minimum frequency of phrases to be included.

principal

Function that determines whether a phrase is a principal phrase. It takes the phrase and its frequency as arguments. The default returns TRUE if the phrase occurs at least min.freq times and FALSE otherwise.

max.phrases

Maximum number of phrases to be included.

shiny

TRUE if called from a shiny program, which allows progress to be shown on a progress meter. The function uses about 100 progress steps, so it should be called inside withProgress with the max argument set to at least 100.

silent

TRUE if you do not want progress messages.
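
The shiny argument described above can be sketched as follows. This is a minimal illustration, not code from the package: the app layout, the input and output ids (run, summary), and the texts vector are all assumptions made for the example. It assumes the shiny and phm packages are installed.

```r
library(shiny)
library(phm)

# Made-up documents for the illustration
texts <- c("This is a test text",
           "This is another test text",
           "This girl will test text that man")

# Hypothetical UI: one button to start, one area for the result
ui <- fluidPage(
  actionButton("run", "Build phraseDoc"),
  verbatimTextOutput("summary")
)

server <- function(input, output, session) {
  pd <- eventReactive(input$run, {
    # max = 100 follows the "at least 100" advice for the shiny argument
    withProgress(message = "Extracting phrases", max = 100, {
      phraseDoc(texts, shiny = TRUE, silent = TRUE)
    })
  })
  # Show only the class of the result, which the Value section documents
  output$summary <- renderPrint(class(pd()))
}

# Uncomment to launch the app interactively:
# shinyApp(ui, server)
```

The phraseDoc call sits inside withProgress so that a progress meter exists when the function reports its steps. Setting silent = TRUE here is a design choice for the sketch, on the assumption that console messages are not wanted in an app.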

Value

Object of class phraseDoc

Examples

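library(phm)

# Six short documents that share several repeated phrases
tst <- c("This is a test text",
         "This is a test text 2",
         "This is another test text",
         "This is another test text 2",
         "This girl will test text that man",
         "This boy will test text that man")

# Build the phraseDoc object with the default settings
phraseDoc(tst)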
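
A custom principal function can be supplied in place of the default. The sketch below is illustrative only: no_digit_principal and texts are hypothetical names, and the rule itself (at least 3 occurrences, no digits in the phrase) is an arbitrary choice. It uses vectorized operators so that it behaves the same whether it is handed one phrase or many, since the documentation does not say which.

```r
# Hypothetical predicate (not part of phm): a phrase is principal only if
# it occurs at least 3 times and contains no digits.
no_digit_principal <- function(phrase, freq) {
  freq >= 3 & !grepl("[0-9]", phrase)
}

# Direct check of the predicate on made-up phrases and frequencies
no_digit_principal(c("test text", "test text 2"), c(6, 2))

# Made-up documents for the illustration
texts <- c("This is a test text",
           "This is a test text 2",
           "This is another test text",
           "This is another test text 2")

# Pass the predicate to phraseDoc, limiting phrases to 2-4 words.
# Guarded so the snippet still runs when phm is not installed.
if (requireNamespace("phm", quietly = TRUE)) {
  pd <- phm::phraseDoc(texts, mn = 2, mx = 4,
                       principal = no_digit_principal, silent = TRUE)
}
```

Defining the predicate as a named function makes it easy to test on its own before running it over a large collection.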

phm documentation built on June 8, 2022, 1:05 a.m.