| lma_patcat | R Documentation | 
Categorize raw texts using a pattern-based dictionary.
lma_patcat(text, dict = NULL, pattern.weights = "weight",
  pattern.categories = "category", bias = NULL, to.lower = TRUE,
  return.dtm = FALSE, drop.zeros = FALSE, exclusive = TRUE,
  boundary = NULL, fixed = TRUE, globtoregex = FALSE,
  name.map = c(intname = "_intercept", term = "term"),
  dir = getOption("lingmatch.dict.dir"))
| text | A vector of text to be categorized. Texts are padded by 2 spaces, and potentially lowercased. | 
| dict | At least a vector of terms (patterns), usually a matrix-like object with columns for terms, categories, and weights. | 
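| pattern.weights | A vector of weights corresponding to terms in dict, or the name of the column in dict that contains weights. | 
| pattern.categories | A vector of category names corresponding to terms in dict, or the name of the column in dict that contains category names. | 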
| bias | A constant to add to each category after weighting and summing. Can be a vector with names
corresponding to the unique values in the category column of dict. | 
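| to.lower | Logical indicating whether text should be converted to lowercase before matching. | 
| return.dtm | Logical; if TRUE, a document-term matrix is returned rather than weighted and summed category values. | 
| drop.zeros | Logical; if TRUE, categories or terms with no matches are removed. | 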
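| exclusive | Logical; if FALSE, each dictionary term is searched for in the original text. Otherwise (the default), longer terms are searched for first and their matches are removed from the text, so a matched span cannot be matched again by a later term. | 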
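| boundary | A string to add to the beginning and end of each dictionary term. If TRUE, a single space is used, which avoids matches within words. By default, dictionary terms are left as entered. | 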
| fixed | Logical; if FALSE, patterns are treated as regular expressions. | 
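| globtoregex | Logical; if TRUE, initial and terminal asterisks in terms are converted to regular-expression wildcards, and the patterns are then treated as regular expressions. | 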
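| name.map | A named character vector: intname is the term that identifies category biases within the term list (default "_intercept"), and term is the name of the column in dict that contains terms (default "term").
 Missing names are added, so names can be specified positionally, or only some can be specified by name, leaving the rest at their defaults. | 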
| dir | Path to a folder in which to look for dict if it is given as the name of a dictionary file. | 
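The effect of exclusive matching can be illustrated with a small base R sketch. This is not lingmatch's implementation: count_patterns is a hypothetical helper, and replacing each match with a space is an assumption made only to show why overlapping terms are counted differently.

```r
# Hypothetical helper, not part of lingmatch: counts fixed-pattern matches in
# one text, optionally removing each match before the next term is tried.
count_patterns <- function(text, terms, exclusive = TRUE) {
  # pad and lowercase, loosely mirroring the description of the text argument
  text <- paste0("  ", tolower(text), "  ")
  # when exclusive, try longer terms first so they claim their matches
  if (exclusive) terms <- terms[order(-nchar(terms))]
  counts <- setNames(integer(length(terms)), terms)
  for (term in terms) {
    hits <- gregexpr(term, text, fixed = TRUE)[[1]]
    counts[[term]] <- sum(hits > 0) # gregexpr returns -1 when nothing matches
    # assumption: a matched span is blanked out so shorter terms cannot reuse it
    if (exclusive) text <- gsub(term, " ", text, fixed = TRUE)
  }
  counts
}

example_text <- "I love that you. I lost."
# exclusive: "i love" is counted once, leaving "i lo" only the "i lost" match
count_patterns(example_text, c("i lo", "i love"))
# non-exclusive: "i lo" also matches inside "i love"
count_patterns(example_text, c("i lo", "i love"), exclusive = FALSE)
```

In the exclusive call each term is counted once, whereas in the non-exclusive call the shorter term is counted twice because it also occurs inside the longer one.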
A matrix with a row per text and columns per dictionary category, or (when return.dtm = TRUE)
a sparse matrix with a row per text and column per term. Includes a WC attribute with original
word counts, and a categories attribute with row indices associated with each category if
return.dtm = TRUE.
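The relationship between the two return types can be sketched in base R: category values are term counts that have been weighted, summed within each category, and offset by a bias. The counts, weights, and biases below are made up for illustration, and the code is a sketch of that arithmetic rather than the package's own routine.

```r
# Toy document-term matrix: 2 texts by 3 terms (made-up counts).
dtm <- matrix(
  c(2, 0, 1, 1, 0, 3),
  nrow = 2,
  dimnames = list(c("text1", "text2"), c("was", "will", "now"))
)

# Toy dictionary with the default column names from the usage above.
dict <- data.frame(
  term = c("was", "will", "now"),
  category = c("past", "future", "present"),
  weight = c(1, 0.5, 2)
)

# Illustrative per-category biases (named by category).
bias <- c(past = 0.1, future = 0, present = -0.2)

# For each category: weight its term counts, sum them, then add the bias.
scores <- sapply(unique(dict$category), function(ctg) {
  rows <- dict$category == ctg
  drop(dtm[, dict$term[rows], drop = FALSE] %*% dict$weight[rows]) + bias[[ctg]]
})
scores
```

The result has one row per text and one column per category, which is the shape described above for the default return value.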
For applying term-based dictionaries (to a document-term matrix) see lma_termcat().
Other Dictionary functions: 
dictionary_meta(),
download.dict(),
lma_termcat(),
read.dic(),
report_term_matches(),
select.dict()
# example text
text <- c(
  paste(
    "Oh, what youth was! What I had and gave away.",
    "What I took and spent and saw. What I lost. And now? Ruin."
  ),
  paste(
    "God, are you so bored?! You just want what's gone from us all?",
    "I miss the you that was too. I love that you."
  ),
  paste(
    "Tomorrow! Tomorrow--nay, even tonight--you wait, as I am about to change.",
    "Soon I will off to revert. Please wait."
  )
)
# make a document-term matrix with pre-specified terms only
lma_patcat(text, c("bored?!", "i lo", ". "), return.dtm = TRUE)
# get counts of sets of letters
lma_patcat(text, list(c("a", "b", "c"), c("d", "e", "f")))
# same thing with regular expressions
lma_patcat(text, list("[abc]", "[def]"), fixed = FALSE)
# match only words
lma_patcat(text, list("i"), boundary = TRUE)
# match only words, ignoring punctuation
lma_patcat(
  text, c("you", "tomorrow", "was"),
  fixed = FALSE,
  boundary = "\\b", return.dtm = TRUE
)
## Not run: 
# read in the temporal orientation lexicon from the World Well-Being Project
tempori <- read.csv(paste0(
  "https://raw.githubusercontent.com/wwbp/lexica/master/",
  "temporal_orientation/temporal_orientation_lexicon.csv"
))
lma_patcat(text, tempori)
# or use the standardized version
tempori_std <- read.dic("wwbp_prospection", dir = "~/Dictionaries")
lma_patcat(text, tempori_std)
## get scores on the same scale by adjusting the standardized values
tempori_std[, -1] <- tempori_std[, -1] / 100 *
  select.dict("wwbp_prospection")$selected[, "original_max"]
lma_patcat(text, tempori_std)[, unique(tempori$category)]
## End(Not run)
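The difference between the two "match only words" examples above can be seen with base R alone. The sketch below uses count_hits, a hypothetical helper that is not part of lingmatch, and assumes that a boundary string is simply pasted onto both ends of the term before matching.

```r
# Pad and lowercase one of the example texts, as described for the text argument.
padded <- paste0("  ", tolower("I miss the you that was too. I love that you."), "  ")

# Hypothetical helper: number of matches of one pattern in one text.
count_hits <- function(pattern, text, fixed) {
  hits <- gregexpr(pattern, text, fixed = fixed, perl = !fixed)[[1]]
  sum(hits > 0) # gregexpr returns -1 when nothing matches
}

# bare term: matches both occurrences of "you"
count_hits("you", padded, fixed = TRUE)
# space boundary: misses "you." because a period follows instead of a space
count_hits(paste0(" ", "you", " "), padded, fixed = TRUE)
# regex word boundary: matches both, since punctuation also ends a word
count_hits(paste0("\\b", "you", "\\b"), padded, fixed = FALSE)
```

A space boundary is compatible with fixed matching but is sensitive to adjacent punctuation, while a regular-expression boundary such as "\\b" requires fixed = FALSE.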