count_chars_words: Count the frequency of characters and words in a string of...
In ds4psy: Data Science for Psychologists

count_chars_words

R Documentation

Count the frequency of characters and words in a string of text `x`.

Description

count_chars_words provides frequency counts of the characters and words of a string of text x on a per character basis.

Usage

count_chars_words(x, case_sense = TRUE, sep = "|", rm_sep = TRUE)

Arguments

`x`	A string of text (required).
`case_sense`	Boolean: Distinguish lower- vs. uppercase characters? Default: `case_sense = TRUE`.
`sep`	Dummy character(s) to insert between elements/lines when parsing a multi-element character vector `x` as input. This character is inserted to mark word boundaries in multi-element inputs `x` (without punctuation at the boundary). It should NOT occur anywhere in `x`, so that it can be removed again (by `rm_sep = TRUE`). Default: `sep = "|"` (i.e., insert a vertical bar between lines).
`rm_sep`	Should `sep` be removed from output? Default: `rm_sep = TRUE`.

Details

count_chars_words calls both count_chars and count_words and maps their results to a data frame that contains a row for each character of x.

By default (case_sense = TRUE), the quantifications are case-sensitive. Special characters (e.g., parentheses, punctuation, and spaces) are counted as characters, but removed from word counts.

If input x consists of multiple text strings, they are collapsed with an added " " (space) between them.
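The per-character mapping described above can be illustrated with a minimal base R sketch. This is not the ds4psy implementation: the helper name sketch_chars_words is hypothetical, a "word" is assumed to be a maximal run of alphanumeric characters, characters outside any word are assumed to get NA in the word columns, and the input is assumed to be non-empty. The sep and rm_sep arguments are not modelled.

```r
# Hypothetical helper (NOT part of ds4psy): sketches one way to map
# character and word frequencies onto one row per character.
sketch_chars_words <- function(x, case_sense = TRUE) {
  # Collapse multi-element input with a space between elements
  txt <- paste(x, collapse = " ")
  if (!case_sense) txt <- tolower(txt)

  # One element per character, plus each character's frequency
  chars <- strsplit(txt, split = "")[[1]]
  char_tab <- table(chars)
  char_freq <- as.integer(char_tab)[match(chars, names(char_tab))]

  # Assumption: a word is a maximal run of alphanumeric characters
  is_word_char <- grepl("[[:alnum:]]", chars)
  runs <- rle(is_word_char)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  run_text <- substring(txt, starts, ends)
  run_text[!runs$values] <- NA_character_  # non-word runs carry no word

  # Repeat each run's text once per character it covers
  word <- rep(run_text, runs$lengths)

  # Word frequencies, looked up for every character row
  word_tab <- table(run_text[runs$values])
  word_freq <- as.integer(word_tab)[match(word, names(word_tab))]

  data.frame(char = chars, char_freq = char_freq,
             word = word, word_freq = word_freq,
             stringsAsFactors = FALSE)
}

res <- sketch_chars_words("This test is to test this function.",
                          case_sense = FALSE)
head(res)
```

In this sketch, every row of a word repeats that word and its frequency, so the result has as many rows as the collapsed text has characters. The actual function may treat word boundaries and special characters differently.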

Value

A data frame with 4 variables (char, char_freq, word, word_freq).

Examples

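# Single string: inspect the first rows of the result
s1 <- "This test is to test this function."
head(count_chars_words(s1))
head(count_chars_words(s1, case_sense = FALSE))  # ignore case

# Multi-element character vector: inspect the last rows of the result
s3 <- c("A 1st sentence.", "The 2nd sentence.", 
        "A 3rd --- and also THE  FINAL --- SENTENCE.")
tail(count_chars_words(s3))
tail(count_chars_words(s3, case_sense = FALSE))  # ignore case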

ds4psy documentation built on Sept. 15, 2023, 9:08 a.m.