count_chars_words: Count the frequency of characters and words in a string of...

View source: R/text_fun.R

count_chars_words    R Documentation

Count the frequency of characters and words in a string of text x.

Description

count_chars_words provides frequency counts of the characters and words in a string of text x, reported on a per-character basis.

Usage

count_chars_words(x, case_sense = TRUE, sep = "|", rm_sep = TRUE)

Arguments

x

A string of text (required).

case_sense

Boolean: Distinguish lower- vs. uppercase characters? Default: case_sense = TRUE.

sep

Dummy character(s) to insert between elements/lines when parsing a multi-element character vector x as input. This character is inserted to mark word boundaries in multi-element inputs x (without punctuation at the boundary). It should NOT occur anywhere in x, so that it can be removed again (by rm_sep = TRUE). Default: sep = "|" (i.e., insert a vertical bar between lines).

rm_sep

Should sep be removed from output? Default: rm_sep = TRUE.
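
The role of sep can be illustrated with a small base-R sketch. It does not call count_chars_words or reproduce its internals; the variable names and the word-splitting pattern are assumptions chosen for illustration. It shows why a boundary marker is needed when the elements of x do not end in punctuation, and how the marker can be removed again provided it occurs nowhere else in x.

```r
# Illustration only (base R): why a separator is needed between elements.
x <- c("A 1st sentence", "The 2nd sentence")  # no punctuation at the boundary

# Joining without a separator merges the words at the boundary:
joined_plain <- paste(x, collapse = "")
words_plain  <- strsplit(joined_plain, "[^[:alnum:]]+")[[1]]

# Joining with a dummy separator keeps the boundary visible:
joined_sep <- paste(x, collapse = "|")
words_sep  <- strsplit(joined_sep, "[^[:alnum:]]+")[[1]]

# Because "|" occurs nowhere else in x, it can be removed again safely:
restored <- gsub("|", "", joined_sep, fixed = TRUE)

print(words_plain)  # contains the merged word "sentenceThe"
print(words_sep)    # "sentence" and "The" stay separate
```

If sep did occur in x, removing it afterwards would also delete genuine characters, which is why the argument description asks for a character that x does not contain.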

Details

count_chars_words calls both count_chars and count_words and maps their results to a data frame that contains a row for each character of x.

By default, the counts are case-sensitive (set case_sense = FALSE to ignore case). Special characters (e.g., parentheses, punctuation, and spaces) are counted as characters, but are removed from word counts.

If input x consists of multiple text strings, they are collapsed with an added " " (space) between them.
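
To make the shape of the result concrete, here is a minimal base-R sketch of the idea described above: one row per character, each paired with its own frequency and with the word it belongs to. This is not the ds4psy implementation. The helper name sketch_chars_words is hypothetical, and treating runs of alphanumeric characters as words is an assumption that may differ from how count_words handles edge cases.

```r
# Hypothetical helper (NOT part of ds4psy): a base-R sketch of the idea.
sketch_chars_words <- function(x, case_sense = TRUE) {
  txt <- paste(x, collapse = " ")          # collapse multiple strings
  if (!case_sense) txt <- tolower(txt)     # optionally ignore case
  chars <- strsplit(txt, "", fixed = TRUE)[[1]]

  # Character frequencies, looked up for every character position:
  ctab <- table(chars)
  char_freq <- as.integer(ctab)[match(chars, names(ctab))]

  # Assumption: a word is a run of alphanumeric characters.
  is_word_char <- grepl("[[:alnum:]]", chars)
  starts  <- is_word_char & !c(FALSE, head(is_word_char, -1))
  word_id <- cumsum(starts)
  word_id[!is_word_char] <- NA             # special characters have no word

  # Rebuild each word from its characters, then map words back to rows:
  words <- vapply(split(chars[is_word_char], word_id[is_word_char]),
                  paste, character(1), collapse = "")
  word <- unname(words[word_id])

  # Word frequencies, looked up for every character position:
  wtab <- table(words)
  word_freq <- as.integer(wtab)[match(word, names(wtab))]

  data.frame(char = chars, char_freq = char_freq,
             word = word, word_freq = word_freq,
             stringsAsFactors = FALSE)
}

res <- sketch_chars_words("This test is to test this function.")
head(res)
```

In this sketch, rows for spaces and punctuation carry NA in word and word_freq, which mirrors the statement that special characters are counted as characters but not as words.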

Value

A data frame with 4 variables (char, char_freq, word, word_freq).

See Also

count_chars for counting the frequency of characters; count_words for counting the frequency of words; plot_chars for a character plotting function.

Other text objects and functions: Umlaut, capitalize(), caseflip(), cclass, chars_to_text(), collapse_chars(), count_chars(), count_words(), invert_rules(), l33t_rul35, map_text_chars(), map_text_coord(), map_text_regex(), metachar, read_ascii(), text_to_chars(), text_to_sentences(), text_to_words(), transl33t(), words_to_text()

Examples

library(ds4psy)  # provides count_chars_words()

# A single string:
s1 <- "This test is to test this function."
head(count_chars_words(s1))
head(count_chars_words(s1, case_sense = FALSE))  # ignore case

# A multi-element character vector:
s3 <- c("A 1st sentence.", "The 2nd sentence.", 
        "A 3rd --- and also THE  FINAL --- SENTENCE.")
tail(count_chars_words(s3))
tail(count_chars_words(s3, case_sense = FALSE))  # ignore case


ds4psy documentation built on Sept. 15, 2023, 9:08 a.m.