is_gibberish: Gibberish detection for textual data

View source: R/detect_gibber.R

Description

Assesses whether text contains gibberish words. For each word in a sentence, a Markov chain inspects the sequence of vowels and consonants to estimate whether the sentence consists of natural words. Words such as 'asdfg' and 'dfrgfh' are therefore considered unnatural and classified as gibberish. The model becomes more reliable as the text contains more characters.
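
To make the idea concrete, the following is a minimal sketch of a two-state Markov chain over vowel/consonant sequences. It is not the gibberlite implementation: the helper names (to_vc, fit_vc_chain, vc_score), the tiny training vocabulary, the add-one smoothing, and the geometric-mean score are all assumptions chosen for illustration, and the resulting scores are not on the same scale as the package's threshold.

```r
# Illustrative sketch only; NOT the gibberlite implementation.
# All function names and the training words below are hypothetical.

# Map each letter of a word to "v" (vowel) or "c" (consonant); drop non-letters.
to_vc <- function(word) {
  chars <- strsplit(tolower(word), "")[[1]]
  chars <- chars[chars %in% letters]
  ifelse(chars %in% c("a", "e", "i", "o", "u"), "v", "c")
}

# Estimate a 2 x 2 transition matrix from training words.
# Counts start at 1 (add-one smoothing) so no transition has probability 0.
fit_vc_chain <- function(words) {
  counts <- matrix(1, nrow = 2, ncol = 2,
                   dimnames = list(from = c("c", "v"), to = c("c", "v")))
  for (w in words) {
    s <- to_vc(w)
    if (length(s) < 2) next
    for (i in seq_len(length(s) - 1)) {
      counts[s[i], s[i + 1]] <- counts[s[i], s[i + 1]] + 1
    }
  }
  counts / rowSums(counts)  # each row now sums to 1
}

# Score a word as the geometric mean of its transition probabilities.
# Higher values mean the vowel/consonant pattern looks more like the training words.
vc_score <- function(word, trans) {
  s <- to_vc(word)
  if (length(s) < 2) return(NA_real_)
  idx <- cbind(s[-length(s)], s[-1])  # (from, to) pairs along the word
  exp(mean(log(trans[idx])))
}

# Tiny made-up training vocabulary, for demonstration only.
trans <- fit_vc_chain(c("they", "want", "to", "know", "sentence",
                        "natural", "reliable", "model"))

vc_score("natural", trans)
vc_score("asdfg", trans)
```

In this sketch, a word dominated by consonant-to-consonant transitions, such as 'asdfg', scores lower than a word that alternates vowels and consonants. With so few characters per word the estimate is noisy, which is consistent with the note above that reliability improves with longer text.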

Usage

is_gibberish(text, threshold = 0.00345)

Arguments

text

character. Words or sentences.

threshold

numeric. Cutoff used to classify text as gibberish or not. Default: 0.00345.

Value

logical

Examples

text <- c("They don't want to know", "asdfg")
is_gibberish(text)

Glender/gibberlite documentation built on Dec. 17, 2021, 10:21 p.m.