This function combines several of koRpus' methods to extract the 9-Feature Set for
authorship detection (Brennan, Afroz & Greenstadt, 2011; Brennan & Greenstadt, 2009).
textFeatures(text, hyphen = NULL)
text: An object of class
hyphen: An object of class
A data.frame:
Number of unique words (tokens)
Complexity (TTR)
Sentence count
Average sentence length
Average syllable count
Character count (all characters, including spaces)
Letter count (without spaces, punctuation and digits)
Gunning FOG index
Flesch Reading Ease index
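Three of the features listed above (TTR, Gunning FOG and Flesch Reading Ease) can be derived from simple counts. The following is a minimal base-R sketch of the standard published formulas, not koRpus' own implementation: `feature_sketch` and its arguments are hypothetical names, the counts are supplied by hand, and lowercasing tokens before counting unique words is an assumption. koRpus tokenizes, counts syllables and identifies complex words in its own way, so `textFeatures()` may return different values for the same text.

```r
# Hypothetical helper, not part of koRpus: computes three of the nine
# features from raw counts using the commonly published formulas.
feature_sketch <- function(tokens, n_sentences, n_syllables, n_complex) {
  n_words <- length(tokens)
  # assumption: unique words are counted case-insensitively
  n_types <- length(unique(tolower(tokens)))
  avg_sentence_length <- n_words / n_sentences
  c(
    # type-token ratio: unique words divided by all words
    TTR = n_types / n_words,
    # Gunning FOG: 0.4 * (average sentence length + percentage of complex words)
    FOG = 0.4 * (avg_sentence_length + 100 * n_complex / n_words),
    # Flesch Reading Ease, original English formula
    FRE = 206.835 - 1.015 * avg_sentence_length -
      84.6 * n_syllables / n_words
  )
}

# toy input: one sentence, six one-syllable words, no complex words
tokens <- c("the", "cat", "sat", "on", "the", "mat")
res <- feature_sketch(tokens, n_sentences = 1, n_syllables = 6, n_complex = 0)
print(res)
```

The sketch takes syllable and complex-word counts as inputs because producing them requires hyphenation, which is the step the `hyphen` argument of `textFeatures()` relates to.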
Brennan, M., Afroz, S., & Greenstadt, R. (2011). Deceiving authorship detection. Presentation at 28th Chaos Communication Congress (28C3), Berlin, Germany.
Brennan, M. & Greenstadt, R. (2009). Practical Attacks Against Authorship Recognition Techniques. In Proceedings of the Twenty-First Conference on Innovative Applications of Artificial Intelligence (IAAI), Pasadena, CA.
Tweedie, F.J., Singh, S., & Holmes, D.I. (1996). Neural Network Applications in Stylometry: The Federalist Papers. Computers and the Humanities, 30, 1–10.
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  textFeatures(tokenized.obj)
} else {}