textFeatures: Extract text features for authorship analysis
In koRpus: Text Analysis with Emphasis on POS Tagging, Readability, and Lexical Diversity


This function combines several of koRpus' methods to extract the 9-Feature Set for authorship detection (Brennan, Afroz & Greenstadt, 2011; Brennan & Greenstadt, 2009).

textFeatures(text, hyphen = NULL)

`text`	An object of class `kRp.text`. Can also be a list of these objects, if you want to analyze more than one text at once.
`hyphen`	An object of class `kRp.hyphen`, if `text` has already been hyphenated. If `text` is a list and `hyphen` is not `NULL`, it must also be a list with one object for each text, in the same order.
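The list form of both arguments might be used roughly as follows. This is a minimal sketch, assuming the `koRpus.lang.en` package is installed; the two sample strings and the names `texts`, `hyphens` and `features` are made up for illustration. It hyphenates each text once up front and passes the results in the same order as the texts.

```r
# Sketch only: the sample strings below are invented for illustration.
features <- NULL
if (require("koRpus.lang.en", quietly = TRUE)) {
  # tokenize two character strings directly (format = "obj")
  texts <- list(
    tokenize("This is a short sample text. It has two sentences.",
             format = "obj", lang = "en"),
    tokenize("Another tiny example follows here. It is also rather brief.",
             format = "obj", lang = "en")
  )
  # one kRp.hyphen object per text, in the same order as `texts`
  hyphens <- lapply(texts, hyphen, quiet = TRUE)
  features <- textFeatures(texts, hyphen = hyphens)
}
```

Supplying `hyphen` is optional. It mainly saves repeating the hyphenation step when the same texts are analyzed more than once.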

A data.frame:

uniqWd: Number of unique words (tokens)
cmplx: Complexity (TTR)
sntCt: Sentence count
sntLen: Average sentence length
syllCt: Average syllable count
charCt: Character count (all characters, including spaces)
lttrCt: Letter count (without spaces, punctuation and digits)
FOG: Gunning FOG index
flesch: Flesch Reading Ease index
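To make three of these measures concrete, here is a minimal base R sketch of the type-token ratio and the standard published formulas for Flesch Reading Ease and Gunning FOG. The helper names and the toy counts are assumptions for illustration only. koRpus computes its values through its own methods, which apply additional rules (for example, how complex words are identified), so results from this sketch need not match the package's output.

```r
# Illustrative helpers, not part of koRpus.

# TTR: number of distinct tokens divided by total number of tokens
type_token_ratio <- function(tokens) {
  tokens <- tolower(tokens)
  length(unique(tokens)) / length(tokens)
}

# Flesch Reading Ease (standard English formula)
flesch_reading_ease <- function(n_words, n_sentences, n_syllables) {
  206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (n_syllables / n_words)
}

# Gunning FOG; "complex" words are commonly those with three or more syllables
gunning_fog <- function(n_words, n_sentences, n_complex_words) {
  0.4 * ((n_words / n_sentences) + 100 * (n_complex_words / n_words))
}

# toy input: 6 tokens, 5 of them distinct
toy_tokens <- c("the", "cat", "sat", "on", "the", "mat")
ttr <- type_token_ratio(toy_tokens)

# toy counts: 100 words, 5 sentences, 150 syllables, 10 complex words
fre <- flesch_reading_ease(n_words = 100, n_sentences = 5, n_syllables = 150)
fog <- gunning_fog(n_words = 100, n_sentences = 5, n_complex_words = 10)
```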

Brennan, M., Afroz, S., & Greenstadt, R. (2011). Deceiving authorship detection. Presentation at 28th Chaos Communication Congress (28C3), Berlin, Germany.

Brennan, M. & Greenstadt, R. (2009). Practical Attacks Against Authorship Recognition Techniques. In Proceedings of the Twenty-First Conference on Innovative Applications of Artificial Intelligence (IAAI), Pasadena, CA.

Tweedie, F.J., Singh, S., & Holmes, D.I. (1996). Neural Network Applications in Stylometry: The Federalist Papers. Computers and the Humanities, 30, 1–10.

# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  textFeatures(tokenized.obj)
} else {}

koRpus documentation built on May 18, 2021, 1:13 a.m.
