enron.sample | R Documentation |
A small sample of the Enron corpus comprising ten authors with approximately the same amount of data. Each author has one text labelled as 'unknown' and the other texts labelled as 'known'. The data was pre-processed using the POSnoise algorithm to mask content (see contentmask()
).
enron.sample
A quanteda
corpus object.
Halvani, Oren. 2021. Practice-Oriented Authorship Verification. Technical University of Darmstadt PhD Thesis. https://tuprints.ulb.tu-darmstadt.de/19861/
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.