delta | R Documentation |
This function runs a Cosine Delta analysis (Smith and Aldridge 2011; Evert et al. 2017).
delta(
q.data,
k.data,
tokens = "word",
remove_punct = FALSE,
remove_symbols = TRUE,
remove_numbers = TRUE,
lowercase = TRUE,
n = 1,
trim = TRUE,
threshold = 150,
features = FALSE,
cores = NULL
)
q.data |
The questioned or disputed data, either as a corpus (the output of |
k.data |
The known or undisputed data, either as a corpus (the output of |
tokens |
The type of tokens to extract, either "word" (default) or "character". |
remove_punct |
A logical value. FALSE (default) keeps punctuation marks. |
remove_symbols |
A logical value. TRUE (default) removes symbols. |
remove_numbers |
A logical value. TRUE (default) removes numbers. |
lowercase |
A logical value. TRUE (default) transforms all tokens to lower case. |
n |
The order or size of the n-grams being extracted. Default is 1. |
trim |
A logical value. If TRUE (default) then only the most frequent tokens are kept. |
threshold |
A numeric value indicating how many most frequent tokens to keep if trim = TRUE. The default is 150. |
features |
Logical with default FALSE. If TRUE, then the output will contain the features used. |
cores |
The number of cores to use for parallel processing. The default, NULL, uses one core. |
If features is set to FALSE then the output is a data frame containing the results of all comparisons between the Q texts and the K texts. If features is set to TRUE then the output is a list containing the results data frame and the vector of features used for the analysis.
Evert, Stefan, Thomas Proisl, Fotis Jannidis, Isabella Reger, Steffen Pielström, Christof Schöch & Thorsten Vitt. 2017. Understanding and explaining Delta measures for authorship attribution. Digital Scholarship in the Humanities 32. ii4–ii16. https://doi.org/10.1093/llc/fqx023.
Smith, Peter W H & W Aldridge. 2011. Improving Authorship Attribution: Optimizing Burrows’ Delta Method*. Journal of Quantitative Linguistics 18(1). 63–88. https://doi.org/10.1080/09296174.2011.533591.
Q <- enron.sample[c(5:6)]
K <- enron.sample[-c(5:6)]
delta(Q, K)
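To make the idea behind the measure more concrete, the following base R sketch computes a Cosine Delta style distance on a toy corpus: relative frequencies of the most frequent tokens are z-scored across documents, and documents are then compared by cosine distance (Evert et al. 2017). This is only an illustration under stated assumptions, not the code that delta() runs. The helper names top_tokens, rel_freq and cosine_delta and the toy docs list are hypothetical and not part of the package, and dropping zero-variance features is a simplification chosen for this sketch.

```r
# Toy corpus: three short "documents" as token vectors (illustrative data only).
docs <- list(
  q1 = c("the", "cat", "sat", "on", "the", "mat", "and", "the", "dog"),
  k1 = c("the", "dog", "sat", "on", "the", "log", "and", "the", "cat"),
  k2 = c("a", "bird", "flew", "over", "a", "tree", "and", "a", "hill")
)

# Hypothetical helper: the `threshold` most frequent tokens in the whole corpus.
top_tokens <- function(docs, threshold) {
  counts <- sort(table(unlist(docs)), decreasing = TRUE)
  names(counts)[seq_len(min(threshold, length(counts)))]
}

# Hypothetical helper: documents-by-features matrix of relative frequencies.
rel_freq <- function(docs, feats) {
  m <- do.call(rbind, lapply(docs, function(d) {
    as.numeric(table(factor(d, levels = feats))) / length(d)
  }))
  colnames(m) <- feats
  m
}

# Hypothetical helper: pairwise cosine distance between z-scored profiles.
cosine_delta <- function(docs, threshold = 150) {
  m <- rel_freq(docs, top_tokens(docs, threshold))
  mu <- colMeans(m)
  sigma <- apply(m, 2, sd)
  # Assumption of this sketch: features with (near-)zero variance are dropped,
  # because they cannot be z-scored and carry no contrast between documents.
  keep <- sigma > 1e-12
  z <- sweep(sweep(m[, keep, drop = FALSE], 2, mu[keep]), 2, sigma[keep], "/")
  norms <- sqrt(rowSums(z^2))
  1 - (z %*% t(z)) / (norms %o% norms)
}

d <- cosine_delta(docs)
round(d, 3)
```

In this toy corpus q1 shares most of its vocabulary with k1, so the q1 to k1 distance comes out smaller than the q1 to k2 distance. Real analyses need far more text per document than this for the z-scores to be meaningful.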