specificities | R Documentation |
Calculate the specificity (or association, or surprise) score of a word being present f times or more in a sub-corpus of t words, given that it appears a total of F times in a whole corpus of T words.
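The quantity described above can be sketched in base R. Lafon's (1980) model treats the sub-corpus as t tokens drawn without replacement from the T tokens of the corpus, so the number of occurrences of the word follows a hypergeometric distribution. The helper name upper_tail_prob and the counts below are illustrative assumptions, not part of the package. The score returned by specificities may be a transformation (for instance a signed log scale) of such a probability.

```r
# Hypothetical helper (not part of the package): probability of observing
# the word f times or more in a sub-corpus of t tokens, given F occurrences
# in a whole corpus of T tokens, under a hypergeometric model.
upper_tail_prob <- function(f, t, F, T) {
  # phyper(q, m, n, k): m = occurrences of the word in the corpus,
  # n = all other tokens, k = number of tokens in the sub-corpus.
  # lower.tail = FALSE gives P(X > q), so q = f - 1 yields P(X >= f).
  phyper(f - 1, m = F, n = T - F, k = t, lower.tail = FALSE)
}

# Toy counts, invented for illustration only.
p <- upper_tail_prob(f = 8, t = 100, F = 20, T = 1000)
print(p)
```

A small probability means the word is more frequent in the sub-corpus than expected under random distribution of its F occurrences.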
specificities(lexicaltable, types=NULL, parts=NULL)
lexicaltable |
a complete lexical table, i.e. a numeric matrix where each line represents a word and each column a part of the corpus. Each cell gives the frequency of the given word in the corresponding part of the corpus. |
types |
list of rows (words) for which the specificity score must be calculated.
If |
parts |
list of columns (parts) for which the specificity score must be calculated.
If |
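To make the notation concrete, a lexical table of the kind expected for lexicaltable can be built by hand in base R. The matrix below is a toy example with invented counts, and the names toy, t_sub, F_tot and T_tot are illustrative only. It shows how the four quantities f, t, F and T are read off such a table.

```r
# Toy lexical table, invented for illustration:
# rows are words, columns are parts of the corpus.
toy <- matrix(
  c(10, 2, 0,
     3, 7, 5,
     1, 1, 9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("peuple", "roi", "loi"), c("P1", "P2", "P3"))
)

f     <- toy["peuple", "P1"]      # frequency of the word in the part
t_sub <- colSums(toy)["P1"]       # size of the part (sub-corpus)
F_tot <- rowSums(toy)["peuple"]   # total frequency of the word
T_tot <- sum(toy)                 # size of the whole corpus

print(c(f = unname(f), t = unname(t_sub), F = unname(F_tot), T = unname(T_tot)))
```

Because the part and corpus sizes are computed from column sums and the grand total, the table has to be complete, that is, it must include every word of the corpus.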
Returns a matrix of nrow(lexicaltable) * ncol(lexicaltable) (the number of rows and columns may be reduced using types or parts), each cell giving the specificity score.
Matthieu Decorde, Serge Heiden, Sylvain Loiseau, Lise Vaudor
Lafon P. (1980) Sur la variabilité de la fréquence des formes dans un corpus, Mots, 1, pp. 127–165. https://www.persee.fr/doc/mots_0243-6450_1980_num_1_1_1008
specificities.probabilities, specificities.lexicon
data(robespierre)
spe <- specificities(robespierre)
string <- paste("The word %s appears f=%d times in a sub-corpus of t=%d words,",
                " given a total frequency of F=%d in the robespierre corpus made",
                " of T=%d words. The corresponding specificity score is %f",
                sep="")
print(sprintf(string,
              'peuple',
              robespierre['peuple','D4'],
              colSums(robespierre)['D4'],
              rowSums(robespierre)['peuple'],
              sum(robespierre),
              spe['peuple', 'D4']))