Given a data frame of documents assigned to clusters, returns the top words of each cluster.
df |
a data frame containing at least a column of textual data, a column of cluster IDs, and a column of document IDs |
cluster_field |
name of the column (in quotation marks) containing the clusters' IDs (default NULL) |
docid_field |
name of the column (in quotation marks) containing the documents' IDs (default NULL) |
text_field |
name of the column (in quotation marks) containing textual data |
clean |
clean the text from stopwords, punctuation, symbols etc. (default FALSE) |
lang |
if clean = TRUE, the language of the stopwords must be specified. Supported languages: danish, dutch, english, finnish, french, german, hungarian, italian, norwegian, portuguese, russian, spanish, and swedish |
n |
number of top words to return |
The most specific words of each cluster are computed using the chi-squared statistic, as implemented in textstat_keyness.
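To make the idea of chi-squared keyness concrete, here is a minimal base R sketch. It is not the package's implementation: the toy data, the naive whitespace tokenization, and the helper names `keyness_chi2` and `top_words` are all assumptions made for illustration, and the exact values returned by textstat_keyness may differ (for example, because of a continuity correction). For each word, the sketch builds a 2x2 table (word vs. other words, target cluster vs. all other clusters), computes the chi-squared statistic, and gives it a negative sign when the word is under-represented in the target cluster.

```r
# Toy data (hypothetical): four documents already assigned to two clusters.
df <- data.frame(
  cluster = c("a", "a", "b", "b"),
  texts = c("cats chase mice", "cats sleep all day",
            "stocks fell today", "markets and stocks rose"),
  stringsAsFactors = FALSE
)

# Naive tokenization: lower-case, then split on whitespace.
tokens <- strsplit(tolower(df$texts), "\\s+")
words <- unlist(tokens)
word_cluster <- rep(df$cluster, lengths(tokens))

# Signed chi-squared keyness of one word for one cluster vs. all others.
# (Hypothetical helper; no continuity correction is applied.)
keyness_chi2 <- function(word, target) {
  is_word <- words == word
  in_target <- word_cluster == target
  tab <- table(factor(is_word, levels = c(TRUE, FALSE)),
               factor(in_target, levels = c(TRUE, FALSE)))
  stat <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  # Negative sign when the word occurs less often than expected in the target.
  expected <- sum(is_word) * sum(in_target) / length(words)
  observed <- sum(is_word & in_target)
  unname(if (observed >= expected) stat else -stat)
}

# Rank the whole vocabulary for one cluster and keep the n highest scores.
top_words <- function(target, n = 3) {
  vocab <- unique(words)
  scores <- vapply(vocab, keyness_chi2, numeric(1), target = target)
  head(sort(scores, decreasing = TRUE), n)
}

top_a <- top_words("a")
top_b <- top_words("b")
```

In this toy example, "cats" appears only in cluster "a" and "stocks" only in cluster "b", so each ranks first for its own cluster and receives a negative score for the other.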
a data frame with the most frequent and specific words of each cluster
## Not run:
top_terms <- clusterm(df, cluster_field = "cluster", text_field = "texts")
## End(Not run)