clusterms: Get terms by cluster
In nicolarighetti/textools: an R toolbox for text mining tasks

Given a data frame of documents assigned to clusters, returns the top words for each cluster.

clusterms(
  df,
  cluster_field = NULL,
  docid_field = NULL,
  text_field = NULL,
  clean = FALSE,
  lang = NULL,
  n = 10
)

`df`	a data frame with at least one column each for the textual data, the cluster IDs, and the document IDs
`cluster_field`	name of the column (in quotation marks) containing the cluster IDs (default NULL)
`docid_field`	name of the column (in quotation marks) containing the document IDs (default NULL)
`text_field`	name of the column (in quotation marks) containing the textual data (default NULL)
`clean`	whether to remove stopwords, punctuation, symbols, etc. from the text (default FALSE)
`lang`	language of the stopword list; should be specified when clean = TRUE (default NULL). Supported languages: danish, dutch, english, finnish, french, german, hungarian, italian, norwegian, portuguese, russian, spanish, and swedish
`n`	number of top words to return (default 10)

The most specific words of each cluster are computed with the chi-squared statistic, as implemented in textstat_keyness (quanteda).
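To illustrate what a chi-squared keyness score measures, here is a minimal base R sketch. It is not the package's code and does not call textstat_keyness, which may apply a continuity correction and therefore give slightly different values. The toy data, the whitespace tokenisation, and the helper name `keyness_chi2` are assumptions made for illustration only.

```r
# Toy input: one row per document, with a cluster ID and raw text.
df <- data.frame(
  cluster = c("a", "a", "b", "b"),
  texts = c("cats chase mice", "cats sleep all day",
            "dogs chase cats", "dogs bark all night"),
  stringsAsFactors = FALSE
)

# Tokenise on whitespace and count each word per cluster.
tokens <- strsplit(tolower(df$texts), "\\s+")
long <- data.frame(
  cluster = rep(df$cluster, lengths(tokens)),
  word = unlist(tokens),
  stringsAsFactors = FALSE
)
counts <- table(long$word, long$cluster)

# Signed chi-squared score of every word for one target cluster,
# computed from the 2x2 table (word vs. other words, target vs. rest).
# Hypothetical helper, uncorrected statistic.
keyness_chi2 <- function(counts, target) {
  w_tgt <- counts[, target]                 # this word, target cluster
  w_ref <- rowSums(counts) - w_tgt          # this word, other clusters
  o_tgt <- sum(counts[, target]) - w_tgt    # other words, target cluster
  o_ref <- sum(counts) - w_tgt - w_ref - o_tgt  # other words, other clusters
  n <- sum(counts)
  chi2 <- n * (w_tgt * o_ref - w_ref * o_tgt)^2 /
    ((w_tgt + w_ref) * (o_tgt + o_ref) * (w_tgt + o_tgt) * (w_ref + o_ref))
  # A negative sign marks words under-represented in the target cluster.
  over <- w_tgt * o_ref > w_ref * o_tgt
  sort(ifelse(over, chi2, -chi2), decreasing = TRUE)
}

# Most specific words for cluster "a" in the toy data.
head(keyness_chi2(counts, "a"), 3)
```

Words with a large positive score occur in the target cluster more often than their overall frequency would predict, which is the sense of "specific" used above.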

A data frame with the most frequent and most specific words of each cluster.

## Not run: 
top_terms <- clusterms(df, cluster_field = "cluster", text_field = "texts")
## End(Not run)
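A slightly fuller sketch of a call, using the arguments from the usage signature above. The toy data frame and its column names (`doc_id`, `cluster`, `texts`) are made up for illustration, and the call is guarded so that the snippet still runs when textools is not installed. The exact shape of the returned data frame is not shown here because it depends on the package.

```r
# Hypothetical input: one row per document.
df <- data.frame(
  doc_id = c("d1", "d2", "d3", "d4"),
  cluster = c(1, 1, 2, 2),
  texts = c("The cat chased the mouse.",
            "Cats sleep most of the day.",
            "The dog chased the cat.",
            "Dogs bark at night."),
  stringsAsFactors = FALSE
)

# Only call clusterms() if the package is available.
if (requireNamespace("textools", quietly = TRUE)) {
  top_terms <- textools::clusterms(
    df,
    cluster_field = "cluster",
    docid_field = "doc_id",
    text_field = "texts",
    clean = TRUE,        # strip stopwords, punctuation, symbols
    lang = "english",    # stopword language, needed because clean = TRUE
    n = 5                # top 5 words per cluster
  )
  print(top_terms)
}
```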

nicolarighetti/textools documentation built on Oct. 16, 2021, 11:20 p.m.
