View source: R/data_transformation.R
term_filter filters out stop words and low-frequency words.
term_idf computes the idf (inverse document frequency) of terms.
term_tfidf computes the tf-idf of documents.
term_tfidf(term_df, idf = NULL)
term_idf(term_df, n_total = NULL)
term_filter(term_df, low_freq = 0.01, stop_words = NULL)
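term_df | A data.frame with a document id column and a term column.
idf | A data.frame with idf values, such as the output of term_idf.
n_total | Total number of documents.
low_freq | Low-frequency threshold, given either as a rate (such as the default 0.01) or as a number of terms (such as 1).
stop_words | Stop words to remove.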
A data.frame
term_df = data.frame(id = c(1,1,1,2,2,3,3,3,4,4,4,4,4,5,5,6,7,7,
                            8,8,8,9,9,9,10,10,11,11,11,11,11,11),
                     terms = c('a','b','c','a','c','d','d','a','b','c','a','c','d','a','c',
                               'd','a','e','f','b','c','f','b','c','h','h','i','c','d','g','k','k'))
term_df = term_filter(term_df = term_df, low_freq = 1)
idf = term_idf(term_df)
tf_idf = term_tfidf(term_df, idf = idf)
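To make the quantities concrete, the following base-R sketch computes a textbook idf, log(N / df), and the corresponding tf-idf on a small hypothetical data.frame. It is an illustration under stated assumptions, not the package's implementation: the exact weighting used by term_idf and term_tfidf (for example any smoothing or normalization) is not documented here, and the toy data and intermediate names (pairs, doc_freq, tf) are made up for this example.

```r
# Illustrative sketch only: textbook idf = log(N / df) and tf-idf = tf * idf.
# The package functions may use a different (e.g. smoothed) formula.
term_df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 3),
  terms = c("a", "b", "a", "a", "c", "c"),
  stringsAsFactors = FALSE
)

# Total number of documents (the role played by n_total).
n_total <- length(unique(term_df$id))

# Document frequency: number of distinct documents containing each term.
pairs    <- unique(term_df[, c("id", "terms")])
doc_freq <- table(pairs$terms)

# Assumed idf definition: natural log of (documents / document frequency).
idf <- data.frame(
  terms = names(doc_freq),
  idf   = log(n_total / as.numeric(doc_freq)),
  stringsAsFactors = FALSE
)

# Term frequency: raw count of each term within each document.
tf <- aggregate(n ~ id + terms, data = cbind(term_df, n = 1), FUN = sum)

# tf-idf: attach each term's idf to its counts and multiply.
tf_idf <- merge(tf, idf, by = "terms")
tf_idf$tf_idf <- tf_idf$n * tf_idf$idf
tf_idf <- tf_idf[order(tf_idf$id, tf_idf$terms), ]
print(tf_idf)
```

In this sketch a term that appears in every document would get an idf of zero, which is one reason implementations often apply smoothing. Whether the package does so is not stated on this page.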