View source: R/clustering_similarity.R
calc_doc_sim | R Documentation
Calculates pairwise similarity between documents by weighting terms with TF-IDF and comparing the resulting document vectors with cosine similarity.
calc_doc_sim(
  text_data,
  text_column = "abstract",
  min_term_freq = 2,
  max_doc_freq = 0.9
)
text_data | A data frame containing text data.
text_column | Name of the column containing text to analyze.
min_term_freq | Minimum frequency for a term to be included.
max_doc_freq | Maximum document frequency (as a proportion) for a term to be included.
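The two thresholds can be illustrated with a small base R sketch. It is not the package's implementation: the toy data frame, the naive tokenizer, and the reading of min_term_freq as a total count across the corpus are all assumptions made for illustration.

```r
# Hypothetical toy corpus; the column name matches the default text_column.
docs <- data.frame(
  abstract = c(
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog"
  ),
  stringsAsFactors = FALSE
)

# Naive tokenizer (an assumption): lowercase, then split on non-letters.
tokens <- strsplit(tolower(docs$abstract), "[^a-z]+")
tokens <- lapply(tokens, function(x) x[nzchar(x)])

vocab <- sort(unique(unlist(tokens)))

# Document-term count matrix: one row per document, one column per term.
dtm <- t(vapply(
  tokens,
  function(x) as.numeric(table(factor(x, levels = vocab))),
  numeric(length(vocab))
))
colnames(dtm) <- vocab

min_term_freq <- 2
max_doc_freq <- 0.9

# Assumption: term frequency is counted over the whole corpus.
term_freq <- colSums(dtm)
# Share of documents that contain each term at least once.
doc_prop <- colMeans(dtm > 0)

keep <- term_freq >= min_term_freq & doc_prop <= max_doc_freq
dtm_filtered <- dtm[, keep, drop = FALSE]
```

In this toy corpus, "the" occurs in every document, so it exceeds max_doc_freq and is dropped, while terms that occur only once fall below min_term_freq.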
A matrix of pairwise cosine similarities between the documents.
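As a rough sketch of how such a matrix can be computed, the base R code below applies one common TF-IDF variant to a hypothetical document-term count matrix and then takes cosine similarities between rows. The weighting formula and the input matrix are assumptions; the package may weight or normalize differently.

```r
# Hypothetical document-term counts (rows = documents, columns = terms).
dtm <- matrix(
  c(1, 0, 1, 1,
    0, 1, 1, 1,
    1, 1, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("doc", 1:3), c("cat", "dog", "on", "sat"))
)

# TF-IDF (one common variant): within-document term share
# multiplied by the log inverse document frequency.
tf <- dtm / rowSums(dtm)
idf <- log(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# Cosine similarity: scale each row to unit length, then take dot products.
row_norms <- sqrt(rowSums(tfidf^2))
unit <- tfidf / row_norms
sim <- unit %*% t(unit)
```

The result is square and symmetric, with ones on the diagonal; an entry such as sim["doc1", "doc2"] gives the similarity between that pair of documents.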