reinSummary | R Documentation |
This function summarizes the results of the Reinert clustering algorithm, including the most frequent documents and significant terms for each cluster.
The input is the result returned by the term_per_cluster function.
reinSummary(tc, n = 10)
tc |
A list returned by the term_per_cluster function. |
n |
Integer. The number of top terms (based on Chi-squared value) to include in the summary for each cluster and sign. Default is 10. |
This function performs the following steps:
Extracts the most frequent document for each cluster.
Summarizes the number of documents per cluster.
Selects the top n terms for each cluster, separated by positive and negative signs.
Combines the terms and segment information into a final summary table.
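The steps above can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual implementation: the toy data frames terms and segments, their column names, the "; " separator, and the choice to count distinct documents are all hypothetical, and the real structure of tc may differ.

```r
# Hypothetical toy inputs standing in for the contents of `tc`.
terms <- data.frame(
  cluster    = c(1, 1, 1, 2, 2, 2),
  term       = c("whale", "sea", "ship", "captain", "boat", "sea"),
  chi_square = c(12.5, 8.1, 3.2, 15.0, 6.4, 9.9),
  sign       = c("positive", "positive", "negative",
                 "positive", "positive", "negative"),
  stringsAsFactors = FALSE
)
segments <- data.frame(
  cluster = c(1, 1, 1, 2, 2),
  doc_id  = c("doc1", "doc1", "doc2", "doc3", "doc3"),
  stringsAsFactors = FALSE
)
n <- 2

# Step 1: most frequent document ID within each cluster.
most_freq <- tapply(segments$doc_id, segments$cluster,
                    function(d) names(which.max(table(d))))

# Step 2: number of distinct documents per cluster (an assumption).
n_docs <- tapply(segments$doc_id, segments$cluster,
                 function(d) length(unique(d)))

# Step 3: top n terms per cluster for one sign, ranked by Chi-squared value.
top_terms <- function(df, sgn, n) {
  df <- df[df$sign == sgn, ]
  df <- df[order(df$cluster, -df$chi_square), ]
  vapply(split(df$term, df$cluster),
         function(x) paste(head(x, n), collapse = "; "),
         character(1))
}
pos <- top_terms(terms, "positive", n)
neg <- top_terms(terms, "negative", n)

# Step 4: combine everything into one summary table, one row per cluster.
clusters <- sort(unique(segments$cluster))
key <- as.character(clusters)
summary_tbl <- data.frame(
  cluster                = clusters,
  positive_terms         = unname(pos[key]),
  negative_terms         = unname(neg[key]),
  most_frequent_document = as.vector(most_freq[key]),
  n_documents            = as.integer(n_docs[key]),
  stringsAsFactors = FALSE
)
print(summary_tbl)
```

Looking up each piece by cluster ID (key) rather than by position keeps the rows aligned even if a cluster has no terms for one sign, in which case that cell is NA.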
A data frame summarizing the clustering results. The table includes:
cluster: The cluster ID.
Positive terms: The top n positive terms for each cluster, concatenated into a single string.
Negative terms: The top n negative terms for each cluster, concatenated into a single string.
Most frequent document: The document ID that appears most frequently in each cluster.
N. of Documents per Cluster: The number of documents in each cluster.
term_per_cluster, reinPlot
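# Load the example data set
data(mobydick)

# Run the Reinert clustering algorithm
res <- reinert(
  x = mobydick,
  k = 10,
  term = "token",
  segment_size = 40,
  min_segment_size = 5,
  min_split_members = 10,
  cc_test = 0.3,
  tsj = 3
)

# Extract the terms associated with each cluster
tc <- term_per_cluster(res, cutree = NULL, k = 1:10, negative = FALSE)

# Summarize the clusters, keeping the top 10 terms per cluster and sign
S <- reinSummary(tc, n = 10)

# Show the first rows of the summary table
head(S, 10)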