View source: R/calc_assoc_metrics.R
calc_assoc_metrics | R Documentation |
Calculates association metrics (pointwise mutual information (PMI), Dice's coefficient, and G-score) for the bigrams in a corpus.
calc_assoc_metrics(
data,
doc_index,
token_index,
type,
association = "all",
verbose = FALSE
)
data | A data frame containing the corpus. |
doc_index | Column in 'data' which represents the document index. |
token_index | Column in 'data' which represents the token index. |
type | Column in 'data' which represents the tokens or terms. |
association | A character vector specifying which metrics to calculate. Can be any combination of 'pmi', 'dice_coeff', 'g_score', or 'all'. Default is 'all'. |
verbose | A logical value indicating whether to keep the intermediate probability columns. Default is FALSE. |
A data frame with one row per bigram and columns for each calculated metric.
data_path <- system.file("extdata", "bigrams_data.rds", package = "qtkit")
data <- readRDS(data_path)
calc_assoc_metrics(data, doc_index, token_index, type)
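To make the metrics concrete, the following is a minimal base R sketch of how PMI and Dice's coefficient can be computed for bigrams by hand. It is not the package's implementation: the toy data frame, the helper names (`toy`, `bigrams`, `pair_counts`), and the textbook formulas used here are assumptions for illustration, and the exact formulas inside `calc_assoc_metrics()` may differ. The G-score is left out because its definition is not given on this page.

```r
# Hypothetical toy corpus in long format: one row per token, with a
# document index and a token position (not the data shipped with qtkit).
toy <- data.frame(
  doc_index   = c(1, 1, 1, 1, 2, 2, 2, 2),
  token_index = c(1, 2, 3, 4, 1, 2, 3, 4),
  type        = c("new", "york", "is", "big", "new", "york", "is", "old"),
  stringsAsFactors = FALSE
)

# Order tokens within documents, then pair each token with its successor,
# keeping only pairs that do not cross a document boundary.
toy <- toy[order(toy$doc_index, toy$token_index), ]
n <- nrow(toy)
same_doc <- toy$doc_index[-n] == toy$doc_index[-1]
bigrams <- data.frame(
  x = toy$type[-n][same_doc],
  y = toy$type[-1][same_doc],
  stringsAsFactors = FALSE
)

# Count each distinct bigram.
pair_counts <- aggregate(count ~ x + y,
                         data = cbind(bigrams, count = 1),
                         FUN = sum)

# Intermediate probabilities, estimated over all bigram positions.
total <- nrow(bigrams)
p_first  <- table(bigrams$x) / total  # P(x): x as first element
p_second <- table(bigrams$y) / total  # P(y): y as second element
pair_counts$p_xy <- pair_counts$count / total
pair_counts$p_x  <- as.numeric(p_first[as.character(pair_counts$x)])
pair_counts$p_y  <- as.numeric(p_second[as.character(pair_counts$y)])

# Textbook definitions (assumed here, natural log):
#   PMI  = log( P(x, y) / (P(x) * P(y)) )
#   Dice = 2 * P(x, y) / (P(x) + P(y))
pair_counts$pmi <- log(pair_counts$p_xy / (pair_counts$p_x * pair_counts$p_y))
pair_counts$dice_coeff <- 2 * pair_counts$p_xy /
  (pair_counts$p_x + pair_counts$p_y)

pair_counts
```

In this toy corpus, "new york" appears in both documents and neither word occurs in any other bigram position, so its Dice coefficient is 1. The `p_xy`, `p_x`, and `p_y` columns play the role of the intermediate probability columns that `verbose = TRUE` is documented to keep.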