| generate_all_tokens | R Documentation |
Generates a data frame with the total count of each token across both datasets, along with the m and u probabilities.
generate_all_tokens( x_counts, y_counts, total_comparisons, token_count_join = TOKEN_TOKEN_TYPE_VEC, suffix = TOKEN_SUFFIX_DEFAULT, m_prob_func = calc_m_prob, ... )
x_counts |
Counts of tokens from first dataset |
y_counts |
Counts of tokens from second dataset |
total_comparisons |
Number of possible record comparisons. Normally this is nrow(x_dat) * nrow(y_dat) |
token_count_join |
String vector of column names used to join the two token count data frames. Default c('token','token_type') |
suffix |
String vector of length 2. Helps identify which column the counts came from. Default c('x','y') |
m_prob_func |
Function that takes a dataframe with columns token, token_type, n.x, n.y, n_comparisons, u_prob, and returns a vector of m_probs |
... |
Not used |
dat_ceo <- readr::read_csv('https://tinyurl.com/2p8etjr6')
dat_alb <- readr::read_csv('https://tinyurl.com/2p8ap4ad')
t_dat <- token_links(
dat_x = dat_ceo,
dat_y = dat_alb,
args_x = list(col_nms = 'coname'),
args_y = list(col_nms = 'companyName'),
token_types = 'company_name',
token_index = '',
suffix = c('ceo', 'alb')
)
results <- generate_all_tokens(t_dat$x$token_counts, t_dat$y$token_counts, t_dat$total_comparisons)
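The m_prob_func argument accepts any function that takes a data frame with the documented columns and returns one m probability per row. The sketch below is a minimal illustration of that contract, not the package's own calc_m_prob: the function name const_m_prob, the constant value 0.95, and the toy data are all assumptions made up for this example.

```r
# Hypothetical custom m_prob function (not part of the package).
# It receives a data frame with the documented columns and must
# return a vector with one m probability per row.
const_m_prob <- function(dat, m = 0.95) {
  # Assumption for illustration: every token gets the same m probability.
  rep(m, nrow(dat))
}

# Toy input with the documented columns; all values are invented.
toy_tokens <- data.frame(
  token = c("acme", "inc", "global"),
  token_type = "company_name",
  n.x = c(2, 150, 10),
  n.y = c(1, 200, 5),
  n_comparisons = 1e6,
  u_prob = c(2e-6, 0.03, 5e-5)
)

# One m probability per token row.
m_probs <- const_m_prob(toy_tokens)
```

Assuming the function is called with the data frame as its only argument, it could then be supplied as generate_all_tokens(t_dat$x$token_counts, t_dat$y$token_counts, t_dat$total_comparisons, m_prob_func = const_m_prob).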