generate_all_tokens | R Documentation |
Generates a data frame with the total count of each token across both datasets, along with the m and u probabilities.
generate_all_tokens(
  x_counts,
  y_counts,
  total_comparisons,
  token_count_join = TOKEN_TOKEN_TYPE_VEC,
  suffix = TOKEN_SUFFIX_DEFAULT,
  m_prob_func = calc_m_prob,
  ...
)
x_counts |
Counts of tokens from first dataset |
y_counts |
Counts of tokens from second dataset |
total_comparisons |
Count of the number of comparisons that can happen; normally nrow(x_dat) * nrow(y_dat) |
token_count_join |
String vector of column names used to join the two token count data frames. Default c('token','token_type') |
suffix |
String vector of length 2. Helps identify which column the counts came from. Default c('x','y') |
m_prob_func |
Function that takes a data frame with columns token, token_type, n.x, n.y, n_comparisons and u_prob, and returns a vector of m_probs |
... |
Not used |
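The m_prob_func argument can be hard to picture from its description alone, so here is a minimal sketch of a replacement function. It assumes only what the argument description states: the function receives a data frame with columns token, token_type, n.x, n.y, n_comparisons and u_prob, and returns one m probability per row. The name const_m_prob, its m parameter, and the toy values are hypothetical and are not part of the package; how generate_all_tokens actually calls the function is not documented here, so the extra `...` argument is a precaution.

```r
# Hypothetical replacement for the default m_prob_func (not part of the package).
# Assumes `dat` has the documented columns; only u_prob is used here.
const_m_prob <- function(dat, m = 0.95, ...) {
  # Use a constant m probability, but never let it fall below u_prob.
  pmax(rep(m, nrow(dat)), dat$u_prob)
}

# Toy input using the documented column names (all values are made up).
toy <- data.frame(
  token = c("inc", "acme"),
  token_type = "company_name",
  n.x = c(50, 1),
  n.y = c(40, 2),
  n_comparisons = 10000,
  u_prob = c(0.2, 0.0002)
)

# One m probability per token row.
m_probs <- const_m_prob(toy)
```

Under these assumptions, such a function could presumably be supplied as m_prob_func = const_m_prob in the call to generate_all_tokens.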
dat_ceo <- readr::read_csv('https://tinyurl.com/2p8etjr6')
dat_alb <- readr::read_csv('https://tinyurl.com/2p8ap4ad')
t_dat <- token_links(
  dat_x = dat_ceo,
  dat_y = dat_alb,
  args_x = list(col_nms = 'coname'),
  args_y = list(col_nms = 'companyName'),
  token_types = 'company_name',
  token_index = '',
  suffix = c('ceo', 'alb')
)
results <- generate_all_tokens(t_dat$x$token_counts, t_dat$y$token_counts, t_dat$total_comparisons)