calc_m_prob: Calculates what m_prob should be, takes in a dataframe and...

calc_m_prob R Documentation

Description

Calculates the m_prob value for each token. Takes a dataframe, which must have a column named n_comparisons, and returns a vector of m_probs with one element per row of the dataframe. This is the default m_prob calculator for tokens. It can be overridden by passing a different function as the m_prob_func parameter of tokenize_ations_m_u_prob or token_links. Any replacement function must take a dataframe and return a vector. The dataframe passed in will have the columns token, token_type, n.x, n.y, n_comparisons, and u_prob.

Usage

calc_m_prob(
  dat_token_info,
  min_m_prob = M_PROB_MIN,
  max_m_prob = M_PROB_MAX,
  log_base = m_PROB_LOG_BASE,
  ...
)

Arguments

dat_token_info

a dataframe with information about the tokens

min_m_prob

minimum value of m_prob returned

max_m_prob

maximum value of m_prob returned

log_base

Number. Base of the logarithm. Default 10.

...

is ignored

Examples


dplyr::tibble(n_comparisons = sample.int(100, 10)) |> calc_m_prob()
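The description says the default calculator can be replaced by any function that takes a dataframe and returns a vector. The sketch below shows what such a replacement might look like. The name my_m_prob, the default bounds 0.9 and 0.99, and the formula itself are assumptions made for illustration; they are not the formula used by calc_m_prob.

```r
# Hypothetical replacement for the default m_prob calculator.
# Contract (from the description): dataframe in, vector out, one value per row.
# The formula and the default bounds are illustrative assumptions only.
my_m_prob <- function(dat_token_info, min_m_prob = 0.9, max_m_prob = 0.99, ...) {
  n <- dat_token_info$n_comparisons

  # Score in (0, 1]: tokens involved in fewer comparisons score higher.
  raw <- 1 / (1 + log10(n))

  # Rescale the score into [min_m_prob, max_m_prob], then clamp for safety.
  m <- min_m_prob + (max_m_prob - min_m_prob) * raw
  pmin(pmax(m, min_m_prob), max_m_prob)
}

# Small made-up input with the one column the function needs.
dat <- data.frame(
  token = c("smith", "inc", "zyx"),
  n_comparisons = c(10, 1000, 1)
)

m <- my_m_prob(dat)
print(m)
```

Per the description, a function like this would be supplied through the m_prob_func parameter of tokenize_ations_m_u_prob or token_links. That call is not shown here because the full signatures of those functions are not part of this page.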



csps-efpc/TokenLink documentation built on Feb. 10, 2023, 3:30 a.m.