keep_tokens: Given a dataframe of all tokens, object will return a...

keep_tokens R Documentation

Given a dataframe of all tokens, return a dataframe containing a subset of those tokens

Description

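Given a dataframe of all tokens, returns a dataframe containing the subset of tokens to keep. Tokens can be filtered by u_prob, by a cap on the total number of comparisons, and by whether they can take part in any comparison at all.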

Usage

keep_tokens(
  tokens_all,
  min_token_u_prob = TOKEN_MIN_UPROB_DEFAULT,
  max_total_comparisons = TOKEN_MAX_COMPARE_DEFAULT,
  remove_n_comparisons_zero = TOKEN_REMOVE_ZERO_COMARE,
  ...
)

Arguments

tokens_all

A dataframe of tokens, normally taken from t_dat$tokens_all

min_token_u_prob

Minimum u_prob to keep. Can be NULL to skip this filter. Reasonable values are between 0 and 1, with higher numbers using more tokens and lower numbers using fewer. Default: TOKEN_MIN_UPROB_DEFAULT

max_total_comparisons

Maximum total number of comparisons to allow. Tokens with the smallest n_comparisons are picked first. NULL is also allowed, to skip this filter. Default: 25000000

remove_n_comparisons_zero

Remove tokens that cannot be included in any comparison. Default: TRUE

...

Ignored
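
The three filters described above can be sketched in base R. This is not the TokenLink implementation: keep_tokens_sketch and the example data are hypothetical, and the sketch assumes that (a) a token "cannot be included in comparisons" when its n_comparisons is 0, and (b) the u_prob threshold acts as an upper bound, which is the reading consistent with "higher numbers using more tokens". The order in which the real function applies the filters is also an assumption.

```r
# Hypothetical stand-in for keep_tokens(); NOT the TokenLink implementation.
keep_tokens_sketch <- function(tokens_all,
                               min_token_u_prob = NULL,
                               max_total_comparisons = NULL,
                               remove_n_comparisons_zero = TRUE) {
  out <- tokens_all

  # Assumption: tokens with n_comparisons == 0 cannot take part in comparisons.
  if (isTRUE(remove_n_comparisons_zero)) {
    out <- out[out$n_comparisons > 0, , drop = FALSE]
  }

  # Assumption: the threshold is an upper bound on u_prob, so a higher value
  # keeps more tokens. NULL skips the filter.
  if (!is.null(min_token_u_prob)) {
    out <- out[out$u_prob <= min_token_u_prob, , drop = FALSE]
  }

  # Pick tokens with the smallest n_comparisons first and keep them while the
  # running total stays within the budget. NULL skips the filter.
  if (!is.null(max_total_comparisons)) {
    out <- out[order(out$n_comparisons), , drop = FALSE]
    out <- out[cumsum(out$n_comparisons) <= max_total_comparisons, , drop = FALSE]
  }

  out
}

# Made-up data; only the u_prob and n_comparisons column names come from the
# documentation, the token column is illustrative.
tokens_all <- data.frame(
  token         = c("inc", "smith", "acme", "zzz"),
  u_prob        = c(0.40, 0.05, 0.01, 0.001),
  n_comparisons = c(9000, 400, 30, 0),
  stringsAsFactors = FALSE
)

kept <- keep_tokens_sketch(tokens_all,
                           min_token_u_prob = 0.1,
                           max_total_comparisons = 500)
print(kept)
```

With this data, "zzz" is dropped for having no comparisons, "inc" is dropped by the u_prob threshold, and "acme" and "smith" together stay within the budget of 500 comparisons, so both are kept.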


csps-efpc/TokenLink documentation built on Feb. 10, 2023, 3:30 a.m.