ft_control: Default Control Settings

Description

An auxiliary function for defining the control variables.

Usage

ft_control(
  loss = c("softmax", "hs", "ns"),
  learning_rate = 0.05,
  learn_update = 100L,
  word_vec_size = 100L,
  window_size = 5L,
  epoch = 5L,
  min_count = 5L,
  min_count_label = 0L,
  neg = 5L,
  max_len_ngram = 1L,
  nbuckets = 2000000L,
  min_ngram = 3L,
  max_ngram = 6L,
  nthreads = 1L,
  threshold = 1e-04,
  label = "__label__",
  verbose = 0,
  pretrained_vectors = "",
  output = "",
  save_output = FALSE,
  seed = 0L,
  qnorm = FALSE,
  retrain = FALSE,
  qout = FALSE,
  cutoff = 0L,
  dsub = 2L,
  autotune_validation_file = "",
  autotune_metric = "f1",
  autotune_predictions = 1L,
  autotune_duration = 300L,
  autotune_model_size = ""
)

Arguments

loss

a character string giving the name of the loss function; allowed values are 'softmax', 'hs' (hierarchical softmax) and 'ns' (negative sampling).

learning_rate

a numeric giving the learning rate (default is 0.05).

learn_update

an integer giving the number of tokens after which the learning rate is updated. The default value is 100L, which means the learning rate is updated every 100 tokens.

word_vec_size

an integer giving the length (size) of the word vectors.

window_size

an integer giving the size of the context window.

epoch

an integer giving the number of epochs.

min_count

an integer giving the minimal number of word occurrences.

min_count_label

an integer giving the minimal number of label occurrences.

neg

an integer giving how many negatives are sampled (only used if loss is "ns").

max_len_ngram

an integer giving the maximum length of word ngrams used.

nbuckets

an integer giving the number of buckets.

min_ngram

an integer giving the minimal character ngram length.

max_ngram

an integer giving the maximal character ngram length.

nthreads

an integer giving the number of threads.

threshold

a numeric giving the sampling threshold.

label

a character string specifying the label prefix (default is '__label__').

verbose

an integer giving the verbosity level. The default value is 0 and shouldn't be changed, since Rcpp::Rcout can't handle the traffic.

pretrained_vectors

a character string giving the file path to the pretrained word vectors which are used for the supervised learning.

output

a character string giving the output file path.

save_output

a logical (default is FALSE)

seed

an integer (default is 0L)

qnorm

a logical (default is FALSE)

retrain

a logical (default is FALSE)

qout

a logical (default is FALSE)

cutoff

an integer (default is 0L)

dsub

an integer (default is 2L)

autotune_validation_file

a character string (default is "")

autotune_metric

a character string (default is "f1")

autotune_predictions

an integer (default is 1L)

autotune_duration

an integer (default is 300L)

autotune_model_size

a character string (default is "")
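
The Usage block lists loss = c("softmax", "hs", "ns") as the default, and most numeric defaults carry an L suffix. The following is a minimal base-R sketch of what those two conventions usually mean. demo_control is a hypothetical stand-in written for illustration, not the fastTextR implementation, and it assumes the package resolves loss with match.arg, as is customary in R.

```r
# Hypothetical stand-in for illustration only; this is NOT ft_control itself.
demo_control <- function(loss = c("softmax", "hs", "ns"), epoch = 5L) {
  # match.arg() picks the first choice when the argument is not supplied
  # and raises an error for values outside the allowed set.
  loss <- match.arg(loss)
  list(loss = loss, epoch = epoch)
}

default_ctrl <- demo_control()
custom_ctrl <- demo_control(loss = "ns", epoch = 10L)

# An invalid loss name is rejected by match.arg().
bad_loss <- tryCatch(demo_control(loss = "hinge"), error = function(e) "error")

# The L suffix creates an integer; a bare number is a double.
c(is.integer(5L), is.integer(5))
```

Under that assumption, omitting loss selects the first listed value, and passing integer arguments with the L suffix keeps them in the type the argument descriptions ask for.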

Value

a list with the control variables.

Examples

# Override the learning rate; all other settings keep their defaults.
ft_control(learning_rate = 0.1)
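
A slightly fuller sketch of the call above, overriding several settings at once. It assumes the fastTextR package is installed, so the call is guarded with requireNamespace. It relies only on the documented fact that ft_control returns a list, and inspects the result with str rather than assuming particular element names.

```r
# Runs only when fastTextR is available; otherwise ctrl stays NULL.
ctrl <- NULL
if (requireNamespace("fastTextR", quietly = TRUE)) {
  # Negative-sampling loss with 10 negatives, 20 epochs, 2 threads;
  # every argument not named here keeps its documented default.
  ctrl <- fastTextR::ft_control(
    loss = "ns",
    neg = 10L,
    epoch = 20L,
    nthreads = 2L
  )
  # Print the structure of the returned list of control variables.
  str(ctrl)
}
```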

fastTextR documentation built on Oct. 31, 2022, 9:06 a.m.