ft_control: Default Control Settings
In fastTextR: An Interface to the 'fastText' Library

Description

An auxiliary function for defining the control variables.

Usage

ft_control(
  loss = c("softmax", "hs", "ns"),
  learning_rate = 0.05,
  learn_update = 100L,
  word_vec_size = 100L,
  window_size = 5L,
  epoch = 5L,
  min_count = 5L,
  min_count_label = 0L,
  neg = 5L,
  max_len_ngram = 1L,
  nbuckets = 2000000L,
  min_ngram = 3L,
  max_ngram = 6L,
  nthreads = 1L,
  threshold = 1e-04,
  label = "__label__",
  verbose = 0,
  pretrained_vectors = "",
  output = "",
  save_output = FALSE,
  seed = 0L,
  qnorm = FALSE,
  retrain = FALSE,
  qout = FALSE,
  cutoff = 0L,
  dsub = 2L,
  autotune_validation_file = "",
  autotune_metric = "f1",
  autotune_predictions = 1L,
  autotune_duration = 300L,
  autotune_model_size = ""
)
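
As a minimal sketch of how the signature above might be used, the following call overrides a few defaults and leaves every other argument at the value shown in Usage. It assumes the fastTextR package is installed; the structure of the returned object is not described on this page, so the example only inspects it with `str()`.

```r
# Sketch: override selected defaults; all other arguments keep their
# default values from the Usage section.
library(fastTextR)

ctrl <- ft_control(
  loss = "hs",            # one of the three allowed loss names
  learning_rate = 0.1,
  word_vec_size = 50L,
  epoch = 10L,
  nthreads = 2L
)

# Inspect the resulting settings (the exact structure of the returned
# object is an assumption, so we only print it).
str(ctrl)
```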

Arguments

`loss`	a character string giving the name of the loss function; allowed values are `'softmax'`, `'hs'` and `'ns'`.
`learning_rate`	a numeric giving the learning rate, the default value is `0.05`.
`learn_update`	an integer giving after how many tokens the learning rate should be updated. The default value is `100L`, which means the learning rate is updated every 100 tokens.
`word_vec_size`	an integer giving the length (size) of the word vectors.
`window_size`	an integer giving the size of the context window.
`epoch`	an integer giving the number of epochs.
`min_count`	an integer giving the minimal number of word occurrences.
`min_count_label`	an integer giving the minimal number of label occurrences.
`neg`	an integer giving how many negatives are sampled (only used if loss is `"ns"`).
`max_len_ngram`	an integer giving the maximum length of ngrams used.
`nbuckets`	an integer giving the number of buckets.
`min_ngram`	an integer giving the minimal ngram length.
`max_ngram`	an integer giving the maximal ngram length.
`nthreads`	an integer giving the number of threads.
`threshold`	a numeric giving the sampling threshold.
`label`	a character string specifying the label prefix (default is `'__label__'`).
`verbose`	an integer giving the verbosity level. The default value is `0L` and should not be changed, since Rcpp::Rcout cannot handle the traffic.
`pretrained_vectors`	a character string giving the file path to the pretrained word vectors which are used for the supervised learning.
`output`	a character string giving the output file path.
`save_output`	a logical (default is `FALSE`)
`seed`	an integer giving the random seed (default is `0L`)
`qnorm`	a logical (default is `FALSE`)
`retrain`	a logical (default is `FALSE`)
`qout`	a logical (default is `FALSE`)
`cutoff`	an integer (default is `0L`)
`dsub`	an integer (default is `2L`)
`autotune_validation_file`	a character string
`autotune_metric`	a character string (default is `"f1"`)
`autotune_predictions`	an integer (default is `1L`)
`autotune_duration`	an integer (default is `300L`)
`autotune_model_size`	a character string
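
The vector default `loss = c("softmax", "hs", "ns")` follows a common R idiom in which the first element acts as the default and the supplied value is validated against the listed choices. The sketch below illustrates that idiom with base R's `match.arg()`. The function `demo_control` is a hypothetical stand-in and not part of fastTextR, and whether `ft_control` uses `match.arg()` internally is an assumption.

```r
# Hypothetical stand-in for illustration only; not part of fastTextR.
demo_control <- function(loss = c("softmax", "hs", "ns"),
                         learning_rate = 0.05) {
  # With no value supplied, match.arg() returns the first choice;
  # an unsupported value raises an error.
  loss <- match.arg(loss)
  stopifnot(is.numeric(learning_rate), length(learning_rate) == 1L)
  list(loss = loss, learning_rate = learning_rate)
}

default_loss <- demo_control()$loss          # first choice is used
ns_loss <- demo_control(loss = "ns")$loss    # explicit, valid choice

# An invalid loss name is rejected; here the error is caught and flagged.
invalid <- tryCatch(
  demo_control(loss = "mse"),
  error = function(e) "error"
)
```

Validating the choice up front means a misspelled loss name fails immediately rather than at training time.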