forensic_fasttext: Does 'forensics' on a single FastText model


View source: R/forensic_fasttext.R

Description

Use this function to investigate where your chosen set of parameters for a FastText model is lacking. Provides:

Usage

forensic_fasttext(k, texts, text_ids, labels, parameters, seed)

Arguments

k

The number of folds used in k-fold cross-validation for each combination of parameters to test. Defaults to 5.

texts

The texts given by the user to classify later.

labels

The labels for the texts given by the user, used to train the FastText model.

parameters

A data frame that contains all the different combinations of parameters for a FastText model. Must include the following columns:

  • lr = learning rate

  • epoch = # of epochs

  • dim = dimensions

  • ws = window size

  • wordNgrams = word n-grams

  • minn = min of character n-grams

  • maxn = max of character n-grams
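Because every column listed above is required, it can help to check a parameter data frame before passing it in. The following is a minimal sketch in base R; `check_parameters` and `required_cols` are hypothetical names used only for illustration and are not part of textwhiz.

```r
# Hypothetical helper, not part of textwhiz: verifies that a parameter
# data frame has all the columns listed in the documentation above.
required_cols <- c("lr", "epoch", "dim", "ws", "wordNgrams", "minn", "maxn")

check_parameters <- function(parameters) {
  missing_cols <- setdiff(required_cols, names(parameters))
  if (length(missing_cols) > 0) {
    stop("parameters is missing: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# A one-row grid that contains every required column.
params <- expand.grid(lr = 0.5, epoch = 30, dim = 100, ws = 4,
                      wordNgrams = 2, minn = 2, maxn = 6)
check_parameters(params)
```

Running the check first surfaces a misspelled or missing column name before any model is trained.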

seed

A number passed to set.seed when partitioning the data, so that the model results are reproducible.
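To illustrate how a seed makes a k-fold partition repeatable, here is a minimal base R sketch. It is an assumption about the general technique, not the package's actual partitioning code; `make_folds` is a hypothetical helper.

```r
# Sketch only: textwhiz may partition the data differently.
make_folds <- function(n, k = 5, seed = 123) {
  set.seed(seed)                            # fix the RNG so the shuffle is repeatable
  sample(rep(seq_len(k), length.out = n))   # shuffled fold labels 1..k
}

folds_a <- make_folds(n = 20, k = 5, seed = 123)
folds_b <- make_folds(n = 20, k = 5, seed = 123)

identical(folds_a, folds_b)  # the same seed gives the same partition
table(folds_a)               # 20 rows split evenly across 5 folds
```

Changing the seed changes which rows land in which fold, which is why results are only comparable across runs when the seed is held fixed.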

text_ids

The IDs of the texts, used to produce cleanly formatted output.

Value

A dataframe with the average and SD accuracy for each row of parameters.
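As a rough illustration of what "average and SD accuracy" means for one row of parameters, the sketch below summarizes a set of per-fold accuracies in base R. The accuracy values and the column names `mean_accuracy` and `sd_accuracy` are made up for the example; the actual column names returned by the package are not stated here.

```r
# Made-up per-fold accuracies for one parameter row (illustration only).
fold_accuracy <- c(0.80, 0.85, 0.75, 0.90, 0.70)

summary_row <- data.frame(
  mean_accuracy = mean(fold_accuracy),
  sd_accuracy   = sd(fold_accuracy)   # sample SD (n - 1 denominator)
)
summary_row
```

A high standard deviation relative to the mean suggests the parameter set performs unevenly across folds.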

Examples

fast.text.parameters <- expand.grid(
  lr = seq(4, 4.3, 0.5),
  epoch = seq(30, 33, 10),
  dim = seq(100, 120, 25),
  ws = seq(4, 6, 2),
  wordNgrams = 2,
  minn = 2,
  maxn = 6
  )

forensic_fasttext(k = 5,
                  texts = df$mytext,
                  text_ids = df$text_id,
                  labels = df$topic,
                  parameters = fast.text.parameters,
                  seed = 123)

jcgonzalez14/textwhiz documentation built on Aug. 26, 2020, 9:39 a.m.