vwtrain: Train Vowpal Wabbit model


Description

vwtrain trains a Vowpal Wabbit (VW) model that was created with vwsetup.

Usage

vwtrain(vwmodel, data, readable_model = NULL, readable_model_path = "",
  quiet = FALSE, update_model = FALSE, passes = 1L, cache = FALSE,
  progress = NULL, namespaces = NULL, keep_space = NULL,
  fixed = NULL, targets = NULL, probabilities = NULL,
  weight = NULL, base = NULL, tag = NULL, multiline = NULL)

Arguments

vwmodel

[vw] Model of vw class to train

data

[string or data.frame] Path to training data in plain-text .vw format, or a data.frame. A data.frame is converted to VW format with the df2vw function.

readable_model

[string] Print the trained model in human-readable format ("hashed"), or additionally with human-readable feature names ("inverted").

readable_model_path

[string] Path to the file where the readable model is saved.

quiet

[logical] Do not print anything to the console

update_model

[logical] Update an existing model, when training with new data. FALSE by default.

passes

[int] Number of times the algorithm will cycle over the data (epochs).

cache

[bool] Use a cache for a data file.

progress

[int/real] Progress update frequency. int: additive, real: multiplicative

namespaces

[list or yaml file] For df2vw. The name of each namespace and the variables it contains, given either as an R list or as a YAML file. Example with the iris data set: namespaces = list(sepal = list('Sepal.Length', 'Sepal.Width'), petal = list('Petal.Length', 'Petal.Width')). This creates two namespaces (sepal and petal) containing the features named in each list.

keep_space

[string vector] For df2vw. Keep spaces in the values of these features. Example: "FERRARI 4Si". With keep_space the value stays "FERRARI 4Si" and is treated as two features. Without keep_space it becomes "FERRARI_4Si" and is treated as one feature.

fixed

[string vector] Fixed parsing for these features. Similar to keep_space, but the features are parsed exactly as given, without replacing special characters ("(", ")", "|", ":", "'"). Can be used for LDA ("word_1:2 word_2:3" stays the same), but should be used carefully, because special characters can corrupt the final VW format file.

targets

[string or string vector] For df2vw. If a [string], it is treated as a vector of real-valued labels for the regular VW input format. If a [string vector], it is treated as vectors of class costs for the wap and csoaa multi-class classification algorithms, or as vectors of actions for the Contextual Bandit algorithm.

probabilities

[string vector] For df2vw. Vectors of action probabilities for the Contextual Bandit algorithm.

weight

[string] For df2vw. Weight (importance) of each line of the dataset.

base

[string] For df2vw. Base of each line of the dataset. Used for residual regression.

tag

[string] For df2vw. Tag of each line of the dataset.

multiline

[integer] Number of labels (separate lines) for a multiline example.
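
The df2vw-related arguments above can be combined when data is a data.frame. The following is a minimal sketch, not taken from the package documentation: it assumes that targets takes the name of a label column, and the is_setosa column and train_df name are made up for illustration. The call to vwtrain is guarded so the snippet still runs when rvw is not installed.

```r
# Namespaces for the iris data set, as described for the namespaces argument.
namespaces <- list(
  sepal = list("Sepal.Length", "Sepal.Width"),
  petal = list("Petal.Length", "Petal.Width")
)

# Hypothetical binary label column (1 / -1), added only for this sketch.
train_df <- iris
train_df$is_setosa <- ifelse(train_df$Species == "setosa", 1, -1)

# Sanity check: every feature named in a namespace is a column of the data.
stopifnot(all(unlist(namespaces) %in% names(train_df)))

# Train only if rvw is available. Assumption: targets is a column name.
if (requireNamespace("rvw", quietly = TRUE)) {
  df_vwmodel <- rvw::vwsetup()
  rvw::vwtrain(df_vwmodel, data = train_df,
               namespaces = namespaces, targets = "is_setosa",
               quiet = TRUE)
}
```

Checking the namespace entries against the column names before training catches misspelled feature names early, before the data.frame is converted.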

Examples

ext_train_data <- system.file("extdata", "binary_train.vw", package = "rvw")
test_vwmodel <- vwsetup()
vwtrain(test_vwmodel, data = ext_train_data)
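
A variation on the example above that uses the passes, cache and quiet arguments from the Usage section. This is a sketch under assumptions: the n_passes variable is introduced here for illustration, and cache = TRUE is set because Vowpal Wabbit itself needs a cache file when making more than one pass over the data. Whether rvw enables the cache automatically is not stated on this page. The call is guarded so the snippet runs even when rvw is not installed.

```r
# Number of epochs; passes is documented as an integer.
n_passes <- 3L

if (requireNamespace("rvw", quietly = TRUE)) {
  ext_train_data <- system.file("extdata", "binary_train.vw", package = "rvw")
  multi_pass_model <- rvw::vwsetup()
  # cache = TRUE because multiple passes in VW rely on a cache file.
  rvw::vwtrain(multi_pass_model, data = ext_train_data,
               passes = n_passes, cache = TRUE, quiet = TRUE)
}
```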

ivan-pavlov/rvwgsoc documentation built on July 1, 2019, 9:40 p.m.