textmodel_svm: Linear SVM classifier for texts
In quanteda.textmodels: Scaling Models and Classifiers for Textual Data

Description

Fit a fast linear SVM classifier for texts, using the LiblineaR package.

Usage

textmodel_svm(
  x,
  y,
  weight = c("uniform", "docfreq", "termfreq"),
  type = 1,
  ...
)

Arguments

`x`	the dfm on which the model will be fit. Does not need to contain only the training documents.
`y`	vector of training labels, one for each document in `x`; documents labelled `NA` are left out of training. (Labels will be converted to a factor if not already one.)
`weight`	class weights for imbalanced training sets, passed to `wi` in `LiblineaR::LiblineaR()`. `"uniform"` uses the `LiblineaR` default; `"docfreq"` weights classes by their number of training documents; and `"termfreq"` weights them by the relative sizes of the training classes in terms of their total lengths in tokens.
`type`	argument passed to the `type` argument in `LiblineaR::LiblineaR()`; default is `1` for L2-regularized L2-loss support vector classification (dual)
`...`	additional arguments passed to `LiblineaR::LiblineaR()`
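
As a rough sketch of how `weight` and `type` might be varied, the call below reuses the Irish budget data and labels from the Examples section. The object names `labels` and `tmod_weighted` are illustrative. The final line is only an assumption about what the `"docfreq"` description amounts to (class shares of labelled documents), not a statement of the internal computation.

```r
library("quanteda")
library("quanteda.textmodels")

dfmat <- dfm(tokens(data_corpus_irishbudget2010))

# same government/opposition labels as in the Examples; NA marks unlabelled documents
labels <- c(rep(NA, 4), "Gov", "Opp", NA, "Opp", NA, NA, NA, NA, NA, NA)

# class weights based on training document counts, with the primal solver
# (type = 2 is L2-regularized L2-loss support vector classification, primal, in LiblineaR)
tmod_weighted <- textmodel_svm(dfmat, y = labels, weight = "docfreq", type = 2)

# share of labelled documents per class (table() drops NA by default);
# assumed here to approximate what "docfreq" weighting is based on
prop.table(table(labels))
```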

Value

an object of class textmodel_svm, a list containing:

`x`, `y`, `weights`, `type`	argument values from the call parameters
`algorithm`	character label of the algorithm used in the call to `LiblineaR::LiblineaR()`
`classnames`	levels of `y`
`bias`	the value of `Bias` returned from `LiblineaR::LiblineaR()`
`svmlinfitted`	the fitted model object passed from the call to `LiblineaR::LiblineaR()`
`call`	the model call
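
To make the list structure concrete, a minimal sketch of inspecting a fitted object follows. It refits the first model from the Examples section; only the component names listed above are taken from this page, and the object name `labels` is illustrative.

```r
library("quanteda")
library("quanteda.textmodels")

dfmat <- dfm(tokens(data_corpus_irishbudget2010))
labels <- c(rep(NA, 4), "Gov", "Opp", NA, "Opp", NA, NA, NA, NA, NA, NA)
tmod <- textmodel_svm(dfmat, y = labels)

class(tmod)       # includes "textmodel_svm"
tmod$classnames   # levels of y
tmod$algorithm    # label of the LiblineaR algorithm selected by `type`
tmod$bias         # Bias value reported by LiblineaR
tmod$call         # the model call
```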

References

R. E. Fan, K. W. Chang, C. J. Hsieh, X. R. Wang, and C. J. Lin. (2008) LIBLINEAR: A Library for Large Linear Classification. Journal of Machine Learning Research 9: 1871-1874. https://www.csie.ntu.edu.tw/~cjlin/liblinear/.

Examples

# use party leaders for govt and opposition classes
library("quanteda")
library("quanteda.textmodels")  # provides textmodel_svm() and data_corpus_irishbudget2010
docvars(data_corpus_irishbudget2010, "govtopp") <-
    c(rep(NA, 4), "Gov", "Opp", NA, "Opp", NA, NA, NA, NA, NA, NA)
dfmat <- dfm(tokens(data_corpus_irishbudget2010))
tmod <- textmodel_svm(dfmat, y = dfmat$govtopp)
predict(tmod)

# multiclass problem - all party leaders
tmod2 <- textmodel_svm(dfmat,
    y = c(rep(NA, 3), "SF", "FF", "FG", NA, "LAB", NA, NA, "Green", rep(NA, 3)))
predict(tmod2)
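
Predictions can also be limited to documents that were not used for training. The sketch below assumes the `newdata` argument of the `predict()` method for `textmodel_svm` objects; the names `labels` and `pred_unlabelled` are illustrative.

```r
library("quanteda")
library("quanteda.textmodels")

dfmat <- dfm(tokens(data_corpus_irishbudget2010))
labels <- c(rep(NA, 4), "Gov", "Opp", NA, "Opp", NA, NA, NA, NA, NA, NA)
tmod <- textmodel_svm(dfmat, y = labels)

# predict only for the documents without a training label
pred_unlabelled <- predict(tmod, newdata = dfmat[is.na(labels), ])

# how many unlabelled documents fall into each predicted class
table(pred_unlabelled)
```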

quanteda.textmodels documentation built on Sept. 11, 2024, 8:19 p.m.