Trains a maximum entropy model given a training matrix and a vector or factor of labels.

Description

Trains a multinomial logistic regression model of class maxent-class given a matrix or matrix.csr with training data, and a vector or factor with corresponding labels. The l1_regularizer, l2_regularizer, and set_heldout parameters help prevent model overfitting.

Usage

maxent(feature_matrix, code_vector, l1_regularizer=0.0, l2_regularizer=0.0,
       use_sgd=FALSE, set_heldout=0, verbose=FALSE)

Arguments

feature_matrix

A DocumentTermMatrix or TermDocumentMatrix (package tm), Matrix (package Matrix), matrix.csr (SparseM), data.frame, or matrix.

code_vector

A factor or vector of labels corresponding to each document in the feature_matrix.

l1_regularizer

A numeric that turns on L1 regularization and sets the regularization parameter. A value of 0 disables L1 regularization.

l2_regularizer

A numeric that turns on L2 regularization and sets the regularization parameter. A value of 0 disables L2 regularization.

use_sgd

A logical indicating that SGD parameter estimation should be used. Defaults to FALSE.

set_heldout

An integer specifying the number of documents to hold out. Sets a held-out subset of your data to test against and prevent overfitting.

verbose

A logical specifying whether to provide descriptive output about the training process. Defaults to FALSE, or no output.

Details

Yoshimasa Tsuruoka recommends using one of the following three methods if you see overfitting.

1. Set the l1_regularizer parameter to 1.0, leaving l2_regularizer and set_heldout as default.

2. Set the l2_regularizer parameter to 1.0, leaving l1_regularizer and set_heldout as default.

3. Set the set_heldout parameter to hold out a portion of your data, leaving l1_regularizer and l2_regularizer as default.

If you are using a large number of training samples, try setting the use_sgd parameter to TRUE.
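
The options above can be sketched as separate calls. This is a minimal illustration, not a recommended configuration: the toy matrix, its column names, and the labels are invented here, and the calls run only if the maxent package is installed.

```r
# Toy data invented for illustration: 6 documents, 3 term-count features.
features <- matrix(c(2, 0, 1,
                     3, 1, 0,
                     0, 2, 3,
                     1, 3, 2,
                     2, 1, 0,
                     0, 3, 1),
                   ncol = 3, byrow = TRUE,
                   dimnames = list(NULL, c("tax", "game", "vote")))
labels <- factor(c("politics", "politics", "sports",
                   "sports", "politics", "sports"))

# Guarded so the sketch is harmless when maxent is not available.
if (requireNamespace("maxent", quietly = TRUE)) {
  library(maxent)

  # Method 1: L1 regularization only.
  m_l1 <- maxent(features, labels, l1_regularizer = 1.0)

  # Method 2: L2 regularization only.
  m_l2 <- maxent(features, labels, l2_regularizer = 1.0)

  # Method 3: hold out 2 documents, no regularization.
  m_heldout <- maxent(features, labels, set_heldout = 2)

  # Large training sets: switch to SGD parameter estimation.
  m_sgd <- maxent(features, labels, use_sgd = TRUE)
}
```

Each call changes one parameter and leaves the others at their defaults, which matches the "one of the following three methods" wording above.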

Value

Returns an object of class maxent-class with two slots.

model

A character vector containing the trained maximum entropy model.

weights

A data.frame listing all the weights in three columns: Weight, Label, and Feature.
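
Because the result is described as an object with slots, the weights table would be read with the `@` operator, for example `model@weights`. The sketch below shows one way to inspect such a table. The helper `top_weights` is hypothetical (not part of the package), and `example_weights` is an invented stand-in that only mimics the documented Weight, Label, and Feature columns.

```r
# Hypothetical helper: return the n rows with the largest absolute weight.
# Assumes a data.frame with the documented columns Weight, Label, Feature.
top_weights <- function(weights, n = 5) {
  # Convert defensively in case Weight is stored as character or factor.
  w <- as.numeric(as.character(weights$Weight))
  ranked <- weights[order(abs(w), decreasing = TRUE), , drop = FALSE]
  head(ranked, n)
}

# Invented stand-in for a trained model's weights slot.
example_weights <- data.frame(
  Weight  = c(0.12, -1.40, 0.85),
  Label   = c("politics", "sports", "politics"),
  Feature = c("tax", "game", "vote")
)

top <- top_weights(example_weights, n = 2)
print(top)

# With a trained model, the equivalent call would be:
# top_weights(model@weights)
```

Sorting by absolute value keeps strongly negative weights visible alongside strongly positive ones.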

Author(s)

Timothy P. Jurka <tpjurka@ucdavis.edu>

References

Y. Tsuruoka. "A simple C++ library for maximum entropy classification." University of Tokyo Department of Computer Science (Tsujii Laboratory), 2011. URL http://www-tsujii.is.s.u-tokyo.ac.jp/~tsuruoka/maxent/.

Examples

# LOAD LIBRARY
library(maxent)

# READ THE DATA, PREPARE THE CORPUS, and CREATE THE MATRIX
data <- read.csv(system.file("data/NYTimes.csv.gz",package="maxent"))
corpus <- Corpus(VectorSource(data$Title[1:150]))
matrix <- DocumentTermMatrix(corpus)

# TRAIN USING SPARSEM REPRESENTATION
sparse <- as.compressed.matrix(matrix)
model <- maxent(sparse[1:100,],as.factor(data$Topic.Code)[1:100])

# A DIFFERENT EXAMPLE (taken from package e1071)
# CREATE DATA
x <- seq(0.1, 5, by = 0.05)
y <- log(x) + rnorm(x, sd = 0.2)

# ESTIMATE MODEL AND PREDICT INPUT VALUES
m <- maxent(x, y)
new <- predict(m, x)

# VISUALIZE
plot(x, y)
points(x, log(x), col = 2)
points(x, new[,1], col = 4)