Trains a multinomial logistic regression model of class maxent-class given a matrix or matrix.csr with training data, and a vector or factor with corresponding labels. Additional parameters such as feature_cutoff, gaussian_prior, inequality_constraints, and set_heldout help prevent model overfitting.
feature_matrix | A DocumentTermMatrix or TermDocumentMatrix (package tm), Matrix (package Matrix), matrix.csr (SparseM), data.frame, or matrix.
code_vector | A
l1_regularizer | An
l2_regularizer | An
use_sgd | A
set_heldout | An
verbose | A
Yoshimasa Tsuruoka recommends using one of the following three methods if you see overfitting.
1. Set the l1_regularizer parameter to 1.0, leaving l2_regularizer and set_heldout at their defaults.
2. Set the l2_regularizer parameter to 1.0, leaving l1_regularizer and set_heldout at their defaults.
3. Set the set_heldout parameter to hold out a portion of your data, leaving l1_regularizer and l2_regularizer at their defaults.
If you are using a large number of training samples, try setting the use_sgd parameter to TRUE.
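As a rough illustration of the options above, the following sketch trains the same model four ways on the NYTimes sample data shipped with the package. It loads tm explicitly for Corpus, VectorSource, and DocumentTermMatrix, and it assumes that set_heldout takes an integer count of documents to hold out; the value 20 is an arbitrary choice for illustration, not a recommendation.

```r
library(maxent)
library(tm)  # provides Corpus, VectorSource, DocumentTermMatrix

# Build a sparse training matrix from the bundled NYTimes sample.
data <- read.csv(system.file("data/NYTimes.csv.gz", package = "maxent"))
corpus <- Corpus(VectorSource(data$Title[1:150]))
dtm <- DocumentTermMatrix(corpus)
sparse <- as.compressed.matrix(dtm)

train_x <- sparse[1:100, ]
train_y <- as.factor(data$Topic.Code)[1:100]

# Option 1: L1 regularization only.
m_l1 <- maxent(train_x, train_y, l1_regularizer = 1.0)

# Option 2: L2 regularization only.
m_l2 <- maxent(train_x, train_y, l2_regularizer = 1.0)

# Option 3: held-out data only.
# Assumption: set_heldout is an integer number of documents.
m_heldout <- maxent(train_x, train_y, set_heldout = 20)

# For large training sets: enable SGD-based parameter estimation.
m_sgd <- maxent(train_x, train_y, use_sgd = TRUE)
```

Each call changes a single argument and leaves the others at their defaults, mirroring the recommendation to try one method at a time.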
Returns an object of class maxent-class with two slots.
model | A
weights | A
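A minimal sketch of inspecting the returned object follows. It assumes the result is an S4 object whose slots can be read with the `@` operator, and it uses str() so that nothing is assumed about the internal structure of either slot.

```r
library(maxent)
library(tm)  # provides Corpus, VectorSource, DocumentTermMatrix

# Train a small model on the bundled NYTimes sample.
data <- read.csv(system.file("data/NYTimes.csv.gz", package = "maxent"))
corpus <- Corpus(VectorSource(data$Title[1:150]))
dtm <- DocumentTermMatrix(corpus)
sparse <- as.compressed.matrix(dtm)
model <- maxent(sparse[1:100, ], as.factor(data$Topic.Code)[1:100])

# List the slots of the returned object.
slotNames(model)

# Summarize each slot without assuming its exact structure.
str(model@model)
str(model@weights)
```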
Timothy P. Jurka <tpjurka@ucdavis.edu>
Y. Tsuruoka. "A simple C++ library for maximum entropy classification." University of Tokyo Department of Computer Science (Tsujii Laboratory), 2011. URL http://www-tsujii.is.s.u-tokyo.ac.jp/~tsuruoka/maxent/.
# LOAD LIBRARY
library(maxent)
# READ THE DATA, PREPARE THE CORPUS, and CREATE THE MATRIX
data <- read.csv(system.file("data/NYTimes.csv.gz",package="maxent"))
corpus <- Corpus(VectorSource(data$Title[1:150]))
matrix <- DocumentTermMatrix(corpus)
# TRAIN USING SPARSEM REPRESENTATION
sparse <- as.compressed.matrix(matrix)
model <- maxent(sparse[1:100,],as.factor(data$Topic.Code)[1:100])
# A DIFFERENT EXAMPLE (taken from package e1071)
# CREATE DATA
x <- seq(0.1, 5, by = 0.05)
y <- log(x) + rnorm(x, sd = 0.2)
# ESTIMATE MODEL AND PREDICT INPUT VALUES
m <- maxent(x, y)
new <- predict(m, x)
# VISUALIZE
plot(x, y)
points(x, log(x), col = 2)
points(x, new[,1], col = 4)