reduce_dtm_lognet reduces the number of terms (columns) of a labeled document-term matrix. reduce_dtm_lognet is called by the reduce_dtm function.
reduce_dtm_lognet(dtm, classes, SEED, c_normalize = TRUE, export = FALSE)
dtm | a document-term matrix in term frequency format.
classes | factor, the labeling variable.
SEED | integer, the random seed for selecting train and test set.
c_normalize | a Boolean value indicating whether the
export | logical. If
This function applies the lognet method, a logistic classification method from the glmnet package, to a labeled document-term matrix.
If c_normalize = TRUE (default), the input dtm is passed to the wTfIdf function for cosine normalization.
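As a rough illustration of what cosine normalization does to a term-frequency matrix, the base R sketch below rescales each document (row) to unit Euclidean length. It does not call wTfIdf, whose weighting details are not shown on this page; the helper name cosine_normalize and the toy matrix are assumptions made purely for illustration.

```r
# Hypothetical helper (not part of the package): scale each row of a
# dense term-frequency matrix to unit Euclidean (L2) length.
cosine_normalize <- function(m) {
  norms <- sqrt(rowSums(m^2))
  norms[norms == 0] <- 1  # leave all-zero documents unchanged
  m / norms               # recycling divides row i by norms[i]
}

# Toy document-term matrix: 3 documents, 2 terms (made-up counts).
tf <- matrix(c(3, 0, 1,
               4, 2, 1),
             nrow = 3,
             dimnames = list(paste0("doc", 1:3), c("cat", "dog")))

tf_norm <- cosine_normalize(tf)
rowSums(tf_norm^2)  # squared length of each non-empty document
```

After this step, long and short documents contribute on a comparable scale, which is the usual motivation for normalizing before fitting a penalized model.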
The number of terms is reduced by keeping only the columns that correspond to the nonzero beta coefficients in the optimal fit.
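The column selection described above might look roughly like the following sketch, which calls glmnet directly on simulated data. It is not the package's own code: the simulated matrix, the two-level labels, and the arbitrary choice of a lambda on the path are all assumptions for illustration.

```r
library(glmnet)

set.seed(1)  # arbitrary seed for the simulated data
n <- 100; p <- 20
x <- matrix(rpois(n * p, lambda = 2), nrow = n,
            dimnames = list(NULL, paste0("term", seq_len(p))))
# Simulated two-class labels that depend on the first two terms only.
y <- factor(ifelse(x[, 1] - x[, 2] + rnorm(n) > 0, "a", "b"))

# A two-class response with family = "binomial" yields a "lognet" fit.
fit <- glmnet(x, y, family = "binomial", alpha = 1)

# Pick one lambda on the path (arbitrary here, for illustration only).
s <- fit$lambda[min(10, length(fit$lambda))]
beta <- as.matrix(coef(fit, s = s))[-1, 1]  # drop the intercept

# Keep only the columns whose coefficient is nonzero.
keep <- names(beta)[beta != 0]
x_reduced <- x[, keep, drop = FALSE]
dim(x_reduced)
```

Using drop = FALSE keeps the result a matrix even when a single term survives the selection.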
A list with the reduced dtm (in term frequency format) and the train and test misclassification errors, err0.train and err0.test.
The confusion matrix is also returned.
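For readers unfamiliar with these quantities, the base R sketch below computes a misclassification error and a confusion matrix from made-up true and predicted labels. The label values and the row/column layout of the table are assumptions; the layout used by the returned confusion matrix is not specified on this page.

```r
# Toy true and predicted labels (made up for illustration).
truth <- factor(c("spam", "ham", "spam", "ham", "ham", "spam"))
pred  <- factor(c("spam", "ham", "ham",  "ham", "spam", "spam"),
                levels = levels(truth))

# Misclassification error: share of documents with a wrong predicted label.
err <- mean(pred != truth)

# Confusion matrix: rows are predicted labels, columns are true labels.
conf_mat <- table(predicted = pred, actual = truth)

err
conf_mat
```

The same computation applied separately to the training and test documents gives the two error rates.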
alpha and lambda are tuning parameters of the lognet method: alpha = 1 (default), and the best lambda value, corresponding to the optimal fit, is the one associated with the minimum training error.
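One way to picture this selection rule is the sketch below, which fits a glmnet path with alpha = 1 and picks the lambda with the smallest training misclassification error. The simulated data, the 70/30 split, and all variable names are assumptions; the function's actual split and tie-breaking are not documented here.

```r
library(glmnet)

set.seed(2)  # arbitrary seed for the simulated data and the split
n <- 120; p <- 15
x <- matrix(rpois(n * p, lambda = 2), nrow = n,
            dimnames = list(NULL, paste0("term", seq_len(p))))
y <- factor(ifelse(x[, 1] - x[, 2] + rnorm(n) > 0, "a", "b"))

# Hold out part of the data; the 70/30 split is an assumption.
train <- sample(n, size = round(0.7 * n))

# alpha = 1 is the lasso penalty; glmnet fits a whole path of lambda values.
fit <- glmnet(x[train, ], y[train], family = "binomial", alpha = 1)

# Predicted classes on the training set, one column per lambda.
pred_train <- predict(fit, newx = x[train, ], type = "class")

# Training misclassification error for each lambda on the path.
err_train <- colMeans(pred_train != as.character(y[train]))

# Lambda with the smallest training error (first one in case of ties).
best <- which.min(err_train)
best_lambda <- fit$lambda[best]

# Test error of the fit at that lambda.
pred_test <- predict(fit, newx = x[-train, ], s = best_lambda, type = "class")
err_test <- mean(pred_test != as.character(y[-train]))
c(lambda = best_lambda, train = err_train[[best]], test = err_test)
```

Comparing the train and test values in the last line gives a quick sense of how well the selected fit generalizes.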