mcLDA: Multicore parallel runs of LDA models


Description

mcLDA fits multiple LDA models in parallel, one foreach task per model. The number of topics k is varied over a predefined grid of values, and model selection is performed by calling the internal compClass function.
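As a rough sketch of the approach (not the actual implementation), the parallel loop can be pictured as a foreach over the grid of k values, with each worker fitting one model via topicmodels::LDA; the object names below are illustrative only.

library(doParallel)   # also attaches foreach and parallel
library(topicmodels)

cl <- makeCluster(detectCores())
registerDoParallel(cl)

k.grid <- seq(from = 2, to = 10, by = 2)   # example grid of topic numbers

# One foreach task per value of k; each worker fits a single LDA model.
# 'dtm' is a document-term matrix as passed to mcLDA.
lda.fits <- foreach(k = k.grid, .packages = "topicmodels") %dopar% {
  LDA(dtm, k = k, method = "VEM")
}

stopCluster(cl)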

Usage

mcLDA(dtm, lda.method = c("VEM", "VEM_fixed", "Gibbs"), k.runs = list(from =
  2, to = 2, steps = 1), classes, train.glmnet = FALSE, cv.parallel = FALSE,
  train.parallel = FALSE)

Arguments

dtm

a document-term matrix.

lda.method

character. Approximate posterior inference method; one of "VEM", "VEM_fixed" or "Gibbs".

k.runs

a list with components from, to and steps defining the grid of values for the number of topics k.

classes

factor. The labeling variable for logistic classification.

train.glmnet

logical. If TRUE, Method2 in the internal compClass function is also run. Default is FALSE.

cv.parallel

logical. If TRUE, parallel computation with the maximum number of available cores is used in Method1 of the internal compClass function. Default is FALSE.

train.parallel

logical. If TRUE, parallel computation with the maximum number of available cores is used in Method2 of the internal compClass function. Default is FALSE.

Details

This function fits multiple LDA models and applies logistic classification to each of them via the internal compClass function. A vector of misclassification errors on the test set (e1.test) is returned, and the best model is selected as the one with the minimum misclassification error.
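The selection step itself amounts to picking the model whose test-set error is smallest; a minimal sketch, with hypothetical object names and made-up error values purely for illustration:

# e1.test: one misclassification error per fitted model (values are made up).
e1.test <- c("k=10" = 0.31, "k=15" = 0.27, "k=20" = 0.29, "k=25" = 0.33)

best <- which.min(e1.test)  # index of the best model in the list of fits
e1.test[best]               # its misclassification error on the test set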

Value

a list containing the fitted LDA models and the misclassification errors. The model with the minimum misclassification error, i.e. the best model, is also returned.

Note

By default the doParallel package uses snow-like functionality, which should work fine on Unix-like systems. The current version of mcLDA was developed on a Windows system, where every package used inside the parallel loop must be explicitly exported to each worker core. Output is automatically saved in the directory data/ws/output and a log file is written to the directory log.
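On a snow-type (PSOCK) cluster, which is what doParallel creates on Windows, worker processes do not inherit the packages loaded in the master session; a minimal sketch of this pattern, assuming only standard doParallel/foreach behaviour:

library(doParallel)

cl <- makeCluster(4)        # 4 PSOCK workers, as on a Windows machine
registerDoParallel(cl)

# .packages attaches topicmodels on every worker before the body runs.
res <- foreach(k = c(5, 10), .packages = "topicmodels") %dopar% {
  k  # placeholder body; a real run would fit an LDA model here
}

stopCluster(cl)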

Examples

## Not run: 
library(Supreme)
data("dtm")
data("classes")
dtm.lognet <- reduce_dtm(dtm, method = "lognet", classes = classes)

# Grid k = 10, 15, 20, 25: fit one LDA model per core on a 4-core machine.
mc.lda.models <- mcLDA(dtm.lognet$reduced, lda.method = "VEM",
                       k.runs = list(from = 10, to = 25, steps = 5),
                       classes = classes)

## End(Not run)
