new_simple_lda_config: new_simple_lda_config


View source: R/pcldar.R

Description

Create a new LDA config file

Usage

new_simple_lda_config(dataset_fn, nr_topics = 20, alpha = 0.01,
  beta = (nr_topics/50), iterations = 2000, rareword_threshold = 10,
  optim_interval = -1, stoplist_fn = "stoplist.txt",
  topic_interval = 10, tmpdir = "/tmp",
  topic_priors = "stoplist.txt")
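
A minimal sketch of a call, assuming the pcldar package is installed and that a dataset file exists. The file name `corpus.txt` and the values chosen for `nr_topics` and `iterations` are hypothetical. The call is guarded, so the snippet does nothing when the package or the file is unavailable.

```r
# Hypothetical input file; replace with a real dataset in LDA format.
dataset_fn <- "corpus.txt"

# Arguments overridden here; everything else keeps the defaults shown in Usage.
args <- list(dataset_fn = dataset_fn, nr_topics = 50, iterations = 1000)

# R evaluates default arguments lazily, so the default beta = (nr_topics/50)
# is computed from the nr_topics value actually passed: 50 / 50 = 1.
default_beta <- args$nr_topics / 50

# Only attempt the call when the package and the dataset are both available.
if (requireNamespace("pcldar", quietly = TRUE) && file.exists(dataset_fn)) {
  cfg <- do.call(pcldar::new_simple_lda_config, args)
}
```

Passing `beta` explicitly overrides the default that is derived from `nr_topics`.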

Arguments

dataset_fn

filename of dataset (in LDA format)

nr_topics

number of topics to use

alpha

symmetric alpha prior

beta

symmetric beta prior

iterations

number of iterations to sample

rareword_threshold

minimum number of occurrences a word needs in order to be kept

optim_interval

how often to run hyperparameter optimization (default -1, which turns it off)

stoplist_fn

filename of stoplist file (one word per line) (default "stoplist.txt")

topic_interval

how often to print topic info during sampling

tmpdir

temporary directory for intermediate storage of logging data (default "/tmp")

topic_priors

text file with a 'prior spec', one topic per line, in the format: <topic nr (zero-indexed)>, <word1>, <word2>, etc.
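
The two file-based arguments above, `stoplist_fn` and `topic_priors`, take plain text files. The following sketch writes both with base R, following the formats described in the argument list. The words, topic numbers, and file names are hypothetical and chosen only for illustration.

```r
# Hypothetical stoplist: one word per line.
stoplist_fn <- file.path(tempdir(), "stoplist.txt")
writeLines(c("the", "and", "of"), stoplist_fn)

# Hypothetical seed words for two topics (topic numbers are zero-indexed).
priors <- list(
  "0" = c("gene", "protein", "cell"),
  "1" = c("market", "stock", "price")
)

# Build one line per topic: "<topic nr>, <word1>, <word2>, ..."
prior_lines <- vapply(
  names(priors),
  function(k) paste(c(k, priors[[k]]), collapse = ", "),
  character(1)
)

# Write the prior spec to a temporary file.
priors_fn <- file.path(tempdir(), "topic_priors.txt")
writeLines(prior_lines, priors_fn)
```

The resulting paths can then be supplied as `stoplist_fn = stoplist_fn` and `topic_priors = priors_fn`.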


lejon/pcldar documentation built on Feb. 23, 2020, 3:11 p.m.