run_LDA: Run Latent Dirichlet Allocation on Tabular Data

View source: R/analysis_LDA.R

Description

Fit Latent Dirichlet Allocation (LDA) models to the data with different numbers of topics (from 2 to max_topics), select the best model using AIC, and return the selected model object.
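The AIC-based selection step can be illustrated with a minimal base R sketch. The log-likelihoods and parameter counts below are made-up numbers, and the `candidates` table is a hypothetical stand-in; how the package itself computes these quantities is not shown here.

```r
# Hypothetical fit summaries for candidate models with 2 to max_topics topics.
# All numbers are invented for illustration only.
max_topics <- 6
candidates <- data.frame(
  topics   = 2:max_topics,
  logLik   = c(-1520.4, -1489.7, -1475.2, -1470.9, -1468.3),
  n_params = c(40, 60, 80, 100, 120)
)

# AIC = 2 * (number of parameters) - 2 * (log-likelihood); lower is better.
candidates$AIC <- 2 * candidates$n_params - 2 * candidates$logLik

# Keep the candidate with the smallest AIC.
best <- candidates[which.min(candidates$AIC), ]
print(best)
```

With these invented values, the 3-topic candidate has the lowest AIC: adding topics beyond that improves the log-likelihood by less than the penalty for the extra parameters.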

Usage

run_LDA(data, max_topics = 6, nseeds = 200, control = list())

Arguments

data

a data.frame or tibble; each row is an observation (e.g. in time or space), and each column is a variable. Here, the common usage is for each column to be a species or taxon, and each row to be an observed sample. In the original specification for LDA, each row is a document, and each column is a word, with the entries being the counts of the words in each document.
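As a rough illustration of the expected layout, the following sketch builds a small count table in base R. The column names and counts are hypothetical and chosen only to show the shape: samples in rows, taxa in columns.

```r
# Hypothetical abundance table: 5 samples (rows) by 4 taxa (columns).
# Each entry is the count of a taxon in a sample, analogous to the
# count of a word in a document.
abundance <- data.frame(
  species_a = c(12L, 9L, 0L, 3L, 7L),
  species_b = c(0L, 4L, 15L, 11L, 2L),
  species_c = c(5L, 5L, 1L, 0L, 8L),
  species_d = c(1L, 0L, 6L, 9L, 3L)
)

# Total count per sample (the "document length" in the text analogy).
sample_totals <- rowSums(abundance)
print(sample_totals)
```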

max_topics

the maximum number of topics to try (the function will test a number of topics from 2 to max_topics)

nseeds

Number of seeds (replicate starts) to use for each number of topics. Must be conformable to an integer value (that is, a whole number).

control

A list of parameters to control the running and selecting of LDA models. Values not supplied take the defaults set by LDA_set_control. Values for running the LDAs replace the defaults in LDAcontrol (see LDA); if seed is given, it will be overwritten, so use iseed instead.

Value

the best fit model object, from running LDATS::parLDA()
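A call might look like the sketch below. It assumes `run_LDA()` has already been made available by loading the package that provides it, and it runs the call only in that case. The `counts` table is hypothetical, and the `iseed` value is an arbitrary placeholder; its exact interpretation is determined by LDA_set_control and is not documented here.

```r
# Small hypothetical count table (rows = samples, columns = taxa).
counts <- data.frame(
  taxon_1 = c(10L, 8L, 1L, 0L, 2L, 9L),
  taxon_2 = c(0L, 2L, 12L, 14L, 9L, 1L),
  taxon_3 = c(4L, 5L, 3L, 2L, 6L, 5L)
)

# Control list: per the argument description, supply `iseed` rather than
# `seed`. The value 2 is an arbitrary placeholder (assumption).
ctrl <- list(iseed = 2)

# Call run_LDA() only when it is available in the current session.
if (exists("run_LDA", mode = "function")) {
  best_model <- run_LDA(counts, max_topics = 3, nseeds = 4, control = ctrl)
  print(class(best_model))
} else {
  message("run_LDA() not found; load the package that provides it first.")
}
```

Small values of max_topics and nseeds are used here only to keep the sketch quick; the defaults in the usage signature are 6 and 200.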


weecology/MATSS-LDATS documentation built on Nov. 5, 2019, 12:07 p.m.