addLDA: Latent Dirichlet Allocation

addLDA R Documentation

Latent Dirichlet Allocation

Description

getLDA and addLDA perform Latent Dirichlet Allocation (LDA) on an assay stored in a TreeSummarizedExperiment object.

Usage

getLDA(x, ...)

addLDA(x, ...)

## S4 method for signature 'SummarizedExperiment'
getLDA(x, k = 2, assay.type = "counts", eval.metric = "perplexity", ...)

## S4 method for signature 'SummarizedExperiment'
addLDA(x, k = 2, assay.type = "counts", name = "LDA", ...)

Arguments

x

a TreeSummarizedExperiment object.

...

Optional arguments passed to topicmodels::LDA.

k

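Integer vector. The number of latent vectors/topics. If several values are given, a model is fitted for each value and the best one is selected according to eval.metric. (Default: 2)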

assay.type

Character scalar. Specifies which assay to use for LDA ordination. (Default: "counts")

eval.metric

Character scalar. Specifies the evaluation metric used to select the best-fitting model. Must be either "perplexity" (topicmodels::perplexity) or "coherence" (topicdoc::topic_coherence; the best model is selected based on mean coherence). (Default: "perplexity")

name

Character scalar. The name to be used to store the result in the reducedDims of the output. (Default: "LDA")

Details

getLDA and addLDA internally use topicmodels::LDA to compute the ordination matrix and the feature loadings.
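The following is a minimal sketch of the kind of computation described above, not the package's actual implementation. It assumes the topicmodels package is installed, uses a made-up count matrix (`counts`) in place of a real assay, and assumes that the model with the lowest perplexity is the one selected. All variable names are hypothetical.

```r
library(topicmodels)

# Hypothetical toy data: 6 samples (rows) x 8 features (columns) of counts.
set.seed(1)
counts <- matrix(
    rpois(48, lambda = 5) + 1L, nrow = 6,
    dimnames = list(paste0("sample", 1:6), paste0("taxon", 1:8))
)

# Fit one model per candidate number of topics.
ks <- 2:3
models <- lapply(ks, function(k) LDA(counts, k = k, control = list(seed = 1)))

# Assumption: the model with the lowest perplexity is considered the best fit.
perp <- vapply(models, perplexity, numeric(1))
best <- models[[which.min(perp)]]

# Posterior distributions of the selected model.
post <- posterior(best)
scores <- post$topics       # samples x topics (ordination matrix)
loadings <- t(post$terms)   # features x topics (feature loadings)
```

In this sketch each row of `scores` is a sample's distribution over topics, and each column of `loadings` is a topic's distribution over features.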

Value

For getLDA, the ordination matrix, with the feature loadings matrix attached as attribute "loadings".

For addLDA, a TreeSummarizedExperiment object containing the ordination matrix in reducedDim(..., name), with the feature loadings matrix attached as attribute "loadings" and the evaluation metrics of the fitted models attached as attribute "eval_metrics".
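To illustrate the shape of such a return value, the base R sketch below attaches a loadings matrix to an ordination matrix as an attribute. Both matrices and their values are invented for illustration and do not come from a fitted model.

```r
# Hypothetical stand-ins for the two matrices (values are made up).
scores <- matrix(
    c(0.8, 0.2, 0.3, 0.7), nrow = 2, byrow = TRUE,
    dimnames = list(c("sample1", "sample2"), c("1", "2"))
)
loadings <- matrix(
    c(0.6, 0.1, 0.4, 0.9), nrow = 2,
    dimnames = list(c("taxonA", "taxonB"), c("1", "2"))
)

# Attach the loadings to the ordination matrix as an attribute.
attr(scores, "loadings") <- loadings

# The object is still a matrix; the loadings travel with it.
is.matrix(scores)
attr(scores, "loadings")

# Subsetting with `[` drops custom attributes.
subset_scores <- scores[1, , drop = FALSE]
is.null(attr(subset_scores, "loadings"))
```

Because subsetting with `[` drops custom attributes, it is safer to extract the loadings with attr() before subsetting the ordination matrix.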

Examples

library(mia)
data(GlobalPatterns)
tse <- GlobalPatterns

# Reduce the number of features
tse <- agglomerateByPrevalence(tse, rank = "Phylum")

# Run LDA and add the result to reducedDim(tse, "LDA")
tse <- addLDA(tse)

# Extract feature loadings
loadings <- attr(reducedDim(tse, "LDA"), "loadings")
head(loadings)

# Estimate models with number of topics from 2 to 10
tse <- addLDA(tse, k = 2:10, name = "LDA_10")
# Get the evaluation metrics
tab <- attr(reducedDim(tse, "LDA_10"), "eval_metrics")
# Plot
plot(tab[["k"]], tab[["perplexity"]], xlab = "k", ylab = "perplexity")
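The examples above only use addLDA. As a complementary sketch, getLDA can be called when only the ordination matrix is needed, without modifying the object. The example assumes the mia package and its GlobalPatterns dataset are available. The names `tse_phylum` and `lda`, and the choice of k = 3, are arbitrary.

```r
library(mia)
data(GlobalPatterns)

# Reduce the number of features, as in the examples above
tse_phylum <- agglomerateByPrevalence(GlobalPatterns, rank = "Phylum")

# Compute the ordination without storing it in the object
lda <- getLDA(tse_phylum, k = 3)

# One row per sample
dim(lda)

# Feature loadings are attached as an attribute
head(attr(lda, "loadings"))
```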

FelixErnst/mia documentation built on Nov. 18, 2024, 5:02 a.m.