LatentSemanticAnalysis: Latent Semantic Analysis model

Description

Creates an LSA (latent semantic analysis) model. See https://en.wikipedia.org/wiki/Latent_semantic_analysis for details.

Format

R6Class object.

Usage

For usage details see Methods, Arguments and Examples sections.

lsa = LatentSemanticAnalysis$new(n_topics, method = c("randomized", "irlba"))
lsa$fit_transform(x, ...)
lsa$transform(x, ...)
lsa$components

Methods

$new(n_topics, method = c("randomized", "irlba"))

create an LSA model with n_topics latent topics, using the SVD algorithm selected by method

$fit_transform(x, ...)

fit model to an input sparse matrix (preferably in dgCMatrix format) and then transform x to latent space

$transform(x, ...)

transform new data x to latent space

Arguments

lsa

An LSA object.

x

An input document-term matrix, preferably in dgCMatrix format.

n_topics

integer, the desired number of latent topics.

method

character, one of c("randomized", "irlba"). Defines the underlying SVD algorithm. For very large data, "randomized" is usually faster and more accurate.

...

Arguments passed on to internal functions. Notably useful for fit_transform(): these arguments are passed to the irlba or svdr functions, which serve as the backend for SVD.

Examples

library(text2vec)
data("movie_review")
N = 100
tokens = word_tokenizer(tolower(movie_review$review[1:N]))
dtm = create_dtm(itoken(tokens), hash_vectorizer())
n_topics = 10
lsa_1 = LatentSemanticAnalysis$new(n_topics)
d1 = lsa_1$fit_transform(dtm)
# the same, but wrapped with S3 methods
d2 = fit_transform(dtm, lsa_1)
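
The Methods section also lists $transform(x, ...) for projecting new data into the fitted latent space, which the example above does not exercise. A minimal sketch follows, assuming the new documents are vectorized with the same hash_vectorizer() settings so that both matrices have the same columns. The names tokens_new, dtm_new and d_new, and the choice of reviews 101 to 110 as "new" documents, are illustrative only.

```r
library(text2vec)
data("movie_review")

# Fit the model on the first 100 reviews, as in the example above
tokens = word_tokenizer(tolower(movie_review$review[1:100]))
dtm = create_dtm(itoken(tokens), hash_vectorizer())
n_topics = 10
lsa_1 = LatentSemanticAnalysis$new(n_topics)
d1 = lsa_1$fit_transform(dtm)

# Vectorize unseen reviews with the same hashing scheme (illustrative choice of rows)
tokens_new = word_tokenizer(tolower(movie_review$review[101:110]))
dtm_new = create_dtm(itoken(tokens_new), hash_vectorizer())

# Project the new documents into the latent space learned above
d_new = lsa_1$transform(dtm_new)
dim(d_new)
```

Feature hashing keeps the column layout fixed across calls, which is why the new matrix is compatible with the fitted model here. With a vocabulary-based vectorizer, the same vectorizer object would need to be reused for the new documents.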

text2vec documentation built on Jan. 12, 2018, 1:04 a.m.