SparseEmbedder: Sparse Embedder (BM25/TF-IDF)

SparseEmbedder    R Documentation

Sparse Embedder (BM25/TF-IDF)

Description

Generates sparse BM25 embeddings for keyword search.

Public fields

vocab

Vocabulary

language

Language setting: "en" (ASCII-focused) or "ml" (Unicode-aware)

Methods

Public methods


Method new()

Create a new SparseEmbedder.

Usage

SparseEmbedder$new(language = "en")

Arguments

language

Language behavior ("en" = ASCII-focused, "ml" = Unicode-aware)
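
The difference between the two language modes can be illustrated with a small base R sketch. It shows one plausible reading of "ASCII-focused" versus "Unicode-aware" tokenization; `tokenize_sketch()` is a hypothetical helper written for this illustration and is not part of VectrixDB, whose actual tokenizer may differ.

```r
# Hypothetical helper (an assumption, not the package's tokenizer).
tokenize_sketch <- function(text, language = c("en", "ml")) {
  language <- match.arg(language)
  # "en": treat everything except ASCII letters and digits as a separator.
  # "ml": treat everything except Unicode letters and digits as a separator.
  sep <- if (language == "en") "[^a-z0-9]+" else "[^\\p{L}\\p{N}]+"
  tokens <- strsplit(tolower(text), sep, perl = TRUE)[[1]]
  tokens[nzchar(tokens)]  # drop the empty token left by a leading separator
}

tokenize_sketch("Sparse, keyword-based search!", "en")
# "sparse" "keyword" "based" "search"

# Accented words stay whole only in the Unicode-aware mode.
tokenize_sketch("na\u00efve caf\u00e9 search", "ml")
```

Under this reading, "en" would split accented words at the non-ASCII character, which is why a Unicode-aware mode matters for multilingual text.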


Method fit()

Fit the embedder on a corpus.

Usage

SparseEmbedder$fit(texts)

Arguments

texts

Character vector of texts
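
As a rough guide to what fitting a BM25 model usually involves, the sketch below collects the corpus statistics that BM25 scoring needs: the vocabulary, per-term document frequencies, the number of documents, and the average document length. `fit_sketch()` and its simple ASCII tokenizer are assumptions made for this illustration; the statistics actually stored by `fit()` are not described on this page.

```r
# Hypothetical sketch of a BM25 fit step (not the package's implementation).
fit_sketch <- function(texts) {
  # Lowercase and split on anything that is not an ASCII letter or digit.
  docs <- lapply(strsplit(tolower(texts), "[^a-z0-9]+", perl = TRUE),
                 function(tok) tok[nzchar(tok)])
  vocab <- sort(unique(unlist(docs)))
  # Document frequency: the number of documents that contain each term.
  df <- table(factor(unlist(lapply(docs, unique)), levels = vocab))
  list(
    vocab = vocab,
    doc_freq = setNames(as.integer(df), vocab),
    n_docs = length(docs),
    avg_doc_len = mean(lengths(docs))
  )
}

corpus_stats <- fit_sketch(c("red apple", "green apple pie"))
corpus_stats$vocab        # "apple" "green" "pie" "red"
corpus_stats$doc_freq     # apple = 2, green = 1, pie = 1, red = 1
corpus_stats$avg_doc_len  # 2.5
```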


Method embed()

Embed texts to sparse vectors.

Usage

SparseEmbedder$embed(texts)

Arguments

texts

Character vector of texts

Returns

Sparse matrix of BM25 scores
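
To make "BM25 scores" concrete, the following base R sketch computes the standard BM25 weight, idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl)), for every term in every document and returns the non-zero entries as (document, term, score) triplets. The function name, the parameter values k1 = 1.5 and b = 0.75, the smoothed IDF variant, and the triplet layout are all assumptions for illustration; the page does not state which variant or matrix class `embed()` uses.

```r
# Hypothetical BM25 sketch; k1, b, and the IDF variant are assumed, not documented.
bm25_triplets <- function(texts, k1 = 1.5, b = 0.75) {
  docs <- lapply(strsplit(tolower(texts), "[^a-z0-9]+", perl = TRUE),
                 function(tok) tok[nzchar(tok)])
  vocab <- sort(unique(unlist(docs)))
  n_docs <- length(docs)
  doc_len <- lengths(docs)
  avg_len <- mean(doc_len)
  # Document frequency per vocabulary term, then a smoothed IDF that stays positive.
  doc_freq <- as.numeric(table(factor(unlist(lapply(docs, unique)), levels = vocab)))
  idf <- setNames(log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5)), vocab)

  rows <- lapply(seq_along(docs), function(i) {
    if (doc_len[i] == 0) return(NULL)  # an empty document contributes no entries
    tf <- table(docs[[i]])
    terms <- names(tf)
    tf <- as.numeric(tf)
    # Length normalisation: longer-than-average documents are penalised.
    norm <- k1 * (1 - b + b * doc_len[i] / avg_len)
    data.frame(doc = i, term = match(terms, vocab),
               score = unname(idf[terms]) * tf * (k1 + 1) / (tf + norm))
  })
  list(triplets = do.call(rbind, rows), vocab = vocab)
}

res <- bm25_triplets(c("red apple", "green apple pie"))
res$triplets
```

If the Matrix package is available, the triplets could be packed with `Matrix::sparseMatrix(i = res$triplets$doc, j = res$triplets$term, x = res$triplets$score)`, giving one row per text and one column per vocabulary term.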


Method query_terms()

Get term scores for a query.

Usage

SparseEmbedder$query_terms(query)

Arguments

query

Query text

Returns

Named vector of term scores
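
One common way to score query terms in a BM25 setting is to weight each term by its inverse document frequency and to drop terms the corpus has never seen. The sketch below does this in base R and returns a named numeric vector. `query_terms_sketch()` and the choice of IDF as the score are assumptions; the page does not specify how `query_terms()` computes its values.

```r
# Hypothetical sketch: IDF weights for the query terms found in a corpus.
query_terms_sketch <- function(query, corpus) {
  tokenize <- function(x) {
    tok <- strsplit(tolower(x), "[^a-z0-9]+", perl = TRUE)[[1]]
    tok[nzchar(tok)]
  }
  docs <- lapply(corpus, tokenize)
  terms <- unique(tokenize(query))
  # Number of documents containing each query term (names are the terms).
  doc_freq <- vapply(
    terms,
    function(term) sum(vapply(docs, function(d) term %in% d, logical(1))),
    numeric(1)
  )
  idf <- log(1 + (length(docs) - doc_freq + 0.5) / (doc_freq + 0.5))
  idf[doc_freq > 0]  # keep only terms that occur in the corpus
}

scores <- query_terms_sketch("apple pie recipe",
                             c("red apple", "green apple pie"))
names(scores)  # "apple" "pie"
```

Here "recipe" is dropped because it never occurs in the corpus, and "pie" receives a higher weight than "apple" because it appears in fewer documents.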


Method clone()

The objects of this class are cloneable with this method.

Usage

SparseEmbedder$clone(deep = FALSE)

Arguments

deep

Whether to make a deep clone.
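
Taken together, the methods above suggest a workflow of creating an embedder, fitting it on a corpus, embedding texts, and inspecting query terms. The sketch below uses only the signatures listed on this page. It assumes that the VectrixDB package is installed and exports `SparseEmbedder`; the sample texts are made up, and the code is skipped when the package is not available.

```r
# Made-up sample corpus for illustration.
texts <- c("red apple", "green apple pie", "fresh pear tart")

# Assumption: SparseEmbedder is exported by the VectrixDB package.
if (requireNamespace("VectrixDB", quietly = TRUE)) {
  embedder <- VectrixDB::SparseEmbedder$new(language = "en")
  embedder$fit(texts)                       # learn corpus statistics
  scores <- embedder$embed(texts)           # sparse matrix of BM25 scores
  print(scores)
  print(embedder$query_terms("apple pie"))  # named vector of term scores
}
```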


VectrixDB documentation built on Feb. 20, 2026, 5:09 p.m.