textstat_proxy: [Experimental] Compute document/feature proximity

View source: R/textstat_simil.R

textstat_proxy R Documentation


Description

This is the underlying function for textstat_dist() and textstat_simil(), but it returns a TsparseMatrix (a triplet-form sparse matrix from the Matrix package).

Usage

textstat_proxy(
  x,
  y = NULL,
  margin = c("documents", "features"),
  method = c("cosine", "correlation", "jaccard", "ejaccard", "dice", "edice", "hamann",
    "simple matching", "euclidean", "chisquared", "hamming", "kullback", "manhattan",
    "maximum", "canberra", "minkowski"),
  p = 2,
  min_proxy = NULL,
  rank = NULL,
  use_na = FALSE
)
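
A minimal sketch of a typical call, assuming the quanteda and quanteda.textstats packages are installed. The three toy documents and the object names (toks, dfmat, sim_docs, sim_feats) are made up for illustration.

```r
library(quanteda)
library(quanteda.textstats)

# Hypothetical toy corpus: three short documents
toks <- tokens(c(d1 = "a b c d", d2 = "a b e f", d3 = "a x y z"))
dfmat <- dfm(toks)

# Document-by-document cosine similarity
sim_docs <- textstat_proxy(dfmat, margin = "documents", method = "cosine")

# Feature-by-feature cosine similarity on the same dfm
sim_feats <- textstat_proxy(dfmat, margin = "features", method = "cosine")

# Both results are sparse matrices from the Matrix package
class(sim_docs)
dim(sim_docs)   # one row and one column per document
dim(sim_feats)  # one row and one column per feature
```

Worked by hand, d1 and d2 share two of their four tokens, so their cosine similarity is 2 / (2 * 2) = 0.5.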

Arguments

x

a dfm object.

y

an optional dfm object; if provided, proximity is computed between the documents or features of x and those of y.

margin

identifies the margin of the dfm on which similarity or distance will be computed: "documents" for documents or "features" for word/term features.

method

character; the similarity or distance measure to be used; one of the values listed under Usage.

p

the power (exponent) of the Minkowski distance; p = 2 corresponds to the Euclidean distance.

min_proxy

the minimum proximity value to be recorded.

rank

an integer specifying the number (n) of top-ranked proximity values to be recorded.

use_na

if TRUE, return NA for proximity to empty vectors. Note that use of NA makes the proximity matrices denser.
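
The min_proxy and rank arguments described above can be sketched as follows. This is an illustration only, assuming quanteda and quanteda.textstats are installed; the toy documents, the threshold 0.3, and the object names are hypothetical.

```r
library(quanteda)
library(quanteda.textstats)

# Hypothetical toy corpus: three short documents
dfmat <- dfm(tokens(c(d1 = "a b c d", d2 = "a b e f", d3 = "a x y z")))

# Baseline: no restriction on which proximity values are recorded
sim_all <- textstat_proxy(dfmat, method = "cosine")

# Ask for only proximity values of at least 0.3 to be recorded
sim_min <- textstat_proxy(dfmat, method = "cosine", min_proxy = 0.3)

# Ask for only the top-2 ranked proximity values to be recorded
sim_top <- textstat_proxy(dfmat, method = "cosine", rank = 2L)

# Compare how many entries each sparse matrix actually stores
Matrix::nnzero(sim_all)
Matrix::nnzero(sim_min)
Matrix::nnzero(sim_top)
```

Restricting the recorded values keeps the result sparse, which matters mainly when the dfm has many documents or features.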

See Also

textstat_dist(), textstat_simil()


quanteda.textstats documentation built on Nov. 2, 2023, 5:07 p.m.