textstat_proxy: [Experimental] Compute document/feature proximity


Description

This is the underlying function for textstat_dist and textstat_simil; unlike those functions, it returns the result as a TsparseMatrix.

Usage

textstat_proxy(x, y = NULL, margin = c("documents", "features"),
  method = c("cosine", "correlation", "jaccard", "ejaccard", "dice",
  "edice", "hamman", "simple matching", "euclidean", "chisquared",
  "hamming", "kullback", "manhattan", "maximum", "canberra", "minkowski"),
  p = 2, min_proxy = NULL, rank = NULL, use_na = FALSE)
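A minimal sketch of a basic call, computing cosine similarity between documents. The toy texts and document names (d1, d2, d3) are invented for illustration. It assumes that textstat_proxy is exported either by the installed quanteda version (as documented here) or, in later releases, by the separate quanteda.textstats package; the lookup at the top is only a convenience for that assumption.

```r
library(quanteda)
library(Matrix)

# Assumption: textstat_proxy() lives in quanteda (as on this page) or,
# in later releases, in quanteda.textstats. Pick whichever exports it.
textstat_proxy <- if ("textstat_proxy" %in% getNamespaceExports("quanteda")) {
  quanteda::textstat_proxy
} else {
  quanteda.textstats::textstat_proxy
}

# Toy corpus, invented for illustration.
txt <- c(d1 = "a b c d", d2 = "a b e f", d3 = "x y z")
x <- dfm(tokens(txt))

# Cosine similarity between documents (the defaults for margin and method).
prox <- textstat_proxy(x, margin = "documents", method = "cosine")

class(prox)      # a sparse matrix class from the Matrix package
dim(prox)        # one row and one column per document
as.matrix(prox)  # dense view, convenient for small inputs only
```

Here d1 and d2 share two of their four features, so their cosine similarity is 2 / (2 * 2) = 0.5, while d3 shares no features with the other documents and its off-diagonal entries are zero.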

Arguments

x

a dfm object

y

an optional dfm object; if provided, proximity between the documents or features in x and those in y is computed.

margin

identifies the margin of the dfm on which similarity or distance will be computed: "documents" for documents or "features" for word/term features.

method

character; the method identifying the similarity or distance measure to be used; see Details.

p

the power of the Minkowski distance.

min_proxy

the minimum proximity value to be recorded.

rank

an integer value n; only the top-n highest proximity values are recorded.

use_na

if TRUE, return NA for proximity to empty vectors. Note that use of NA makes the proximity matrices denser.
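The arguments above can be sketched together as follows. This is an illustration under assumptions, not a definitive reference: the toy texts are invented, the package lookup assumes textstat_proxy is exported by quanteda or by quanteda.textstats, and the comments on min_proxy and rank describe the behaviour as read from the argument descriptions above (values that are not recorded are simply absent from the sparse result and appear as zeros in a dense view).

```r
library(quanteda)
library(Matrix)

# Assumption: textstat_proxy() lives in quanteda (as on this page) or,
# in later releases, in quanteda.textstats. Pick whichever exports it.
textstat_proxy <- if ("textstat_proxy" %in% getNamespaceExports("quanteda")) {
  quanteda::textstat_proxy
} else {
  quanteda.textstats::textstat_proxy
}

# Toy corpus, invented for illustration.
txt <- c(d1 = "a b c d", d2 = "a b e f", d3 = "x y z")
x <- dfm(tokens(txt))

# y: proximity between documents of x (rows) and documents of y (columns).
# Row-subsetting a dfm keeps the same feature set in both objects.
prox_xy <- textstat_proxy(x[c(1, 3), ], x[2, ], method = "cosine")

# min_proxy: values below the threshold are not recorded.
prox_min <- textstat_proxy(x, method = "cosine", min_proxy = 0.6)

# rank: record only the top-n proximity values.
prox_top <- textstat_proxy(x, method = "cosine", rank = 1)

# p: exponent used when method = "minkowski".
prox_mink <- textstat_proxy(x, method = "minkowski", p = 3)

as.matrix(prox_xy)
as.matrix(prox_min)
```

With min_proxy = 0.6, the d1/d2 similarity of 0.5 falls below the threshold and is dropped, while the diagonal values of 1 are kept. Thresholding with min_proxy or rank keeps the result sparse, which matters when x has many documents or features.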

See Also

textstat_dist, textstat_simil


quanteda/quanteda documentation built on June 15, 2019, 8:36 a.m.