distances: Pairwise Distance Matrix Computation In text2vec: Modern Text Mining Framework for R

Description

dist2 calculates pairwise distances/similarities between the rows of two data matrices. Note that some methods work only on sparse matrices and others work only on dense matrices.

pdist2 calculates "parallel" distances between the rows of two data matrices.

Usage

dist2(x, y = NULL, method = c("cosine", "euclidean", "jaccard"),
  norm = c("l2", "l1", "none"))

pdist2(x, y, method = c("cosine", "euclidean", "jaccard"),
  norm = c("l2", "l1", "none"))

Arguments

x: first matrix.

y: second matrix. For dist2, y = NULL by default, which means y = x is assumed and distances/similarities are calculated between all rows of x.

method: usually a character string or an instance of the text2vec_distance class. The distance/similarity measure to be used: one of c("cosine", "euclidean", "jaccard") or RWMD. RWMD works only on bag-of-words matrices. For "cosine" distance, the maximum distance is 1 - (-1) = 2.

norm: character, one of c("l2", "l1", "none"). How to scale the input matrices. If they are already scaled, use "none".
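The bound of 2 for "cosine" distance follows from cosine similarity ranging over [-1, 1]. The base R sketch below illustrates this with L2 row normalization; l2_normalize() and cosine_dist() are hypothetical helpers written for this illustration, not text2vec functions, and they are not claimed to mirror the package's internals.

```r
# Hypothetical helper: scale every row of m to unit L2 length.
l2_normalize <- function(m) m / sqrt(rowSums(m^2))

# Hypothetical helper: cosine distance = 1 - cosine similarity
# between every row of x and every row of y.
cosine_dist <- function(x, y = x) {
  1 - tcrossprod(l2_normalize(x), l2_normalize(y))
}

x <- rbind(c(1, 0),   # row 1
           c(0, 1),   # row 2, orthogonal to row 1
           c(-1, 0))  # row 3, opposite to row 1

d <- cosine_dist(x)
# d[1, 1] is 0 (same direction), d[1, 2] is 1 (orthogonal),
# d[1, 3] is 2 (opposite direction, the maximum)
print(d)
```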

Details

Computes the distance matrix using the specified method. Similar to the dist function, but works with two matrices.

pdist2 takes two matrices and returns a single vector giving the "parallel" distances between their corresponding rows.
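To make the difference between pairwise and "parallel" distances concrete, here is a minimal base R sketch using Euclidean distance. pairwise_euclid() and parallel_euclid() are hypothetical helpers for illustration only; they are assumptions about what the two modes mean, not the package's implementation.

```r
# Hypothetical helper: entry [i, j] is the distance between
# row i of x and row j of y (all combinations).
pairwise_euclid <- function(x, y) {
  sq <- outer(rowSums(x^2), rowSums(y^2), "+") - 2 * tcrossprod(x, y)
  sqrt(pmax(sq, 0))  # clamp tiny negative values caused by rounding
}

# Hypothetical helper: element i is the distance between
# row i of x and row i of y (matching rows only).
parallel_euclid <- function(x, y) {
  stopifnot(identical(dim(x), dim(y)))
  sqrt(rowSums((x - y)^2))
}

x <- rbind(c(0, 0), c(1, 1), c(2, 0))
y <- rbind(c(3, 4), c(1, 1), c(2, 2))

pw <- pairwise_euclid(x, y)  # 3 x 3 matrix
pl <- parallel_euclid(x, y)  # vector of length 3: 5, 0, 2
```

In this sketch the parallel distances are the diagonal of the pairwise matrix, computed without building the full matrix.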

Value

dist2 returns a matrix of distances/similarities between each row of matrix x and each row of matrix y.

pdist2 returns a vector of "parallel" distances between the rows of x and y.
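A short usage sketch of the return shapes described above, assuming the text2vec package is installed and that dense numeric matrices are accepted by the "cosine" method. The input data is random and purely illustrative; the calls use only the arguments shown in Usage.

```r
# Runs only if text2vec is available; otherwise the block is skipped.
if (requireNamespace("text2vec", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(runif(12), nrow = 4)  # 4 rows, 3 columns
  y <- matrix(runif(6), nrow = 2)   # 2 rows, 3 columns

  # All row combinations: per the Value section, nrow(x) by nrow(y).
  d <- text2vec::dist2(x, y, method = "cosine", norm = "l2")
  print(dim(d))

  # Parallel distances need matrices of equal shape; here x is
  # compared with its own rows in reverse order.
  p <- text2vec::pdist2(x, x[4:1, ], method = "cosine", norm = "l2")
  print(length(p))
}
```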

text2vec documentation built on March 26, 2020, 7:48 p.m.