pair_distances: Similarity and distance computation between documents or...

View source: R/pair_distances.R

pair_distances    R Documentation

Similarity and distance computation between documents or features

Description

Computes pairwise distances between documents from a text representation in which each row is a document and each column is a feature. Distances are measured between treatment and control units, as defined by the treatment indicator Z.

Usage

pair_distances(
  dat,
  Z,
  include = c("cosine", "euclidean", "mahalanobis"),
  form = "data.frame",
  verbose = FALSE
)

Arguments

dat

a matrix representation of the text, with one row per document in the corpus and columns containing summary measures of the text (e.g., word counts or topic proportions). Acceptable forms include a valid quanteda dfm object, a tm document-term matrix, a matrix of estimated topic proportions, or a vector of estimated propensity scores.

Z

a vector of treatment indicators

include

Which distances to calculate. The default is c("cosine", "euclidean", "mahalanobis").

form

Should the distances be returned as a list of matrices or condensed into a single data frame? The default is "data.frame".

Value

Pairwise distances for all potential matches of treatment and control units under each requested distance metric, returned in the format selected by form.
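
Examples

The following base R sketch illustrates the kind of treated-versus-control distance matrices described above. It does not call pair_distances and is not necessarily how the package computes its results. The toy data, the helper names (euclid, cosine, maha), the definition of cosine distance as one minus cosine similarity, and the use of the covariance of all rows for the Mahalanobis distance are assumptions made for illustration.

```r
# Toy document-feature matrix: 4 documents (rows) by 3 features (columns).
dat <- matrix(
  c(2, 0, 1,
    0, 3, 1,
    1, 1, 0,
    4, 0, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), c("f1", "f2", "f3"))
)
Z <- c(1, 0, 0, 1)  # assumed coding: 1 = treated, 0 = control

treated <- dat[Z == 1, , drop = FALSE]
control <- dat[Z == 0, , drop = FALSE]

# Euclidean distance between every treated row and every control row.
euclid <- function(a, b) {
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * a %*% t(b)
  sqrt(pmax(sq, 0))  # guard against tiny negative values from rounding
}

# Cosine distance, taken here as 1 minus cosine similarity (an assumption).
cosine <- function(a, b) {
  sim <- (a %*% t(b)) / outer(sqrt(rowSums(a^2)), sqrt(rowSums(b^2)))
  1 - sim
}

# Mahalanobis distance with respect to a covariance matrix S.
maha <- function(a, b, S) {
  Sinv <- solve(S)
  sq <- outer(rowSums((a %*% Sinv) * a), rowSums((b %*% Sinv) * b), "+") -
    2 * a %*% Sinv %*% t(b)
  sqrt(pmax(sq, 0))
}

# One matrix per metric: rows are treated documents, columns are controls.
dists <- list(
  cosine      = cosine(treated, control),
  euclidean   = euclid(treated, control),
  mahalanobis = maha(treated, control, cov(dat))  # covariance of all rows (assumption)
)
dists
```

Each element of dists has one row per treated document and one column per control document, so every entry corresponds to one potential treated-control match.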


reaganmozer/textmatch documentation built on March 7, 2024, 2:41 p.m.