multi_dice_coefficient_matching: Multiple N-Gram Length Dice Coefficient Document Matching

View source: R/multi_dice_coefficient_matching.R

Description

Calculates Dice coefficients between two documents, one for each of several N-Gram lengths.

Usage

multi_dice_coefficient_matching(document_1, document_2, ngram_sizes = c(1:50),
  remove_duplicates = TRUE)

Arguments

document_1

A vector of strings (one per line or one per sentence), or a list of vectors of tokens (one per line or one per sentence).

document_2

Same format as document_1. This is the document that document_1 is compared against.

ngram_sizes

A numeric vector of N-Gram lengths for use in calculating Dice coefficients.

remove_duplicates

Logical indicating whether duplicate N-Grams should be removed before matching. Defaults to TRUE.

Value

A data.frame with Dice coefficients based on different N-Gram lengths.
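
To illustrate the quantity involved, the following is a minimal base-R sketch of a set-based Dice coefficient, 2 * |A ∩ B| / (|A| + |B|), computed over word N-Grams at several lengths. It is not the SpeedReader implementation. The helpers word_ngrams and dice_coefficient, the whitespace tokenization, the lowercasing, and the column names of the result are all assumptions made for illustration, and the sketch covers only the case where duplicate N-Grams are removed.

```r
# Hypothetical helper (not part of SpeedReader): build word N-Grams of
# length n within each line of a character vector, then pool them.
word_ngrams <- function(lines, n) {
  unlist(lapply(strsplit(tolower(lines), "\\s+"), function(tokens) {
    tokens <- tokens[nzchar(tokens)]            # drop empty tokens
    if (length(tokens) < n) return(character(0))
    starts <- seq_len(length(tokens) - n + 1)
    vapply(starts,
           function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
           character(1))
  }))
}

# Hypothetical helper: set-based Dice coefficient of two N-Gram vectors.
# Duplicates are removed first (assumed to mirror remove_duplicates = TRUE).
dice_coefficient <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  if (length(a) + length(b) == 0) return(NA_real_)
  2 * length(intersect(a, b)) / (length(a) + length(b))
}

doc_1 <- c("the quick brown fox", "jumps over the lazy dog")
doc_2 <- c("the quick brown cat", "jumps over the lazy dog")

sizes <- 1:3
result <- data.frame(
  ngram_size = sizes,
  dice = vapply(sizes, function(n) {
    dice_coefficient(word_ngrams(doc_1, n), word_ngrams(doc_2, n))
  }, numeric(1))
)
print(result)
```

In this sketch the coefficient falls as the N-Gram length grows, because a single differing word affects a larger share of the longer N-Grams.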


matthewjdenny/SpeedReader documentation built on March 25, 2020, 5:32 p.m.