ropensci/textreuse: Detect Text Reuse and Document Similarity

Tools for measuring similarity among documents and detecting passages that have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality-sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.

Package details
Maintainer
License	MIT
Version	0.1.5
URL	https://docs.ropensci.org/textreuse (website) https://github.com/ropensci/textreuse
Installation	Install the latest version of this package by entering the following in R: `install.packages("remotes")`, then `remotes::install_github("ropensci/textreuse")`
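
Once the package is installed, a first comparison might look like the sketch below. It is an illustration rather than an excerpt from the package documentation: the two sample sentences and the variable names are made up, and it assumes the `tokenize_ngrams()` and `jaccard_similarity()` functions exported by textreuse, with `n = 3` chosen arbitrarily as the shingle width.

```r
library(textreuse)

# Two short, invented example texts that share most of their wording.
text_a <- "The quick brown fox jumps over the lazy dog"
text_b <- "The quick brown fox leaps over the lazy dog"

# Break each text into shingled word trigrams (n = 3 is an arbitrary choice).
tokens_a <- tokenize_ngrams(text_a, n = 3)
tokens_b <- tokenize_ngrams(text_b, n = 3)

# Jaccard similarity: shared n-grams divided by all distinct n-grams.
score <- jaccard_similarity(tokens_a, tokens_b)
print(score)
```

A score near 1 suggests heavy overlap between the two token sets, and a score near 0 suggests little shared wording. For larger corpora, the minhash and locality-sensitive hashing functions mentioned in the description are the intended way to avoid comparing every pair directly.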