textreuse: Detect Text Reuse and Document Similarity
Version 0.1.4

Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
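To make the description concrete, here is a minimal sketch of two of the ideas it names: shingled word n-grams and Jaccard similarity between the resulting token sets. It is a standalone Python illustration, not the package's R code. The function names `shingled_ngrams` and `jaccard_similarity`, the word-splitting rule, and the sample sentences are assumptions chosen for this example.

```python
import re


def shingled_ngrams(text, n=3):
    """Return overlapping word n-grams (shingles), lowercased.

    Assumption: words are runs of word characters; the real package
    may tokenize differently.
    """
    words = re.findall(r"\w+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def jaccard_similarity(a, b):
    """Jaccard similarity of two token collections: |A & B| / |A | B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Two empty documents: define the similarity as 0.0 here.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical sample documents that differ by a single word.
doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

score = jaccard_similarity(shingled_ngrams(doc_a), shingled_ngrams(doc_b))
print(round(score, 3))  # 4 shared trigrams out of 10 distinct -> 0.4
```

A single changed word disturbs every shingle that overlaps it, which is why a one-word difference lowers the score to 0.4 here. Shingle sets are sensitive to local edits but still register shared passages.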

Package details

Author: Lincoln Mullen [aut, cre]
Date of publication: 2016-11-28 16:54:10
Maintainer: Lincoln Mullen <lincoln@lincolnmullen.com>
License: MIT + file LICENSE
URL: https://github.com/ropensci/textreuse
Package repository: CRAN
Installation: Install the latest version of this package from CRAN by entering install.packages("textreuse") in R.

textreuse documentation built on May 30, 2017, 3:32 a.m.