Tools for measuring similarity among documents and detecting passages that have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity and dissimilarity functions; pairwise comparisons; minhash and locality-sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
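To make two of the ideas above concrete, here is a minimal base R sketch of shingled n-gram tokenization followed by a Jaccard similarity score. The helper names (`words`, `shingle_ngrams`, `jaccard`) are hypothetical and written only for illustration; they are not the package's own functions, which may tokenize and score differently.

```r
# Hypothetical helpers for illustration only; not the package's implementation.

# Split text into lowercase word tokens, dropping empty strings.
words <- function(text) {
  tokens <- strsplit(tolower(text), "[^a-z0-9']+")[[1]]
  tokens[nzchar(tokens)]
}

# Shingled n-grams: every run of n consecutive words, so neighbouring
# n-grams overlap by n - 1 words.
shingle_ngrams <- function(text, n = 3) {
  w <- words(text)
  if (length(w) < n) return(character(0))
  starts <- seq_len(length(w) - n + 1)
  vapply(starts,
         function(i) paste(w[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Jaccard similarity: size of the intersection divided by size of the union.
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  all_items <- union(a, b)
  if (length(all_items) == 0) return(0)
  length(intersect(a, b)) / length(all_items)
}

doc_a <- "the quick brown fox jumps over the lazy dog"
doc_b <- "the quick brown fox leaps over the lazy dog"

ngrams_a <- shingle_ngrams(doc_a, n = 3)
ngrams_b <- shingle_ngrams(doc_b, n = 3)

score <- jaccard(ngrams_a, ngrams_b)
print(score)
```

The two sentences differ by a single word, yet that one change disturbs three of the seven trigrams in each document. This sensitivity to local edits is why overlapping n-grams are a common basis for detecting reused passages.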
| Package details | |
|---|---|
| Maintainer | |
| License | MIT |
| Version | 0.1.5 |
| URL | https://docs.ropensci.org/textreuse (website) https://github.com/ropensci/textreuse (GitHub) |

Installation: install the latest version of this package by entering the following in R:
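One way to do this, assuming the package is installed from the GitHub repository listed above with the `remotes` package (an assumption; this page does not show the exact command), is sketched below.

```r
# Assumption: installing from the GitHub repository named in the URL field.
# remotes is a helper package for installing R packages from GitHub.
install.packages("remotes")
remotes::install_github("ropensci/textreuse")

# Load the package to confirm that the installation succeeded.
library(textreuse)
```

Both calls need network access, and `install_github` builds the package from source, so the result may be newer than the version shown in the table.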