Compares one PDF document against another to find similar sentences. Similarity between the two documents is measured with cosine similarity.
compare(x, source, cos_sim = 0.6)
x | File name/path of the PDF.
source | File name/path of the source to compare against document x. The source must be a PDF.
cos_sim | Cosine similarity threshold. The output contains the sentences whose cosine similarity is greater than or equal to 'cos_sim'. The default is 0.6.
A tibble that contains the measured cosine similarity, the matching sentence from document x, and the location of each match: the page number and the sentence number in both documents.
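To make the role of the 'cos_sim' threshold more concrete, here is a minimal base-R sketch of cosine similarity between two sentences. It is only an illustration and not the package's internal code: the helper name `cosine_similarity`, the bag-of-words representation, and the simple tokenisation are all assumptions made for this example.

```r
# Illustrative only: bag-of-words cosine similarity between two sentences.
# This is NOT antiplugr's implementation; the tokenisation is an assumption.
cosine_similarity <- function(a, b) {
  # Lowercase, then split on anything that is not a letter or a digit
  tokens_a <- strsplit(tolower(a), "[^a-z0-9]+")[[1]]
  tokens_b <- strsplit(tolower(b), "[^a-z0-9]+")[[1]]
  tokens_a <- tokens_a[tokens_a != ""]
  tokens_b <- tokens_b[tokens_b != ""]

  # Count each word over the shared vocabulary of both sentences
  vocab <- union(tokens_a, tokens_b)
  va <- as.numeric(table(factor(tokens_a, levels = vocab)))
  vb <- as.numeric(table(factor(tokens_b, levels = vocab)))

  # Cosine similarity: dot product divided by the product of the norms
  denom <- sqrt(sum(va^2)) * sqrt(sum(vb^2))
  if (denom == 0) return(0)
  sum(va * vb) / denom
}

s1 <- "Hansel and Gretel found a house made of bread"
s2 <- "Hansel and Gretel found a small house made of cake"

sim <- cosine_similarity(s1, s2)  # about 0.84
keep <- sim >= 0.6                # TRUE: the pair passes the default threshold
```

Raising the threshold (for example to 0.9) would drop this pair, which is the trade-off 'cos_sim' controls: higher values report only near-identical sentences, lower values also report looser paraphrases.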
# PDF from Book Reports (slightly modified),
# URL: https://www.bookreports.info/hansel-and-gretel-summary/
file1 <- system.file('pdf', 'summary_hansel_and_gretel.pdf', package = 'antiplugr')
# PDF from Short Story America,
# URL: http://www.shortstoryamerica.com/pdf_classics/grimm_hanse_and_gretel.pdf
file2 <- system.file('pdf', 'grimm_hanse_and_gretel.pdf', package = 'antiplugr')
compare(file1, file2)