search_for is used to search for similar or exact sentences in a PDF.
search_for(x, sen, exact = FALSE, cos_sim = 0.5)
x: File name or path of the PDF.
sen: Sentence to search for in the text.
exact: Whether to search for the exact sentence. The default is FALSE, in which case cosine similarity is used as the similarity measure.
cos_sim: Threshold for the cosine similarity. The output contains the sentences whose cosine similarity is greater than or equal to 'cos_sim'. The default is 0.5.
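To give a feel for what the 'cos_sim' threshold means, here is a minimal sketch in base R that computes the cosine similarity of two sentences from their word counts. It is an illustration only: the helper cosine_similarity and its simple tokenization are assumptions made for this example, not the package's own code, and search_for may preprocess or weight the text differently, so the values it reports can differ.

```r
# Hypothetical helper (not part of the package): cosine similarity of two
# sentences, computed from plain word counts.
cosine_similarity <- function(a, b) {
  # Lowercase, then split on anything that is not a letter, digit or apostrophe
  tokenize <- function(s) {
    words <- strsplit(tolower(s), "[^a-z0-9']+")[[1]]
    words[nzchar(words)]
  }
  words_a <- tokenize(a)
  words_b <- tokenize(b)
  vocab <- union(words_a, words_b)
  # Word-count vectors over the shared vocabulary
  count_a <- as.numeric(table(factor(words_a, levels = vocab)))
  count_b <- as.numeric(table(factor(words_b, levels = vocab)))
  # Dot product divided by the product of the vector lengths
  sum(count_a * count_b) / (sqrt(sum(count_a^2)) * sqrt(sum(count_b^2)))
}

sen_a <- "When four weeks had passed and Hansel was still thin, impatience overcame her, and she would wait no longer."
sen_b <- "When four weeks had passed and Hansel was still thin, the witch got tired."

sim <- cosine_similarity(sen_a, sen_b)
print(sim)
# Under this simplified measure, would the pair pass the default threshold?
print(sim >= 0.5)
```

Identical sentences score 1 and sentences with no words in common score 0, so raising 'cos_sim' toward 1 keeps only near-identical matches, while lowering it admits looser paraphrases.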
A tibble that contains the measured cosine similarity and the location of each match, given as the page number and the sentence number.
# PDF from Book Reports,
# URL: https://www.bookreports.info/hansel-and-gretel-summary/
file <- system.file('pdf', 'summary_hansel_and_gretel.pdf', package = 'antiplugr')
# a similar sentence from 'grimm_hanse_and_gretel.pdf' from Short Story America,
# URL: http://www.shortstoryamerica.com/pdf_classics/grimm_hanse_and_gretel.pdf
sen_1 <- "When four weeks had passed and Hansel was still thin, impatience overcame her, and she would wait no longer."
# an exact sentence
sen_2 <- "When four weeks had passed and Hansel was still thin, the witch got tired."
search_for(file, sen_1)
search_for(file, sen_2, exact = TRUE)