| rt_read_pdf | R Documentation |
Takes the path to a PDF file and returns its text content as a single character string. The text is extracted with the poppler 'pdftotext' utility, invoked through a standard system call; this is the same extractor the original 'oddpub' package relied on. Different extractors format text differently, and the detectors in this package were tuned to the layout 'pdftotext' produces. To analyze the result with the plain-text detectors, write it to a '.txt' file first (see Examples).
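To make the "standard system call" concrete, here is a minimal sketch of how 'pdftotext' might be invoked from R. The helper name `pdftotext_sketch`, the `-enc UTF-8` flag, and the strict lower-case '.pdf' check are assumptions for illustration; the flags and checks that `rt_read_pdf` itself uses are not stated on this page.

```r
# Hypothetical helper, NOT the package's implementation: one way to call
# poppler's 'pdftotext' from R and collect the output as a single string.
pdftotext_sketch <- function(filepath) {
  # Mirror the documented requirement that the path ends in '.pdf'.
  if (!is.character(filepath) || length(filepath) != 1L ||
      !grepl("\\.pdf$", filepath)) {
    stop("`filepath` must be a single string ending in '.pdf'.")
  }
  if (!file.exists(filepath)) {
    stop("File not found: ", filepath)
  }
  if (!nzchar(Sys.which("pdftotext"))) {
    stop("The 'pdftotext' utility was not found on the PATH.")
  }
  # '-enc UTF-8' sets the output encoding; '-' as the output file sends
  # the text to stdout. Which flags rt_read_pdf() passes is an assumption.
  out <- system2(
    "pdftotext",
    args = c("-enc", "UTF-8", shQuote(filepath), "-"),
    stdout = TRUE
  )
  # system2() returns one element per output line; join them into one string.
  paste(out, collapse = "\n")
}
```

The sketch fails early with a readable message when the path is malformed, the file is missing, or 'pdftotext' is not installed, which is usually easier to diagnose than a non-zero exit status from the system call.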
rt_read_pdf(filepath)
filepath | The path to the PDF file as a string (must end in '.pdf'). |
A character string with the extracted text.
## Not run:
# Path to a PDF file.
pdf_path <- system.file(
"extdata", "PMID32171256-PMC7071725.pdf", package = "rtransparency"
)
# Extract the text, write it to a TXT file, then run the detectors.
article_txt <- rt_read_pdf(pdf_path)
writeLines(article_txt, "article.txt")
rt_coi("article.txt")
## End(Not run)
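The same extract-then-write pattern can be applied to a whole folder of PDFs. The sketch below is an illustration only: `pdfs_to_txt` and its `reader` argument are hypothetical names, `reader` is assumed to behave like `rt_read_pdf` (one path in, one character string out), and `out_dir` is assumed to exist already.

```r
# Hypothetical batch helper: extract every '.pdf' in `pdf_dir` and write one
# '.txt' file per PDF into `out_dir`. Returns the paths of the written files.
pdfs_to_txt <- function(pdf_dir, out_dir = tempdir(),
                        reader = rtransparency::rt_read_pdf) {
  # Only files ending in '.pdf' are picked up, matching the documented
  # requirement on `filepath`.
  pdf_paths <- list.files(pdf_dir, pattern = "\\.pdf$", full.names = TRUE)
  # Swap the extension so 'paper.pdf' becomes 'paper.txt'.
  txt_paths <- file.path(out_dir, sub("\\.pdf$", ".txt", basename(pdf_paths)))
  for (i in seq_along(pdf_paths)) {
    writeLines(reader(pdf_paths[i]), txt_paths[i])
  }
  txt_paths
}
```

Each returned path can then be passed to a plain-text detector such as `rt_coi()`, as in the single-file example above. Passing the reader as an argument keeps the helper easy to try out with a stub function when 'pdftotext' is not installed.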