Extract text from one or more PDF documents into a tm Corpus or VCorpus.
ft_extract_corpus(paths, which = "xpdf", ...)
paths: Path to one or more PDF files.
which: Extraction engine to use, one of "gs" (Ghostscript) or "xpdf". Defaults to "xpdf".
...: Further arguments passed on to the readerControl parameter.
A tm Corpus (or, in a later version, a VCorpus).
## Not run:
path <- system.file("examples", "example1.pdf", package = "fulltext")
(res <- ft_extract_corpus(path, "xpdf"))
(res_gs <- ft_extract_corpus(path, "gs"))
## End(Not run)
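For readers curious how the `which` and `...` arguments might map onto tm, here is a minimal sketch. It is not the fulltext implementation. The helper names `engine_for()` and `pdfs_to_corpus()` are hypothetical. The sketch assumes the tm package is installed, that the chosen command-line tool (xpdf's pdftotext or Ghostscript) is on the PATH, and that `...` is forwarded into the `readerControl` list.

```r
# Hypothetical helper (not part of fulltext): translate the short engine
# names used by `which` into the engine names that tm::readPDF() accepts.
engine_for <- function(which = c("xpdf", "gs")) {
  which <- match.arg(which)
  switch(which, xpdf = "xpdf", gs = "ghostscript")
}

# Hypothetical sketch: build a tm corpus from one or more PDF paths.
# Assumption: extra arguments in `...` (for example language = "en")
# are forwarded as entries of the readerControl list.
pdfs_to_corpus <- function(paths, which = "xpdf", ...) {
  if (!requireNamespace("tm", quietly = TRUE)) {
    stop("package 'tm' is required", call. = FALSE)
  }
  paths <- path.expand(paths)
  not_found <- paths[!file.exists(paths)]
  if (length(not_found) > 0) {
    stop("file(s) not found: ", paste(not_found, collapse = ", "), call. = FALSE)
  }
  # readPDF() returns a reader function bound to the chosen engine.
  reader <- tm::readPDF(engine = engine_for(which))
  # mode = "" tells the source not to read the PDF as text itself;
  # the reader does the extraction from the URI.
  tm::VCorpus(
    tm::URISource(paths, mode = ""),
    readerControl = list(reader = reader, ...)
  )
}
```

With a real file, `pdfs_to_corpus(path, "gs", language = "en")` should return a VCorpus with one document per path, which can then be examined with `tm::inspect()`.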