convPDFs: Convert PDF to Text Files


Description

Convert PDFs to text files using Xpdf. The function calls the Xpdf executables to perform the conversion and relies on doParallel to process the PDFs in parallel. Currently, only Windows is supported.

Usage

convPDFs(mode = 1L, silent = FALSE, converter = c("pdftotext",
  "pdftohtml"), converter.path = NULL, use.parallel = TRUE)

Arguments

mode

Integer. Default 1L.

1L: as is (-raw)

2L: with layout (-layout)

3L: no breaks (-nopgbrk)

4L: without format ()

5L: as table (-table)

6L: simplified (-simple)

silent

Logical. If FALSE, the results of the conversion are shown. Default FALSE.

converter

Which executable program to use to extract the PDF: "pdftotext" or "pdftohtml". Default "pdftotext".

converter.path

Path to the pdftotext.exe or pdftohtml.exe executable program. If NULL, make sure the xpdf folder is under R.home() or the parent directory of R.home(). E.g., for a 32-bit (x86) Windows PC, paste0(R.home(), "/xpdfbin-win-4.00/bin32/pdftotext.exe").

use.parallel

Logical, whether to apply parallel computation when there are more than 50 files to process. Default TRUE.
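
The mode values and the converter.path example above can be sketched in a few lines of R. This is only an illustration: the helper names mode_flag and xpdf_candidate are hypothetical and not part of the package, the folder name is copied from the 32-bit example in the converter.path description, and it is assumed that mode 4L passes no extra option to Xpdf.

```r
# Hypothetical helpers for illustration; none of these names come from the package.

# Option associated with each documented mode (assumption: mode 4L adds no option).
mode_flag <- function(mode = 1L) {
  flags <- c("-raw", "-layout", "-nopgbrk", "", "-table", "-simple")
  mode <- as.integer(mode)
  stopifnot(length(mode) == 1L, !is.na(mode),
            mode >= 1L, mode <= length(flags))
  flags[mode]
}

# Candidate location of the executable under R.home(), following the
# 32-bit Windows example given for converter.path.
xpdf_candidate <- function(converter = "pdftotext") {
  file.path(R.home(), "xpdfbin-win-4.00", "bin32", paste0(converter, ".exe"))
}

path <- xpdf_candidate("pdftotext")
if (!file.exists(path)) {
  # Nothing was found at the guessed location, so the path must be supplied by hand.
  message("Xpdf not found at ", path, "; pass converter.path explicitly.")
}

mode_flag(2L)  # "-layout"
```

Checking the path with file.exists() before calling convPDFs makes it easier to tell a missing Xpdf installation apart from a conversion problem.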

Value

The full names of the PDFs, if silent = FALSE.

Author(s)

Yiying Wang, wangy@aetna.com

Examples

## Not run: 
convPDFs()

## End(Not run)
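
A call with explicit arguments might look like the sketch below. It uses only the arguments documented above, and it assumes that the package is installed under the name aseshms and exports convPDFs; the guard skips the call on non-Windows systems or when the package is missing.

```r
# Arguments taken from the Usage section; the values are chosen for illustration.
conv_args <- list(
  mode = 2L,                 # keep the layout (-layout)
  silent = FALSE,            # show the results of the conversion
  converter = "pdftotext",
  converter.path = NULL,     # rely on the xpdf folder under R.home()
  use.parallel = FALSE       # convert sequentially
)

# Assumption: the package is named 'aseshms' and exports convPDFs.
if (.Platform$OS.type == "windows" &&
    requireNamespace("aseshms", quietly = TRUE)) {
  do.call(aseshms::convPDFs, conv_args)
}
```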

madlogos/aseshms documentation built on May 21, 2019, 11:03 a.m.