knitr::opts_chunk$set(cache = FALSE, warning = FALSE, error = FALSE, fig.path = "", eval=FALSE)
pdftables package allows the user to convert PDF tables to formats more
amenable to analysis (csv, XLM, or XLSX) by wrapping the
The package can be installed from either CRAN or Github (development version):
# From CRAN install.packages("pdftables") # From Github library(devtools) install_github("expersso/pdftables") library(pdftables)
To use the package the user first needs to sign up to the PDFTables API to get an API token (they offer a free package that allows up to 50 pages).
In the following example we first write the
iris dataset to a
.csv file. We
then open that file and print it as a
function we then upload that PDF to the PDFTables API
which parses and returns the converted file as
(Note: All functions in the package require that you provide your api key in the
api_key argument. By default this looks for an environment variable called
pdftable_api, but you can also provide it directly.)
write.csv(head(iris, 20), file = "test.csv", row.names = FALSE) # Open test.csv and print as PDF to "test.pdf" convert_pdf("test.pdf", "test2.csv") # Converted test.pdf to test2.csv
output_file argument is omitted, the name of the output file will be
the same as the input file, but with the right file extension.
The package (and API) supports converting PDFs to
Note that the conversion sometimes fails to recover the underlying data exactly, so you may have to open the retrieved file and make some manual corrections.
get_remaining function shows you how many pages you have remaining.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.