Reads a PDF and converts its content to a Journal Article Tag Suite (JATS) XML file.
cermine(path, outputs, exts, override, timeout, configuration)
path | path to a directory containing PDF files.
outputs | (optional) list of extraction output(s); possible values: "jats" (document metadata and content in NLM JATS format), "text" (raw document text), "zones" (text zones with their labels), "trueviz" (geometric structure in TrueViz format), "images" (images from the document); default: "jats,images".
exts | (optional) a comma-separated list of extensions of the resulting files; the list must have the same length as the outputs list; default: "cermxml,images".
override | (optional) Boolean indicating whether to overwrite previously created files. Default: FALSE.
timeout | (optional) approximate maximum allowed processing time for a PDF file, in seconds; by default, no timeout is used. The value is approximate because in some cases the program might slightly exceed this time, say by a second or two.
configuration | (optional) path to a configuration properties file; see https://github.com/CeON/CERMINE for a description of the available configuration properties.
A vector containing the file reference to the JATS XML file.
Jason Mumbulla, jasonmumbulla@gmail.com
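A minimal usage sketch, based only on the usage line and argument descriptions above. The helper check_outputs_exts() is hypothetical (not part of the package), the directory name and the "txt" extension are assumptions, and passing outputs and exts as comma-separated strings is inferred from the documented defaults. The call is guarded so the snippet runs even when cermine() is not loaded.

```r
# Hypothetical helper (not part of the package): returns TRUE when the
# comma-separated outputs and exts strings have the same number of entries,
# as the description of exts requires.
check_outputs_exts <- function(outputs = "jats,images", exts = "cermxml,images") {
  n_out <- length(strsplit(outputs, ",", fixed = TRUE)[[1]])
  n_ext <- length(strsplit(exts, ",", fixed = TRUE)[[1]])
  n_out == n_ext
}

pdf_dir <- file.path(tempdir(), "pdfs")  # assumed directory containing PDF files
outputs <- "jats,text"                   # two of the documented output values
exts    <- "cermxml,txt"                 # "txt" is an assumed extension

# Only call cermine() if the lists match and the function is available.
if (check_outputs_exts(outputs, exts) && exists("cermine", mode = "function")) {
  # Argument order follows the usage line; the timeout of 120 seconds is arbitrary.
  jats_files <- cermine(pdf_dir, outputs, exts, override = FALSE, timeout = 120)
  print(jats_files)
}
```

Checking the two lists before the call is a design choice of this sketch: it surfaces a mismatch early instead of relying on whatever error the extraction step might raise.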