tesseractFuns | R Documentation |
These functions provide ways to both set and query the state of the tesseract object.
ReadConfigFile(api, files, debug = FALSE, ok = FALSE)
SetImage(api, pix)
SetInputName(api, name, check = TRUE, load = TRUE)
GetInputName(api)
GetImage(api, asArray = FALSE)
SetPageSegMode(api, mode)
GetPageSegMode(api)
SetRectangle(api, ..., dims = sapply(list(...), as.integer))
SetSourceResolution(api, ppi)
GetSourceYResolution(api)
SetOutputName(api, filename)
ProcessPages(filename, api = tesseract(), timeout = 0L, out = tempfile())
api |
the instance of the |
pix |
an object of class |
asArray |
a logical value. If |
ppi |
the resolution of the image in pixels per inch (ppi), as an integer. |
dims |
a vector of length 4 giving the location of the rectangle
as x1, y1, width, height. This should NOT be the coordinates
of the top-left and bottom-right of the rectangle, i.e. |
... |
the left, top, width and height |
files |
a character vector specifying the full or relative paths to the configuration files. |
name |
the name of the file being processed by the OCR system. |
ok |
in |
mode |
the value for the page segmentation mode for the tesseract
instance. This must correspond to one of the values in
|
check |
a logical value. If TRUE, check that the file actually exists. |
load |
a logical value. If TRUE, load the image from the named file and set it as the current image in the tesseract object. |
filename, out |
the name of the file.
This is the name of the image file to process, or the output
file to which the results of the OCR will be directed, if that
occurs. The latter is rarely of interest as we can
get this information directly from the |
timeout |
this is almost always 0, and so is usually not specified. It controls how long any particular step in the processing should be allowed to take before the entire process is terminated. |
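Because dims expects left, top, width and height rather than two corners, a small conversion step can help when a region is known by its corner coordinates. The following is a minimal sketch: cornersToDims is a hypothetical helper, not part of the package, and it assumes the width and height are the simple differences between the corner coordinates.

```r
# Hypothetical helper (not provided by the package): convert the
# top-left (x1, y1) and bottom-right (x2, y2) corners of a rectangle
# into the left, top, width, height form described for dims.
# Assumption: width and height are plain coordinate differences.
cornersToDims = function(x1, y1, x2, y2) {
    c(left   = as.integer(min(x1, x2)),
      top    = as.integer(min(y1, y2)),
      width  = as.integer(abs(x2 - x1)),
      height = as.integer(abs(y2 - y1)))
}

dims = cornersToDims(10, 20, 310, 120)

# With an existing tesseract instance named api, the result could then
# be passed along using the dims argument shown in the usage:
#   SetRectangle(api, dims = dims)
```

Passing the four values individually, as in SetRectangle(api, 10, 20, 300, 100), should be equivalent, since the default for dims is built from the ... arguments.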
Duncan Temple Lang
tesseract
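The set and query functions are typically used together. The sketch below illustrates one possible sequence using only the signatures listed in the usage above. setAndQuery is a hypothetical wrapper, the file name in the commented call is a placeholder, the resolution and rectangle values are arbitrary, and the package name Rtesseract is an assumption. It needs the package to be attached and an image file to exist before it can be called.

```r
# Hypothetical wrapper (not part of the package): set some state on a
# tesseract instance and read it back with the query functions.
# Assumes the package providing these functions is already attached.
setAndQuery = function(api, imageFile, ppi = 300L) {
    # check = TRUE and load = TRUE are the defaults, so the file is
    # checked for existence and loaded as the current image.
    SetInputName(api, imageFile)

    # Resolution is set after the image has been loaded.
    SetSourceResolution(api, ppi)

    # Restrict processing to a region given as left, top, width, height.
    SetRectangle(api, 10, 20, 300, 100)

    # Query the state back from the instance.
    list(inputName = GetInputName(api),
         segMode   = GetPageSegMode(api),
         yRes      = GetSourceYResolution(api))
}

# Possible use, with "page.png" standing in for a real image file:
#   library(Rtesseract)   # assumed package name
#   api = tesseract()
#   setAndQuery(api, "page.png")
```

Keeping the instance as an explicit argument makes it possible to reuse one api object across several images instead of creating a new one each time.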