lapply: Method for Iterating over a ResultIterator

lapply    R Documentation

Description

This method loops over the results in the OCR result iterator and applies a given function to each one. The function can query, for example, the confidence level for the predicted term (word, symbol, etc.), the rectangular bounding box for the term, or the possible alternatives for the term.

Usage

lapply(X, FUN, ...)  # level = "word"

Arguments

X

an object of class TesseractBaseAPI-class obtained via a call to tesseract.

FUN

the function to be applied to each successive result in the ResultIterator. This can also be the address of a compatible C routine, which can be significantly faster. The C routine is called with the ResultIterator object and the level at which the OCR is working (word, symbol, etc.), and it must return an R object, i.e. a SEXP. The C routine can be specified by name, in which case it is resolved by a call to getNativeSymbolInfo.

level

either a number or a character vector that maps to a PageIteratorLevel value. This is an enumerated constant that maps the concept of a term (word, symbol, etc.) to the number used in the C++ code.
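
As a rough illustration of the mapping described above, the sketch below resolves a level given either as a name or as a number. The helper resolve_level and the lookup vector page_iterator_levels are hypothetical and are not part of Rtesseract. The numeric values follow Tesseract's C++ PageIteratorLevel enum (RIL_BLOCK = 0 through RIL_SYMBOL = 4), while the lower-case names are an assumption about how the R side spells them.

```r
# Hypothetical lookup table, for illustration only (not part of Rtesseract).
# Values mirror Tesseract's PageIteratorLevel enum; the names are assumed.
page_iterator_levels <- c(block = 0L, para = 1L, textline = 2L,
                          word = 3L, symbol = 4L)

# Hypothetical helper: turn a single level name or number into the
# integer that the C++ code would expect.
resolve_level <- function(level) {
  stopifnot(length(level) == 1)
  if (is.character(level)) {
    idx <- match(tolower(level), names(page_iterator_levels))
    if (is.na(idx)) stop("unknown level: ", level)
    return(page_iterator_levels[[idx]])
  }
  level <- as.integer(level)
  if (!level %in% page_iterator_levels) stop("level out of range: ", level)
  level
}

resolve_level("word")    # 3
resolve_level(4)         # 4, i.e. the symbol level
```

Under these assumptions, passing "word" or 3 as the level would select the same iteration granularity.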

...

additional arguments

Value

A list with an element corresponding to each result in the OCR iterator.
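
Because the result is an ordinary R list, it can be post-processed with the usual tools. The sketch below uses a stand-in list with placeholder numbers in place of real OCR output, and it assumes that FUN returned a single numeric confidence for each result. Neither the values nor that shape are guaranteed by this documentation.

```r
# Stand-in for the value of lapply(api, FUN, "word"); the numbers are
# placeholders, not real OCR output.
conf <- list(91.2, 88.7, 64.3)

# Flatten the list to a numeric vector, checking that each element
# really is a single number.
conf_vec <- vapply(conf, as.numeric, numeric(1))

# Positions of results whose confidence falls below an arbitrary cutoff.
low <- which(conf_vec < 70)
```

If FUN returns richer objects (for example, a set of alternatives per symbol), the list is best kept as is and inspected element by element.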

Author(s)

Duncan Temple Lang

References

Tesseract https://code.google.com/p/tesseract-ocr/

See Also

tesseract

Examples


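 library(Rtesseract)

 # Locate a sample image shipped with the package and run OCR on it
 f = system.file("images", "OCRSample2.png", package = "Rtesseract")
 api = tesseract(f)
 Recognize(api)

 # Collect the alternatives for each result, iterating at the symbol level
 alt = lapply(api, GetAlternatives, "symbol")

 # Use a compiled C routine, passed by address, at the word level
 sym = getNativeSymbolInfo("r_getConfidence")
 conf = lapply(api, sym$address, "word")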

duncantl/Rtesseract documentation built on March 25, 2022, 5:50 a.m.