This package provides a flexible Optical Character Recognition (OCR) facility built on the tesseract C++ library, allowing us to read text from images. It also lets us analyze the recognition results and any possible errors: if we know the true text, we can do data analysis on the errors and explore how to improve the recognition. The package also provides some functionality from the leptonica library for image processing. This allows us, for example, to detect lines in an image, which is important for interpreting tables.
| Package details | |
|---|---|
| Author | Duncan Temple Lang, Matt Espe |
| Maintainer | Duncan Temple Lang <duncan@r-project.org> |
| License | Apache License |
| Version | 0.6-0 |
| Package repository | View on GitHub |
| Installation | Install the latest version of this package by entering the following in R: |