hocr_from_images: Runs tesseract-OCR on the given image files.

View source: R/ocr-utils.R

hocr_from_images    R Documentation

Runs tesseract-OCR on the given image files.

Description

This function performs a system call to the tesseract command line tool, running tesseract-OCR on the given images.
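
As a rough illustration of what such a system call can look like, the sketch below builds the arguments for a single tesseract invocation that writes hOCR output. It is not the package's actual implementation: `build_tesseract_args` is a hypothetical helper, the file names are placeholders, and the command is only executed if tesseract is on the PATH and the image exists.

```r
# Hypothetical helper, not part of TessTools: builds the argument vector for
# one tesseract call of the form
#   tesseract <image> <output base> [options] hocr
build_tesseract_args <- function(imgfile, outputdir, options = "") {
  # tesseract appends the ".hocr" extension to the output base itself
  outbase <- file.path(outputdir, tools::file_path_sans_ext(basename(imgfile)))
  args <- c(shQuote(imgfile), shQuote(outbase))
  # extra command line options go before the trailing config name
  if (nzchar(options)) args <- c(args, options)
  c(args, "hocr")  # the "hocr" config requests hOCR output
}

args <- build_tesseract_args("scans/page1.png", "out", options = "--psm 1")
cat("tesseract", args, "\n")

# Run the command only when tesseract is installed and the placeholder image exists.
if (nzchar(Sys.which("tesseract")) && file.exists("scans/page1.png")) {
  system2("tesseract", args)
}
```

Building the argument vector separately from the call makes it easy to inspect the exact command before running it.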

Usage

hocr_from_images(imgfiles, outputdir=here(), silent=FALSE, options="")

Arguments

imgfiles

list of image file paths. The image format must be supported by tesseract as configured on the user's system. JPEG and PNG files are generally compatible.

outputdir

directory in which to store the output hOCR files.

silent

whether or not to suppress messages (default is FALSE).

options

additional options to pass to the tesseract command line tool. For example, options="--psm 1" will use page segmentation mode 1.

Value

List of output hocr file paths.
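
Examples

The following is a minimal usage sketch based only on the signature shown under Usage. It assumes that the TessTools package and the tesseract command line tool are installed; the `scans` directory and the temporary output directory are placeholders chosen for illustration.

```r
# Collect placeholder input images; "scans" is a hypothetical directory.
imgfiles <- list.files("scans", pattern = "\\.(png|jpe?g)$",
                       full.names = TRUE, ignore.case = TRUE)

# Write the hOCR files to a scratch directory rather than the project root.
outdir <- file.path(tempdir(), "hocr")
dir.create(outdir, showWarnings = FALSE)

# Call the function only if the package, the tesseract CLI, and input images
# are all available.
if (requireNamespace("TessTools", quietly = TRUE) &&
    nzchar(Sys.which("tesseract")) &&
    length(imgfiles) > 0) {
  hocr_files <- TessTools::hocr_from_images(imgfiles,
                                            outputdir = outdir,
                                            silent = TRUE,
                                            options = "--psm 1")
  print(hocr_files)
}
```

Passing an explicit outputdir keeps the generated files out of the default location returned by here().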


OlivierBinette/TessTools documentation built on March 13, 2024, 7:33 p.m.