hocr_from_zip: Run tesseract OCR on a zipped set of images.

View source: R/ocr-utils.R

hocr_from_zip    R Documentation

Run tesseract OCR on a zipped set of images.

Description

This function unzips the images and then makes a system call to the tesseract command-line tool, running Tesseract OCR on the extracted images.
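
The unzip-then-call pattern described above can be sketched roughly as follows. This is an illustrative sketch only, not the TessTools source: the name sketch_hocr_from_zip is hypothetical, the silent argument is left out, and the sketch assumes that tesseract is on the PATH and that it writes <outputbase>.hocr when given the hocr config (the behaviour of recent tesseract releases).

```r
# Hypothetical sketch of the unzip + system-call pattern; not the package code.
sketch_hocr_from_zip <- function(zipped, outputdir = ".", exdir = NULL, options = "") {
  # Extract into a temporary folder when no extraction directory is given.
  if (is.null(exdir)) exdir <- tempfile("images_")
  dir.create(exdir, recursive = TRUE, showWarnings = FALSE)
  dir.create(outputdir, recursive = TRUE, showWarnings = FALSE)

  # utils::unzip() returns the paths of the extracted entries.
  images <- utils::unzip(zipped, exdir = exdir)
  images <- images[!dir.exists(images)]  # drop directory entries, keep files

  vapply(images, function(img) {
    outbase <- file.path(outputdir, tools::file_path_sans_ext(basename(img)))
    # Command shape: tesseract <image> <outputbase> [options] hocr
    args <- c(shQuote(img), shQuote(outbase))
    if (nzchar(options)) args <- c(args, options)
    system2("tesseract", c(args, "hocr"))
    # Assumption: the hocr config produces <outputbase>.hocr
    paste0(outbase, ".hocr")
  }, character(1), USE.NAMES = FALSE)
}
```

Unlike the documented function, which returns a list, this sketch returns a character vector of the expected output paths.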

Usage

hocr_from_zip(
  zipped,
  outputdir = ".",
  exdir = NULL,
  silent = FALSE,
  options = ""
)

Arguments

zipped

path to zip file containing images.

outputdir

directory in which to store the output hOCR files.

exdir

directory where the images are extracted to. Set to NULL to extract the images to a temporary folder.

silent

whether or not to suppress messages (default is FALSE).

options

additional options to pass to the tesseract command-line tool. E.g. options="--psm 1" will use page segmentation mode 1.

Value

List of output hocr file paths.
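
Examples

A minimal usage sketch, based only on the signature documented above. The file name scans.zip and the output folder hocr_out are hypothetical placeholders, and the call assumes that the TessTools package is installed, that hocr_from_zip is exported from it, and that the tesseract command-line tool is available. The guard lets the snippet run without error when those assumptions do not hold.

```r
# Hypothetical input and output locations; replace with real paths.
zip_path <- "scans.zip"
out_dir <- "hocr_out"

if (requireNamespace("TessTools", quietly = TRUE) && file.exists(zip_path)) {
  dir.create(out_dir, showWarnings = FALSE)

  hocr_files <- TessTools::hocr_from_zip(
    zip_path,
    outputdir = out_dir,   # where the hOCR files are written
    exdir = NULL,          # extract the images to a temporary folder
    silent = TRUE,         # suppress messages
    options = "--psm 1"    # passed through to the tesseract command line
  )

  # The documented return value is a list of output file paths.
  print(hocr_files)
}
```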


OlivierBinette/TessTools documentation built on March 13, 2024, 7:33 p.m.