cleanup_bw: cleaning an image prior to Tesseract scan

cleanup_bw    R Documentation

cleaning an image prior to Tesseract scan

Description

A Tesseract scan often gives better results when the image is transformed first. The cleanup_bw function converts the image to black-and-white and can optionally apply some other transformations (see Details).

Usage

cleanup_bw(img, cln_options = list())

Arguments

img

An image object returned by image_read() or image_graph()

cln_options

A list of options for the magick transformation functions. Default: list(). See Details for the transformations that can be applied.

Value

An image object with the applied transformations

Details

The following list describes the transformations in the order in which they are applied. For each one it also indicates whether it is always applied or only when requested in the cln_options list:

  • magick::image_trim() removes edges that are the background color from the image. Only when trim= is in the list.

  • magick::image_resize() resizes the image (the magick documentation describes it as resizing "using custom filterType"). Only when resize= is in the list (e.g. resize="4000x").

  • magick::image_modulate() adjusts the brightness, saturation and hue of the image relative to their current values. This transformation is always applied, with the value 100 for each of the brightness, saturation and hue arguments unless overridden by an equally named entry in the cln_options list (e.g. brightness=120).

  • magick::image_contrast() enhances intensity differences in the image. Only when sharpen= is in the list (e.g. sharpen=1).

  • magick::image_quantize() reduces the number of unique colors in the image. Always applied, with argument colorspace='gray'.

  • magick::image_transparent() and magick::image_background() are always applied for the colors white and black, to make the image black-and-white.

  • magick::image_enhance() tries to minimize noise. Done only when enhance=TRUE in the list.
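
The rules above can be summarised in a small base R sketch. The helper plan_cleanup_steps() is hypothetical and not part of HOQCutil or magick. It calls no magick functions and only reports which steps the list above implies for a given cln_options, together with the effective image_modulate() arguments. How cleanup_bw implements this internally is not shown here, so treat it as an illustration of the documented behaviour only.

```r
# Hypothetical helper (an assumption for illustration, not part of HOQCutil):
# returns the magick steps implied by the Details list, in order, plus the
# brightness/saturation/hue values that image_modulate() would receive.
plan_cleanup_steps <- function(cln_options = list()) {
  # TRUE when an entry with exactly this name is present in the list
  has <- function(name) name %in% names(cln_options)

  # Defaults of 100, overridden by equally named entries in cln_options
  modulate_keys <- c("brightness", "saturation", "hue")
  modulate_args <- modifyList(
    list(brightness = 100, saturation = 100, hue = 100),
    cln_options[names(cln_options) %in% modulate_keys]
  )

  # Optional steps contribute NULL (and so vanish) when not requested
  steps <- c(
    if (has("trim")) "image_trim",
    if (has("resize")) "image_resize",
    "image_modulate",
    if (has("sharpen")) "image_contrast",
    "image_quantize",
    "image_transparent",
    "image_background",
    if (isTRUE(cln_options[["enhance"]])) "image_enhance"
  )

  list(steps = steps, modulate_args = modulate_args)
}

# With an empty list only the "always" steps remain
plan <- plan_cleanup_steps(list(resize = "4000x", trim = 10, brightness = 120))
plan$steps
plan$modulate_args
```

Note that the order of the entries in cln_options does not matter in this sketch: the sequence is fixed by the list above, and the options only switch steps on or supply their arguments.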

See Also

scanner_functions, scan_with_hocr() and extract_table()

Examples

## Not run: 
cln_options1 = list(resize="4000x",
                  trim=10,enhance=TRUE,sharpen=1)
# img1 is an image object, e.g. as returned by magick::image_read()
img2 = cleanup_bw(img1,cln_options1)

## End(Not run)

HanOostdijk/HOQCutil documentation built on July 28, 2023, 5:56 p.m.