scanner_functions | R Documentation
cleanup_bw(), scan_with_hocr() and extract_table() can be used to clean up and scan (OCR) an image and extract a table from it as a data.frame. The workflow is then:
read an image from file with magick::image_read(), e.g.
img1 = magick::image_read('example1.png')
define the list with cleanup options, e.g.
cln_options1 = list(resize = "4000x", trim = 10, enhance = TRUE, sharpen = 1)
apply the cleanup_bw() function to the image with this list, e.g.
img2 = cleanup_bw(img1, cln_options1)
scan (OCR) the cleaned image with scan_with_hocr(), e.g.
df1 = scan_with_hocr(img2, add_header_cols = FALSE)
indicate in the columns of df1 which fields belong to the table headers (or alternatively define a headers list)
extract the table with the extract_table() function, e.g.
df2 = extract_table(df1, headers = NULL, lastline = Inf, desc_above = TRUE)
or alternatively, when a headers list has been defined,
df2 = extract_table(df1, headers = headers, lastline = Inf, desc_above = TRUE)
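The steps above can be combined in one helper for the case where a headers list is supplied, so that no manual marking of df1 is needed. This is a minimal sketch, not part of the package: the names scan_table_from_file and default_cln_options are hypothetical, the argument values are copied from the examples above, and the structure of the headers list is left to the caller. It assumes the package providing cleanup_bw(), scan_with_hocr() and extract_table() is attached and that magick is installed.

```r
# Hypothetical defaults, copied from the cleanup options example above.
default_cln_options <- list(resize = "4000x", trim = 10, enhance = TRUE, sharpen = 1)

# Hypothetical helper: read, clean up, scan and extract in one call.
# `headers` must be a headers list as accepted by extract_table();
# its structure is not defined here.
scan_table_from_file <- function(path, headers, cln_options = list()) {
  # Merge caller overrides into the defaults (base R utils::modifyList).
  opts <- utils::modifyList(default_cln_options, cln_options)
  img <- magick::image_read(path)                        # read the image from file
  img <- cleanup_bw(img, opts)                           # clean up with the options list
  df1 <- scan_with_hocr(img, add_header_cols = FALSE)    # OCR the cleaned image
  # Extract the table using the supplied headers list.
  extract_table(df1, headers = headers, lastline = Inf, desc_above = TRUE)
}
```

A call would then look like df2 = scan_table_from_file('example1.png', headers = headers, cln_options = list(resize = "2000x")). Keeping the options as a list with overrides makes it easy to retry the scan with a different resize or trim value when the OCR result is poor.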