get_ids_by_type() and get_versions_by_type().
list_processor_types(), create_processor(), enable_processor(), disable_processor(), and delete_processor().
get_text() and get_tables(), which replace text_from_dai_response(), text_from_dai_file(), tables_from_dai_response() and tables_from_dai_file().
get_entities() and draw_entities().
dai_tab_sync() and dai_tab_async() following Google's discontinuation of the v1beta2 endpoint on 31 January 2024.
draw*() functions for better consistency with other functions.
.R files and regrouped the functions.
cli package.
make_hocr to convert DAI output to hOCR files, thereby facilitating the creation of searchable PDFs.
build_token_df() and build_block_df() functions so they can take as input response objects from dai_sync() in addition to json files from dai_async().
build_token_df() and build_block_df() functions to include confidence scores in the dataframe, so as to enable filtering on confidence.
draw_* functions so they work with response objects (from dai_sync() and dai_sync_tab()) as well as with json files from dai_async_tab().
draw_* functions to allow customizing color and thickness of lines around bounding boxes.
tables_from_dai_response() so that it handles response objects from dai_sync() with form parser processors.
get_processors(), get_processor_info(), and get_processor_versions().
proc_v to dai_sync() and dai_async(), allowing for specification of processor version.
dai_notify() and merge_shards()
text_from_dai_response() and text_from_dai_file() to allow saving the output straight to a text file.
dai_status() that caused an error when processing responses from the v1beta2 endpoint (dai_tab_async()).
draw_* functions to allow custom output filenames.
draw_blocks_new() to inspect block bounding boxes after reprocessing.
draw_blocks(), draw_paragraphs(), draw_lines(), and draw_tokens() functions. These functions no longer require supplying a pdf file, as they get the images from a base64-encoded string in the json file.
dai_sync()/dai_async() functions now access the new v1 endpoint, which has foreign language support. However, the v1 endpoint currently does not support table extraction, so the old processing functions (which access the v1beta2 endpoint) are kept under the new names dai_sync_tab() and dai_async_tab(). I expect this to be a temporary solution until DAI's capabilities are consolidated in a single endpoint, at which stage the *_tab() functions will be phased out.