merge_shards: Merge shards

View source: R/inspect_output.R

merge_shards    R Documentation

Merge shards

Description

Merges text files from Document AI output shards into a single text file corresponding to the parent document.

Usage

merge_shards(source_dir, dest_dir)

Arguments

source_dir

path to the folder containing the shard .txt files to be merged

dest_dir

path to the folder where the merged .txt files are written

Details

The function works on .txt files generated from .json output files, not on the .json files themselves. It also assumes that each .txt file has the same name stem as the .json file from which it was extracted. For the v1 API, this means files ending in "-0.txt", "-1.txt", "-2.txt", and so forth. For the v1beta2 API, it means files ending in "-page-1-to-100.txt", "-page-101-to-200.txt", and so on. The safest approach is to generate the .txt files with text_from_dai_file(), setting the save_to_file parameter to TRUE.
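
To make the naming convention concrete, here is a minimal base R sketch of how shard filenames can be mapped back to a parent stem and concatenated in order. It is not daiR's implementation: the helpers shard_stem() and shard_index(), the example filenames, and the choice to name the merged file "<stem>.txt" are all assumptions made for illustration.

```r
# Hypothetical helper (not part of daiR): strip the shard suffix to recover
# the parent document's name stem. Handles both naming schemes described above.
shard_stem <- function(filenames) {
  stems <- sub("-page-[0-9]+-to-[0-9]+\\.txt$", "", filenames)  # v1beta2 style
  sub("-[0-9]+\\.txt$", "", stems)                              # v1 style
}

# Hypothetical helper (not part of daiR): extract the number that orders the
# shards, i.e. the shard number (v1) or the first page of the range (v1beta2).
shard_index <- function(filenames) {
  as.integer(sub("^.*?-(?:page-)?([0-9]+)(?:-to-[0-9]+)?\\.txt$", "\\1",
                 filenames, perl = TRUE))
}

# Set up throwaway input and output folders with made-up shard files.
source_dir <- file.path(tempdir(), "shards_in")
dest_dir <- file.path(tempdir(), "merged_out")
unlink(c(source_dir, dest_dir), recursive = TRUE)
dir.create(source_dir)
dir.create(dest_dir)
writeLines("first part", file.path(source_dir, "report-0.txt"))
writeLines("second part", file.path(source_dir, "report-2.txt"))
writeLines("last part", file.path(source_dir, "report-10.txt"))

# Group shards by stem, sort each group numerically, and write one file per
# parent document. The output name "<stem>.txt" is an assumption.
files <- list.files(source_dir, pattern = "\\.txt$")
for (stem in unique(shard_stem(files))) {
  shards <- files[shard_stem(files) == stem]
  shards <- shards[order(shard_index(shards))]
  text <- unlist(lapply(file.path(source_dir, shards), readLines))
  writeLines(text, file.path(dest_dir, paste0(stem, ".txt")))
}
```

The numeric sort matters because a plain alphabetical listing would place "report-10.txt" before "report-2.txt". The sketch also shows why the filename stems must match the original .json names: any file whose suffix does not follow one of the two patterns cannot be assigned to a parent document.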

Value

no return value; called for its side effect of writing the merged .txt files to dest_dir

Examples

## Not run: 
merge_shards(source_dir = getwd(), dest_dir = tempdir())

## End(Not run)

daiR documentation built on Sept. 8, 2023, 5:43 p.m.