jstor_ocr: Take metadata and ocr text from DfR and merge it into xml...

View source: R/jstor_ocr.r

jstor_ocr    R Documentation

Take metadata and ocr text from DfR and merge it into xml files.

Description

Take metadata and ocr text from DfR and merge it into xml files.

Usage

jstor_ocr(a1 = input_path, b1 = output_path)

Arguments

a1

The path to the unzipped JSTOR folder.

b1

The path to a folder that will contain the output files.

Value

Produces XML files that combine the metadata with the machine-readable OCR text, along with a quality-control message indicating how closely the indexed file names match. Note: because text tagging in files originating from JSTOR is inconsistent, the function may throw errors during processing. At the time of release, this code originated from a problem encountered in an education research practice setting.

Examples

jstor_ocr(a1 = "path/to/d4r/files", b1 = "destination/folder/path")
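Because processing may stop with an error on inconsistently tagged files, it can help to check the folders first and wrap the call in tryCatch(). The following is a minimal sketch using base R only. It assumes that the package is installed under the name r7283 and that jstor_ocr is exported from it; prepare_paths is a hypothetical helper written for this illustration and is not part of the package, and both paths are placeholders.

```r
# Placeholder paths; replace with your own locations.
input_path  <- "path/to/d4r/files"
output_path <- "destination/folder/path"

# Hypothetical helper (not part of the package): stop early if the input
# folder is missing, and create the output folder if it does not exist yet.
prepare_paths <- function(input_path, output_path) {
  if (!dir.exists(input_path)) {
    stop("Input folder not found: ", input_path)
  }
  if (!dir.exists(output_path)) {
    dir.create(output_path, recursive = TRUE)
  }
  invisible(TRUE)
}

# Run only when the input folder exists and the package is available.
# Assumption: the package name is r7283 and jstor_ocr is exported.
if (dir.exists(input_path) && requireNamespace("r7283", quietly = TRUE)) {
  prepare_paths(input_path, output_path)

  # Catch errors so that a failure is reported as a message
  # instead of aborting the calling script.
  tryCatch(
    r7283::jstor_ocr(a1 = input_path, b1 = output_path),
    error = function(e) {
      message("jstor_ocr() stopped with an error: ", conditionMessage(e))
      NULL
    }
  )

  # List whatever XML files are present in the output folder afterwards.
  print(list.files(output_path, pattern = "\\.xml$", ignore.case = TRUE))
}
```

With the placeholder paths left unchanged the guarded block is simply skipped, so the sketch can be sourced safely before real paths are filled in.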

cownr10r/r7283 documentation built on Sept. 29, 2022, 9:39 a.m.