ocrquality

An R package for measuring OCR quality

Author: Lincoln Mullen
License: MIT
Status: In development

Description

Measuring OCR quality rigorously is probably more effort than it is worth, if it can be done properly at all. But sometimes you have a corpus, perhaps one for which you have done the OCR yourself, and need to check that its texts are of roughly the same quality. That is what this package is for. It provides a few quick-and-dirty methods for estimating OCR quality. These estimates do not rely on any ground truth, so they are not an absolute measure of the quality of the texts. They do, however, provide a relative measure within the corpus, so that you can detect texts that are significantly worse than the others.
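
To make the idea of a relative, ground-truth-free estimate concrete, here is a minimal sketch in base R. It does not use this package's functions and is not necessarily how the package computes its estimates. The helper dictionary_score(), the toy word list, the three sample texts, and the 0.25 cutoff are all assumptions made up for illustration. The sketch scores each text by the share of its tokens found in a word list, then flags texts that score well below the corpus median.

```r
# Hypothetical helper (not part of ocrquality): the share of alphabetic
# tokens in `text` that appear in a reference `wordlist`.
dictionary_score <- function(text, wordlist) {
  tokens <- unlist(strsplit(tolower(text), "[^a-z]+"))
  tokens <- tokens[nzchar(tokens)]          # drop empty strings from splitting
  if (length(tokens) == 0) return(NA_real_) # no tokens, no score
  mean(tokens %in% wordlist)
}

# Toy word list and corpus, invented for this example only.
wordlist <- c("the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog")
corpus <- c(
  clean = "The quick brown fox jumps over the lazy dog",
  fair  = "The quick brovvn fox jumps over the lazy dog",
  poor  = "Tlie qnick hrown f0x jnmps ovcr tbe lazy dog"
)

# One score per text; names are carried over from the corpus vector.
scores <- vapply(corpus, dictionary_score, numeric(1), wordlist = wordlist)

# Relative check: flag texts far below the corpus median.
# The 0.25 margin is an arbitrary assumption, not a recommended value.
flagged <- names(scores)[scores < median(scores) - 0.25]

print(round(scores, 2))
print(flagged)
```

A score like this says nothing about absolute accuracy, since a small or mismatched word list lowers every text's score. It is only useful for comparing texts within the same corpus, which is the kind of comparison described above.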



lmullen/ocrquality documentation built on May 21, 2019, 7:35 a.m.