compareWord: Compares OCR words to truth
In duncantl/Rtesseract: Interface to the tesseract OCR system


Description

If the true text is available, compareWords and related functions compare the OCR results to the truth and determine which symbols were recognized incorrectly. The results can then be displayed on the image and the incorrect symbols identified.

compareWords processes a collection of words; compareWordInfo processes a single word and its corresponding true value.

Usage

compareWords(ocr, truth)
compareWordInfo(ocr, truth)

Arguments

`ocr`	the words from the OCR classification
`truth`	the true words

Value

compareWords returns a data frame with one row for each symbol/character that differs between the OCR version and the truth. The data frame contains the following columns:

`ocr`	the character incorrectly recognized by the OCR system
`truth`	the true value of the character
`position`	the index in the word of the misclassified character/symbol
`wordIndex`	the index of the word in which the misclassification occurred
`trueWord`	the value of the true word
`ocrWord`	the value of the word as recognized by the OCR system
`symbolIndex`	the index of the character/symbol in the entire set of symbols
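
To make the shape of this data frame concrete, the following is a minimal base R sketch of a character-by-character comparison that produces the columns listed above. It is not the Rtesseract implementation: the function name compare_words_sketch is hypothetical, and the sketch assumes that symbolIndex counts characters across all OCR words in order, with no separators between words.

```r
# Hypothetical illustration only; not the code used by Rtesseract.
compare_words_sketch <- function(ocr, truth) {
  # This sketch requires paired words of identical length.
  stopifnot(length(ocr) == length(truth),
            all(nchar(ocr) == nchar(truth)))

  # Split every word into its individual characters.
  ocr_chars <- strsplit(ocr, "")
  true_chars <- strsplit(truth, "")

  # Number of symbols preceding each word (assumption: symbols are
  # counted across the words in order, without separators).
  offsets <- cumsum(c(0L, head(nchar(ocr), -1L)))

  rows <- lapply(seq_along(ocr), function(i) {
    # Positions within word i where the OCR character differs.
    bad <- which(ocr_chars[[i]] != true_chars[[i]])
    data.frame(ocr = ocr_chars[[i]][bad],
               truth = true_chars[[i]][bad],
               position = bad,
               wordIndex = rep(i, length(bad)),
               trueWord = rep(truth[i], length(bad)),
               ocrWord = rep(ocr[i], length(bad)),
               symbolIndex = offsets[i] + bad,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

mismatches <- compare_words_sketch(c("Duncin", "Temple", "Lung"),
                                   c("Duncan", "Temple", "Lang"))
print(mismatches)
```

For these inputs the sketch reports two mismatches: "i" for "a" at position 5 of the first word, and "u" for "a" at position 2 of the third word. The columns returned by the package function may be typed or ordered differently.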

Note

This function does not yet handle the case where the OCR and true words do not have the same length.
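
Given this limitation, one cautious approach is to check word lengths before comparing. The sketch below uses only base R. The variable names are illustrative, and the input deliberately includes one OCR word of the wrong length.

```r
# Illustrative inputs: the third OCR word is one character short.
ocr <- c("Duncin", "Temple", "Lng")
truth <- c("Duncan", "Temple", "Lang")

# Pairs whose lengths agree can be compared character by character.
same_length <- nchar(ocr) == nchar(truth)

comparable <- data.frame(ocr = ocr[same_length],
                         truth = truth[same_length],
                         stringsAsFactors = FALSE)

# Indices (in the original vectors) of the pairs set aside.
skipped <- which(!same_length)

print(comparable)
print(skipped)
```

If only the comparable pairs are passed on, any word index reported afterwards refers to the filtered vectors rather than to the original ones, so the skipped indices are worth keeping alongside the result.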

Author(s)

Duncan Temple Lang

References

Tesseract https://code.google.com/p/tesseract-ocr/

Examples

compareWords(c("Duncin", "Temple", "Lung"), c("Duncan", "Temple", "Lang"))

duncantl/Rtesseract documentation built on Sept. 8, 2024, 8:38 a.m.
