View source: R/ocr_dictionary.R
This function checks the quality of an OCR text against a dictionary. It
returns a number between 0 and 1, which is the ratio of words
found in the dictionary to the total number of words in the document. The
higher the number, the better the quality of the OCR. These measures should
not be taken in an absolute sense. That is, a score of 1 does not indicate
perfect OCR. They should only be used to determine the relative quality of
OCR within a corpus of texts. You can pass a character vector of any length.
So, if you split a text into chunks, you can evaluate the OCR quality of each
chunk.
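The ratio described above can be illustrated with a minimal base R sketch. This is not the package's implementation: `dictionary_ratio()` and `toy_dictionary` are hypothetical names introduced here, the word list is a toy dictionary, and the tokenization (lowercasing and splitting on non-letters) is an assumption.

```r
# Hypothetical stand-in for a dictionary-based OCR check; NOT the package's code.
# The word list is a toy dictionary chosen only for this example.
toy_dictionary <- c("four", "score", "and", "seven", "years", "ago",
                    "our", "fathers")

dictionary_ratio <- function(text, dictionary) {
  vapply(text, function(doc) {
    # Assumed tokenization: lowercase, then split on anything that is not
    # a letter or an apostrophe.
    words <- strsplit(tolower(doc), "[^a-z']+")[[1]]
    words <- words[nzchar(words)]
    if (length(words) == 0) return(NA_real_)
    # Share of words that appear in the dictionary.
    mean(words %in% dictionary)
  }, numeric(1), USE.NAMES = FALSE)
}

# One score per element of the character vector.
chunks <- c("Fourr score and sleven years ago", "our fathers")
scores <- dictionary_ratio(chunks, toy_dictionary)
print(scores)
```

In this sketch the first chunk scores 4/6 because "fourr" and "sleven" are not in the toy dictionary, while the second scores 1. The second score shows why a value of 1 only means every word was recognized, not that the OCR is free of errors.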
ocr_dictionary(text, sample_size = -1L)
text          A character vector.
sample_size   If this value is positive, then this many words from the
A vector of numeric values between 0 and 1.
paragraph <- "Fourr score and sleven years ago our fathers brought
forth on this continent, a new nation, conceived in Liberty,
and dedicated to tlhe proposition that all men are created equal."
ocr_dictionary(paragraph)
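Because the function accepts a character vector of any length, a text can be split into chunks and scored chunk by chunk. The following is a rough sketch: the sentence-splitting rule is an assumption made for illustration, and the call to `ocr_dictionary()` is guarded so the code still runs when the package is not attached.

```r
text <- "Fourr score and sleven years ago. Our fathers brought forth a new nation."

# Assumed chunking rule: split on whitespace that follows ., ! or ?
chunks <- strsplit(text, "(?<=[.!?])\\s+", perl = TRUE)[[1]]

# ocr_dictionary() comes from the package documented here; skip the call
# if the function is not available in the current session.
if (exists("ocr_dictionary")) {
  scores <- ocr_dictionary(chunks)
  # Scores are relative: compare chunks with each other rather than
  # reading any single value as an absolute measure of quality.
  print(scores)
}
```

Chunks with noticeably lower scores than the rest of the corpus are the ones worth inspecting first.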