| batch_get_gold | Extract text from a set of PDFs with embedded text. |
| batch_get_ngrams | Get n-grams from one or more texts |
| batch_simulate_degrade_set | Simulate degraded PDFs from a set of input PDFs |
| char_ngrams | Return a df with counts of all characters in df |
| check_embed | Check whether embedded text is not from OCR |
| create_dirs | Create directories for 'ocrerrs' |
| degrade_blur | Degrade PDF quality by simulating blurred text |
| degrade_complex | Degrade PDF quality by combining degradation parameters |
| degrade_density | Degrade PDF quality by manipulating pixel density |
| degrade_fax | Degrade PDF quality by simulating a fax |
| degrade_pages | Wrap degrade functions around split PDF files |
| degrade_rotate | Degrade PDF quality by simulating page rotation |
| find_errors | Find errors from OCR by comparing to gold standard |
| find_min_dists | Find the minimum string edit distance for each bad word |
| get_bg_1grams | Get ngrams and counts for bad and gold strings |
| get_delta_words | Get words with different frequencies between bad and gold... |
| get_dist_mat | Return a matrix of optimal string alignment distances for... |
| get_embed_pages | Return a vector of pages with embedded text |
| get_file_base | Return the base name of a file |
| get_gold | Extract text from a PDF with embedded text. |
| get_ngrams | Get a set of n-grams from text |
| get_POS | Return a table of parts of speech |
| hello | Hello, World! |
| hunspell_errors | Use hunspell to find errors |
| label_delta_words | Label words as correct or errors |
| make_gold_path | Create a path to which 'gold standard' results are written |
| normalize_text | Clean EOL characters from |
| ocr_pages | Wrap optical character recognition around a set of files |
| save_gold_text | Save the extracted text as an .rda file |
| simulate_degrade_set | Simulate degraded PDFs from an input PDF |
| split_pdf | Split a PDF into multiple pages |
| summarize_gold | Summarize the text from a gold-standard PDF |
| tess_ocr | Perform optical character recognition with tesseract |
| write_gold_text | Write the extracted text to file |