dot-make_unicode_block_regex | Make Regex for Unicode Blocks |
dot-space_regex_selector | Space Text by a Regex Selector |
piecemaker-package | piecemaker: Tools for Preparing Text for Tokenizers |
prepare_and_tokenize | Split Text on Spaces |
prepare_text | Prepare Text for Tokenization |
remove_control_characters | Remove Non-Character Characters |
remove_diacritics | Remove Diacritical Marks on Characters |
remove_replacement_characters | Remove the Unicode Replacement Character |
space_cjk | Add Spaces Around CJK Ideographs |
space_punctuation | Add Spaces Around Punctuation |
squish_whitespace | Remove Extra Whitespace |
tokenize_space | Break Text at Spaces |
validate_utf8 | Clean Up Text to UTF-8 |