| dot-make_unicode_block_regex | Make Regex for Unicode Blocks |
| dot-space_regex_selector | Space Text by a Regex Selector |
| piecemaker-package | piecemaker: Tools for Preparing Text for Tokenizers |
| prepare_and_tokenize | Split Text on Spaces |
| prepare_text | Prepare Text for Tokenization |
| remove_control_characters | Remove Non-Character Characters |
| remove_diacritics | Remove Diacritical Marks on Characters |
| remove_replacement_characters | Remove the Unicode Replacement Character |
| space_cjk | Add Spaces Around CJK Ideographs |
| space_punctuation | Add Spaces Around Punctuation |
| squish_whitespace | Remove Extra Whitespace |
| tokenize_space | Break Text at Spaces |
| validate_utf8 | Clean Up Text to UTF-8 |