txt_clean_word2vec | R Documentation |
Standardise text by:
- Converting the text from UTF-8 to ASCII
- Keeping only alphanumeric characters: letters and numbers
- Removing multiple spaces
- Removing leading/trailing spaces
- Performing lowercasing
txt_clean_word2vec(x, ascii = TRUE, alpha = TRUE, tolower = TRUE, trim = TRUE)
x |
a character vector in UTF-8 encoding |
ascii |
logical indicating to convert the input from UTF-8 to ASCII. Defaults to TRUE. |
alpha |
logical indicating to keep only alphanumeric characters. Defaults to TRUE. |
tolower |
logical indicating to lowercase x. Defaults to TRUE. |
trim |
logical indicating to trim leading/trailing white space. Defaults to TRUE. |
a character vector of the same length as x,
standardised by converting the encoding to ASCII, lowercasing and
keeping only alphanumeric characters
library(word2vec)
x <- c(" Just some.texts, ok?", "123.456 and\tsome MORE! ")
txt_clean_word2vec(x)
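To make the steps listed in the description concrete, here is a minimal base R sketch of the same kind of cleaning. It is not the package's implementation: the helper name `clean_sketch`, the use of `ASCII//TRANSLIT` for the conversion, and the choice to replace non-alphanumeric characters with a space (rather than delete them) are all assumptions made for illustration, so its output may differ from `txt_clean_word2vec` in places.

```r
# Hypothetical helper, NOT the package's code: approximates the documented
# steps with base R only.
clean_sketch <- function(x, ascii = TRUE, alpha = TRUE, tolower = TRUE, trim = TRUE) {
  if (ascii) {
    # Assumption: transliterate to ASCII; results for non-ASCII input are
    # platform dependent.
    x <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
  }
  if (alpha) {
    # Assumption: every non-alphanumeric character becomes a space.
    x <- gsub("[^[:alnum:]]", " ", x)
  }
  # Collapse runs of whitespace into a single space.
  x <- gsub("[[:space:]]+", " ", x)
  if (trim) {
    # Drop leading/trailing white space.
    x <- trimws(x)
  }
  if (tolower) {
    # base:: prefix avoids confusion with the logical argument of the same name.
    x <- base::tolower(x)
  }
  x
}

x <- c(" Just some.texts, ok?", "123.456 and\tsome MORE! ")
clean_sketch(x)
# "just some texts ok"    "123 456 and some more"
```

Each argument switches one step on or off, mirroring the signature shown in the usage above; for example, `clean_sketch(x, tolower = FALSE)` keeps the original casing.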