splitremove | R Documentation |
This function removes items from each string in a character vector based on a second character vector named remove. It can be used to remove prefixes, suffixes, titles, etc. from a given character vector. The function first splits each string at spaces, dots, commas, and parentheses, and then removes the resulting pieces that appear in the remove vector.
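The split-then-filter behavior described above can be sketched in base R. This is only an illustration, not the iemisc source: the name split_remove_sketch is hypothetical, and the case-insensitive matching and single-space rejoining are assumptions.

```r
# Hypothetical helper illustrating the description; NOT the iemisc implementation.
split_remove_sketch <- function(string, remove) {
  # split each element at spaces, dots, commas, and parentheses
  pieces <- strsplit(string, "[ .,()]")
  vapply(pieces, function(tokens) {
    # drop the empty tokens left behind by adjacent delimiters (e.g. ", ")
    tokens <- tokens[nzchar(tokens)]
    # assumption: items are compared case-insensitively
    keep <- !(tolower(tokens) %in% tolower(remove))
    # assumption: the remaining pieces are rejoined with single spaces
    paste(tokens[keep], collapse = " ")
  }, character(1))
}

split_remove_sketch("JERUEG, RICHARDS Z. MR.", c("mr", "jr"))
# "JERUEG RICHARDS Z"
```

Note that in this sketch the delimiters themselves (dots, commas, parentheses) are discarded along with the removed items; whether splitremove does the same is not stated in this page.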
splitremove(string, remove)
string | character vector that contains the text to keep and to remove |
remove | character vector that contains the characters to remove from the string |
the revised character vector with the contents of remove removed from the string
Irucka Embry
regex - r regexp - replace title and suffix in any part of string with nothing in large file (> 2 million rows) - Stack Overflow answered by Molx on Apr 16 2015. See https://stackoverflow.com/questions/29680131/r-regexp-replace-title-and-suffix-in-any-part-of-string-with-nothing-in-large.
# Example
install.load::load_package("iemisc", "data.table")
# create the list of items to remove from the text
remove <- c("mister", "sir", "mr", "madam", "mrs", "miss", "ms", "iv",
"iii", "ii", "jr", "sr", "md", "phd", "mba", "pe", "mrcp", "and", "&", "prof",
"professor", "esquire", "esq", "dr", "doctor")
names <- data.table(Named = c("Alfredy 'Chipp' Kahner IV",
"Denis G. Barnekdt III", "JERUEG, RICHARDS Z. MR.", "EDWARDST, HOWARDD K. JR."))
# first apply splitcomma() to the Named column and store the result
names[, Corrected_Named := splitcomma(Named)]
names
# then remove the items listed in remove from the corrected names
names[, Corrected_Named := splitremove(Corrected_Named, remove)]
names