clean_name | R Documentation |
clean_name
clean_name(x, terms = NULL, collapse = NULL, verbose = FALSE)
x |
a vector of names to clean. This will be coerced to class character internally |
terms |
a character vector of terms to remove from elements of x. Terms are removed only where they occur as whole words, not where they occur as substrings of longer words in x |
collapse |
a character vector of strings which should be collapsed (i.e. replaced by "", rather than the default " "). If one of the collapse terms is a special regex character, it will need to be escaped, e.g. "\\-" |
verbose |
A logical of length 1 determining if function progress should be reported to the console |
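The difference between whole-word removal (terms) and collapsing (collapse) can be illustrated with base R regular expressions. This is only a sketch of the two behaviours described above, not the internal code of clean_name; the input strings and the use of gsub with word boundaries are assumptions made for illustration.

```r
# Hypothetical input strings, not taken from the brachios dataset
x <- c("Hesperorthis sp", "Neo-spirifer")

# Whole-word removal: \\b restricts the match to "sp" as a separate word,
# so the "sp" inside "Hesperorthis" is left untouched
whole_word <- gsub("\\bsp\\b", " ", x, perl = TRUE)

# Naive substring removal, shown for contrast: this also hits "Hesperorthis"
substring_match <- gsub("sp", " ", x, fixed = TRUE)

# Default behaviour: the matched character is replaced by a white space
spaced <- gsub("\\-", " ", x, perl = TRUE)

# Collapse behaviour: the matched character is replaced by "", joining the parts.
# The backslash is doubled because R string literals need "\\" for one "\"
collapsed <- gsub("\\-", "", x, perl = TRUE)

print(trimws(whole_word))
print(substring_match)
print(spaced)
print(collapsed)
```

Replacing with a white space keeps neighbouring words apart, which is why it is the safer default; collapsing is only appropriate when the two fragments are known to belong to one name.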
A function which bundles a series of cleaning routines into a single process. First, any words in brackets are removed, followed by any user-defined terms, if given. Next, Roman and Arabic numerals are removed, then abbreviations of up to five letters (abbreviations are identified by their trailing dot, e.g. ABFS.). By default, characters marked for removal are replaced by a white space to prevent accidental collapse of strings. However, there may be specific cases where a collapse is required, so terms given in collapse are dealt with next. After collapsing, all rogue punctuation is removed, then isolated lowercase letters, then isolated groups of capitals up to five characters long. Finally, runs of more than one white space are reduced to a single space and trailing white space is removed, any remaining strings longer than 2 words are subsetted to the first word, the first letter of each string is capitalised, and zero-length strings are converted to NA.
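The order of the cleaning steps can be sketched in base R as follows. The function clean_sketch is hypothetical and is not the source of clean_name: the regular expressions are assumptions chosen to mirror each step in the description, only round brackets are handled, and the Roman numeral pattern is deliberately crude.

```r
# Hypothetical re-implementation of the described steps, for illustration only
clean_sketch <- function(x, terms = NULL, collapse = NULL) {
  x <- as.character(x)
  # 1. remove words in (round) brackets
  x <- gsub("\\([^)]*\\)", " ", x, perl = TRUE)
  # 2. remove user-defined terms, matched as whole words only
  for (term in terms) {
    x <- gsub(paste0("\\b", term, "\\b"), " ", x, perl = TRUE)
  }
  # 3. remove Arabic numerals, then whole words made of Roman numeral capitals
  x <- gsub("[0-9]+", " ", x, perl = TRUE)
  x <- gsub("\\b[IVXLCDM]+\\b", " ", x, perl = TRUE)
  # 4. remove abbreviations: up to five letters followed by a dot
  x <- gsub("\\b[A-Za-z]{1,5}\\.", " ", x, perl = TRUE)
  # 5. collapse terms are replaced by "" rather than " "
  for (pattern in collapse) {
    x <- gsub(pattern, "", x, perl = TRUE)
  }
  # 6. remove remaining punctuation
  x <- gsub("[[:punct:]]", " ", x, perl = TRUE)
  # 7. remove isolated lowercase letters
  x <- gsub("\\b[a-z]\\b", " ", x, perl = TRUE)
  # 8. remove isolated groups of up to five capitals
  x <- gsub("\\b[A-Z]{1,5}\\b", " ", x, perl = TRUE)
  # 9. squeeze repeated white space and trim both ends
  x <- trimws(gsub("\\s+", " ", x, perl = TRUE))
  # 10. strings longer than two words are cut to their first word
  n_words <- lengths(strsplit(x, " ", fixed = TRUE))
  long <- n_words > 2
  x[long] <- sub(" .*$", "", x[long])
  # 11. capitalise the first letter of each non-empty string
  ok <- !is.na(x) & nzchar(x)
  x[ok] <- paste0(toupper(substring(x[ok], 1, 1)), substring(x[ok], 2))
  # 12. zero-length strings become NA
  x[!is.na(x) & !nzchar(x)] <- NA_character_
  x
}

# Hypothetical messy names
messy <- c("Spirifer (Neospirifer) sp. 2", "productus cf. longispinus",
           "?", "Neo-spirifer")
cleaned <- clean_sketch(messy, collapse = "\\-")
print(cleaned)
```

Running the collapse step before punctuation removal matters: a hyphen listed in collapse is deleted outright, whereas a hyphen left for the punctuation step would be replaced by a white space and split the name into two words.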
a character vector the same length as x. Elements which were reduced to zero characters during cleaning are returned as NA
# load dataset
data("brachios")
# clean genus names
gen_clean <- clean_name(brachios$genus)