prepTDWG | R Documentation |
Convert people's names to different formats (i.e. Last Name, First Name(s) or First Name(s), Last Name), with or without last name prepositions. The default name format is the one suggested by the Biodiversity Information Standards (TDWG).
prepTDWG( x, sep = ", ", format = "last_init", pretty = TRUE, get.prep = FALSE, get.initials = TRUE, max.initials = 4 )
x |
the character string or vector containing the names. |
sep |
character. Input and output name separator. Defaults to ", ". |
format |
character. Output name format. The default is "last_init". |
pretty |
logical. Should the output name be returned in a pretty
presentation (i.e. only the first letter of each name capitalized, initials
separated by points with no spaces, and family name prepositions in
lowercase)? Defaults to TRUE. If FALSE, names are returned in the same way as
in the input object. |
get.prep |
logical. Should last name prepositions be included? Defaults to FALSE. |
get.initials |
logical. Should the first name(s) be abbreviated? Defaults to TRUE. |
max.initials |
numeric. Maximum number of letters for a single word to be treated as initials and not as a first name. Defaults to 4. |
The default name format follows the one suggested by the TDWG, which is: Last name, followed by a comma and then the initials, separated by points (e.g. Hatschbach, G.G.).
The function internally uses another plantR function, lastName().
It therefore assumes that a person's last name is the one given at the end of
the name string or the one preceding the name separator (i.e. the comma), if present.
The function handles simple last names, as well as compound last
names and last names with common name prefixes or prepositions (e.g. de,
dos, van, ter, ...). By default, these prefixes and prepositions are
removed, but they are returned if the argument get.prep is set to TRUE.
The function assumes that all names containing the separator (by default, a comma) are already in the format suggested by TDWG. Even in those cases, the function fixes simple problems (e.g. missing points between name initials).
If only one name is given, the function returns the same name with the first letter capitalized.
The function output is relatively stable with respect to the input format, letter case and spacing. However, if the name provided has unusual formatting, or if the names of multiple people are provided within the same string, the function may not work properly. In those cases the output depends on the input format, and some level of double-checking may be necessary. See examples below.
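Because strings holding several people are a documented failure case, one possible workaround is to split them into one element per person before calling the function. The sketch below is only an illustration: split_people() is a hypothetical helper, not part of plantR, and it assumes that people are delimited by ";" or "&". Commas are deliberately not used as delimiters, since a comma is also the default name separator, so comma-separated lists of people still need manual checking.

```r
# Hypothetical helper (NOT part of plantR): split a string that holds
# several people into one element per person.
# Assumption: people are delimited by ";" or "&", never by a comma.
split_people <- function(x, pattern = "\\s*[;&]\\s*") {
  parts <- unlist(strsplit(x, pattern, perl = TRUE))
  parts[nzchar(parts)]  # drop empty pieces left by trailing delimiters
}

people <- split_people("C. Mendonca Filho; F. da Silva")
print(people)
# "C. Mendonca Filho" "F. da Silva"

# Format each person separately, only if plantR is installed
if (requireNamespace("plantR", quietly = TRUE)) {
  print(plantR::prepTDWG(people))
}
```

Each element is then formatted independently, so the names of different people are no longer combined. The result should still be inspected, because the splitting rule is an assumption about the input.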
The character string x in the standardized format.
Renato A. F. de Lima
Conn, Barry J. (ed.) (1996). HISPID 3 - Herbarium Information Standards and Protocols for Interchange of Data. Herbarium Information Systems Committee (HISCOM). https://www.tdwg.org/standards/hispid3/
Willemse, L.P., van Welzen, P.C. & Mols, J.B. (2008). Standardisation in data-entry across databases: Avoiding Babylonian confusion. Taxon 57(2): 343-345.
lastName, getPrep and getInit.
# Single names
prepTDWG("gentry")
prepTDWG("GENTRY")

# Simple names
prepTDWG("Alwyn Howard Gentry")
prepTDWG("Alwyn H. Gentry")
prepTDWG("A.H. Gentry")
prepTDWG("A H Gentry")
prepTDWG("Gentry, Alwyn Howard")
prepTDWG("Gentry, AH")
prepTDWG("Gentry AH")
prepTDWG("GENTRY, A H")
prepTDWG("gentry, alwyn howard")
prepTDWG("gentry, a.h.")
prepTDWG("gentry, a. h.")

# Name with prepositions
prepTDWG("Carl F. P. von Martius")
prepTDWG("Carl F. P. von Martius", get.prep = TRUE)

# Names with generational suffixes
prepTDWG("Hermogenes de Freitas Leitao Filho")
prepTDWG("H.F. Leitao Filho")
prepTDWG("Leitao Filho, HF")
prepTDWG("Leitao filho, H. F.")

# Compound last name
prepTDWG("Augustin Saint-Hilaire")
prepTDWG("A. Saint-Hilaire")
prepTDWG("Saint-Hilaire, Augustin")

# Other formats
prepTDWG("John MacDonald")
prepTDWG("John McDonald")
prepTDWG("John O'Brien")

# Multiple names, different settings
names <- c("Gentry, AH", "Gentry A.H.", "Carl F. P. von Martius",
           "Leitao filho, H. de F.", "Auguste de Saint-Hilaire",
           "John O'Reilly")
prepTDWG(names)
prepTDWG(names, format = "init_last")
prepTDWG(names, format = "init_last", get.prep = TRUE)
prepTDWG(names, get.prep = TRUE, format = "prep_last_init")
prepTDWG(names, get.prep = TRUE, format = "prep_last_init", get.initials = FALSE)
prepTDWG(names, get.prep = TRUE, pretty = FALSE, get.initials = FALSE)

## Unusual formatting (function won't work always...)
# two or more people names: output incorrect (combine names of authors)
prepTDWG("C. Mendonca Filho; F. da Silva")
# two or more names, separated by comma: output incorrect (combine names of authors)
prepTDWG("A. Alvarez, A. Zamora & V. Huaraca")
# one name, two commas: fails to get all names
prepTDWG("Cesar Sandro, Esteves, F")
# one name, abbreviations in the start and end: fails to get all names
prepTDWG("C.S. Esteves F.")