prepName: Format Multiple People's Names
In LimaRAF/plantR: Managing Species Records from Biological Collections

prepName

R Documentation

Format Multiple People's Names

Description

Format or convert the names of multiple collectors or identifiers into a standardized name format.

Usage

prepName(
  x,
  fix.names = TRUE,
  output = "all",
  treat.prep = NULL,
  add.treat = TRUE,
  sep.in = c(";", "&", "|", " e ", " y ", " and ", " und ", " et "),
  sep.out = "|",
  special.char = FALSE,
  format = "last_init",
  pretty = TRUE,
  get.prep = FALSE,
  get.initials = TRUE
)

Arguments

`x`	the character string or vector containing the names.
`fix.names`	logical. Should the general notation of names be standardized? Defaults to TRUE.
`output`	a character string with the type of output desired: all names ("all"), first name ("first"), or auxiliary names ("aux").
`treat.prep`	a character string or vector containing the treatment prepositions that should be removed from names. Defaults to some common prepositions in Portuguese, Spanish and English (see Details).
`add.treat`	logical. Should the treatment preposition(s) provided in 'treat.prep' be concatenated with plantR defaults or be used separately? Defaults to TRUE (concatenate prepositions).
`sep.in`	a vector of the symbols separating multiple names. Defaults to: ";", "&", "|", " e ", " y ", " and ", " und ", and " et ".
`sep.out`	a character string with the symbol separating multiple names in the output string. Defaults to "|". If a character vector of length 2 or more is supplied, the first element is used with a warning.
`special.char`	logical. Should special characters be maintained? Defaults to FALSE.
`format`	character. Output name format. The default is "last_init".
`pretty`	logical. Should the output name be returned in a pretty presentation (i.e. only the first letter of names capitalized, initials separated by points and no spaces, and family name prepositions in lower case)? Defaults to TRUE. If FALSE, names are returned in the same way as the input object `x`.
`get.prep`	logical. Should last name prepositions be included? Defaults to FALSE.
`get.initials`	logical. Should the first name(s) be abbreviated? Defaults to TRUE.
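
The rule stated for `sep.out` (use the first element and warn when more than one is supplied) can be sketched in base R as follows. This is only an illustration of the described behavior: `check_sep_out()` is a hypothetical helper, not a plantR function, and the warning text is made up.

```r
# Hypothetical helper, not part of plantR: keep only the first output
# separator and warn if more than one was supplied.
check_sep_out <- function(sep.out = "|") {
  if (length(sep.out) > 1) {
    warning("'sep.out' has length > 1; only the first element will be used")
    sep.out <- sep.out[1]
  }
  sep.out
}

check_sep_out("|")                               # single separator, no warning
suppressWarnings(check_sep_out(c(" & ", "|")))   # first element is kept
```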

Details

The default name format is the one suggested by the TDWG: last name, followed by a comma and then the initials, separated by points (e.g. Hatschbach, G.G.). By default, the names of multiple people associated with each record are separated by a pipe (i.e. '|'). This default can be changed using the argument sep.out.
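
To make the "last name, comma, initials" layout concrete, here is a minimal base R sketch. It is a toy under simplifying assumptions (a single name written as "First Middle Last", no prepositions, titles or compound names) and is not how prepName() is implemented; `to_last_init()` is a hypothetical helper.

```r
# Hypothetical helper, not part of plantR: turn "First Middle Last"
# into "Last, F.M." for one simple name.
to_last_init <- function(name) {
  parts <- strsplit(trimws(name), "\\s+")[[1]]
  if (length(parts) < 2) return(name)  # a single word: nothing to abbreviate
  last <- parts[length(parts)]
  # First letter of every name before the last one, each followed by a point
  initials <- paste0(substr(parts[-length(parts)], 1, 1), ".", collapse = "")
  paste0(last, ", ", toupper(initials))
}

one <- to_last_init("Alwyn H. Gentry")   # "Gentry, A.H."
two <- to_last_init("Balduino Rambo")    # "Rambo, B."

# Names of several people joined by the default output separator
paste(c(one, two), collapse = "|")       # "Gentry, A.H.|Rambo, B."
```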

In the case of names from more than one person (separated by the characters defined in the argument sep.in), the argument output controls which names should be returned: the names of all people ("all", the default), the first person's name ("first") or all but the first person's name ("aux").
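
The selection logic of the three output options can be sketched in base R. The helper below is hypothetical and simplified: it only splits on ";" and "&" and does not reformat the individual names, so its results differ from what prepName() returns.

```r
# Hypothetical helper, not part of plantR: split a string holding several
# names and keep all of them, only the first, or all but the first.
pick_names <- function(x, output = c("all", "first", "aux"), sep.out = "|") {
  output <- match.arg(output)
  # Assumption: only ";" and "&" act as input separators in this sketch
  parts <- trimws(strsplit(x, "[;&]")[[1]])
  parts <- parts[nzchar(parts)]
  keep <- switch(output,
                 all = parts,
                 first = parts[1],
                 aux = parts[-1])
  paste(keep, collapse = sep.out)
}

x <- "A. Alvarez; A. Zamora & V. Huaraca"
pick_names(x)                    # "A. Alvarez|A. Zamora|V. Huaraca"
pick_names(x, output = "first")  # "A. Alvarez"
pick_names(x, output = "aux")    # "A. Zamora|V. Huaraca"
```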

The function identifies (and removes) name prefixes or prepositions (e.g. de, dos, van, ter, ...). Also, it removes some titles (i.e. Dr., Dra., Pe., Sr., Mrs.), but not all of them (e.g. Doctor, Priest, Mister, etc.). Users can use the plantR default list of treatment prepositions (argument 'treat.prep' = NULL; the default), their own list of prepositions or a combination of both (argument 'add.treat' = TRUE; the default). To inspect the plantR default list of treatment prepositions please check the internal object 'treatPreps'.
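
The interplay between 'treat.prep' and 'add.treat' described above can be illustrated with a small base R sketch. Everything here is an assumption made for illustration: `default_treat` holds only the titles quoted in this section (the real defaults live in the internal object 'treatPreps'), and `resolve_treat()` and `strip_title()` are hypothetical helpers, not plantR code.

```r
# Illustrative subset only; the actual plantR defaults are in 'treatPreps'
default_treat <- c("Dr.", "Dra.", "Pe.", "Sr.", "Mrs.")

# Hypothetical helper: decide which list of titles to use
resolve_treat <- function(treat.prep = NULL, add.treat = TRUE,
                          defaults = default_treat) {
  if (is.null(treat.prep)) return(defaults)               # defaults only
  if (add.treat) unique(c(defaults, treat.prep)) else treat.prep
}

# Hypothetical helper: drop a leading title if it is in the list
strip_title <- function(name, titles) {
  for (ttl in titles) {
    prefix <- paste0(ttl, " ")
    if (startsWith(name, prefix)) {
      return(trimws(substring(name, nchar(prefix) + 1)))
    }
  }
  name
}

strip_title("Sir G.T. Prance", resolve_treat())       # "Sir" is kept
strip_title("Sir G.T. Prance", resolve_treat("Sir"))  # "G.T. Prance"
```

With 'add.treat' set to FALSE in this sketch, only the user-supplied titles would be considered, so "Dra. Gloria Galeano" would no longer lose its title.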

The function does not handle hyphenated first names. If only one name is given, the function returns x with the first letter capitalized.

The function can standardize both the general notation and the general format of names. The notation is controlled by the argument fix.names, which internally calls the plantR function fixName(). The output format is controlled by the arguments format, pretty, get.prep and get.initials, and is produced internally by the plantR function prepTDWG().

Value

The character string x in a standardized name format.

Author(s)

Renato A. F. de Lima & Hans ter Steege

References

Conn, Barry J. (ed.) (1996). HISPID 3 - Herbarium Information Standards and Protocols for Interchange of Data. Herbarium Information Systems Committee (HISCOM). https://www.tdwg.org/standards/hispid3/

Willemse, L.P., van Welzen, P.C. & Mols, J.B. (2008). Standardisation in data-entry across databases: Avoiding Babylonian confusion. Taxon 57(2): 343-345.

Examples


  # Simple names
  prepName("Alwyn H. Gentry")
  prepName("Karl Emrich & Balduino Rambo")
  prepName("R. Reitz; R.M. Klein")
  prepName("Reitz, Raulino et R.M. Klein", sep.out = " & ")

  # Name with prepositions and compound last names
  prepName("Carl F. P. von Martius; Augustin Saint-hilaire")
  prepName("Carl von Martius; Auguste de Saint-Hilaire", get.prep = TRUE)
  prepName("A. Ducke; Dárdano de Andrade-Lima")
  prepName("Ducke, A. ; Dárdano de Andrade-Lima")

  # Names with generational suffixes
  prepName("HF Leitão Filho; GJ Shepherd")

  # Names with titles
  prepName("Pe. Raulino Reitz")
  prepName("Dra. Gloria Galeano")
  prepName("Prof. Hermogenes de Freitas Leitao Filho")
  prepName("Sir G.T. Prance")
  prepName("Sir G.T. Prance", treat.prep = "Sir")

  # Other name formats
  prepName("[D. Hugh-Jones]")
  prepName("L. McDade & J. O'Brien")

  # Multiple names separated by different characters
  prepName("A. Alvarez; A. Zamora & V. Huaraca")
  prepName("A. Alvarez; A. Zamora & V. Huaraca", output = "first")
  prepName("A. Alvarez; A. Zamora & V. Huaraca", output = "aux")

  # Multiple names separated by commas
  prepName("A. Alvarez, A. Zamora & V. Huaraca") # output incorrect
  prepName("A. Alvarez, A. Zamora & V. Huaraca", sep.in = c(",", "&")) # output correct

  # Multiple (last + first) names separated by commas
  prepName("Alvarez, A., Zamora, A. & Huaraca, V.", sep.in = c(",", "&"))  # output incorrect
  prepName("Alvarez, A., Zamora, A. & Huaraca, V.", sep.in = c(".,", "&")) # output correct
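
Why ".," works as a separator in the last example while "," does not can be seen by splitting the string by hand. The base R sketch below only mimics the splitting step; `split_on()` is a hypothetical helper and does not reproduce what prepName() does internally.

```r
# Hypothetical helper, not part of plantR: split a string successively
# on each literal separator and trim the resulting pieces.
split_on <- function(x, seps) {
  parts <- x
  for (s in seps) parts <- unlist(strsplit(parts, s, fixed = TRUE))
  trimws(parts)
}

x <- "Alvarez, A., Zamora, A. & Huaraca, V."

# A plain comma also cuts between each last name and its initial,
# so three people become six fragments
split_on(x, c(",", "&"))

# ".," only matches the comma that follows an initial, so each person
# stays in one piece (the first piece loses its trailing point)
split_on(x, c(".,", "&"))
```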

LimaRAF/plantR documentation built on Jan. 1, 2023, 10:18 a.m.