normalise: Normalise hospital names


View source: R/normalise.R

Description

normalise tries to match the supplied hospital names to Portuguese NHS hospitals, i.e. to the hospitals included in the data set hospitals, so that they can be converted to standard hospital names. By default, it returns the shortened version of the hospital name (column hospital_short_name in hospitals). Use the return argument to get a different variable; see Arguments for the possible values.

Usage

normalise(
  nm,
  return = c("hospital_short_name", "hospital_full_name", "hospital_id",
    "hospital_acronym"),
  unmatched_as_na = TRUE
)

normalize(
  nm,
  return = c("hospital_short_name", "hospital_full_name", "hospital_id",
    "hospital_acronym"),
  unmatched_as_na = TRUE
)

Arguments

nm

A character vector of hospital names.

return

A string indicating the hospital attribute to be returned: one of hospital_short_name (default), hospital_full_name, hospital_id or hospital_acronym. These hospital variables are documented in hospitals.

unmatched_as_na

A logical indicating whether unmatched hospital names are returned as NA (TRUE, the default) or as originally supplied in nm (FALSE).
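The fallback rule for unmatched names can be sketched in base R. This is only an illustration of the behaviour described above, not the package's code: `fill_unmatched()` is a hypothetical helper, and the `matched` vector is a made-up match result.

```r
# Hypothetical helper, not part of the package: applies the fallback rule
# described for `unmatched_as_na` to an already-computed match result.
fill_unmatched <- function(nm, matched, unmatched_as_na = TRUE) {
  if (unmatched_as_na) {
    return(matched)                  # unmatched entries stay NA
  }
  ifelse(is.na(matched), nm, matched)  # fall back to the original input
}

nm      <- c("Matosinhos", "not a hospital")
matched <- c("placeholder standard name", NA)  # made-up match result

fill_unmatched(nm, matched)
fill_unmatched(nm, matched, unmatched_as_na = FALSE)
```

Keeping the original string is convenient when the unmatched names still need to be inspected or cleaned by hand.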

Details

normalise matches hospital names with a heuristic that uses a minimal set of keywords to identify each hospital. The heuristic is implemented with regular expressions, which are provided in the data set hospitals, column hospital_regex. Matching is case insensitive and tolerates variations in the name as long as one of the critical keywords is present. Note, however, that the regular expressions have been designed so that matches are mutually exclusive: the same hospital name will never match more than one hospital in the data set hospitals.
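A minimal sketch of this kind of keyword matching, in base R, might look as follows. The lookup table `toy`, its names and its patterns are invented for illustration and are not the contents of hospitals; `match_toy()` is a hypothetical function, not the package's implementation.

```r
# Toy lookup table; names and patterns are invented for illustration and
# are NOT the contents of the `hospitals` data set.
toy <- data.frame(
  short_name = c("Hospital A", "Hospital B"),
  regex      = c("alpha", "beta|bravo"),
  stringsAsFactors = FALSE
)

# Hypothetical matcher: one case-insensitive regex per hospital.
match_toy <- function(nm, table = toy) {
  out <- rep(NA_character_, length(nm))
  for (i in seq_len(nrow(table))) {
    hit <- grepl(table$regex[i], nm, ignore.case = TRUE)
    out[hit] <- table$short_name[i]
  }
  out
}

match_toy(c("Centro ALPHA, EPE", "hospital de bravo", "unknown"))
```

Because the toy patterns are mutually exclusive, the order in which the rows are tried does not affect the result; names that match no pattern stay NA.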

normalise is aware of deprecated hospital names and maps those old designations to the current hospital names, e.g., Hospital do Alto Ave is mapped to Hospital da Senhora da Oliveira, Guimarães, EPE.

normalise tolerates missing or mistyped accented characters, so, e.g., both 'Hospital de São João' and 'Hospital de Sao Joao' match the same hospital: CHU de São João.
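One way to get this kind of accent tolerance with a regular expression is to accept both the accented and the plain letter in a character class. The pattern below is an assumption made for illustration, not the value stored in hospital_regex.

```r
# Illustrative pattern (an assumption, not taken from `hospital_regex`):
# "\u00e3" is "ã", so each class accepts either "a" or "ã".
pattern <- "s[a\u00e3]o jo[a\u00e3]o"

x <- c(
  "Hospital de S\u00e3o Jo\u00e3o",  # accented spelling
  "Hospital de Sao Joao",            # accents dropped
  "Hospital de Santa Maria"          # a different hospital name
)

accent_hits <- grepl(pattern, x, ignore.case = TRUE)
accent_hits
```

Unicode escapes are used instead of literal accented characters so that the sketch does not depend on the encoding of the script file.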

Value

A character vector.

Examples

# Match hospital with a single keyword
normalise('Matosinhos')

# The same, but now return the full name
normalise('Matosinhos', 'hospital_full_name')

# Get instead the hospital identifier
normalise('Matosinhos', 'hospital_id')

# Or even just the acronym (useful for labelling in plots)
normalise('Matosinhos', 'hospital_acronym')

# Find hospitals from their old names
# "Hospital do Alto Ave" is the old name for 'Hospital da Senhora da Oliveira, Guimarães, EPE'
normalise('Hospital do Alto Ave', 'hospital_full_name')

# `normalise()` is vectorised over `nm`
normalise(nm = c('medio tejo', 'oeste', 'guarda'))

hospitals documentation built on Nov. 26, 2021, 9:06 a.m.