general_clean_directory: Mutate operation(s) in Scottish post office general directory...


View source: R/general.r

Description

Attempts to clean the provided Scottish post office general directory data.frame.

Usage

general_clean_directory(directory, progress = TRUE, verbose = FALSE)

Arguments

directory

A Scottish post office general directory in the form of a data.frame or other object that inherits from the data.frame class such as a tibble. Columns must at least include forename, surname, occupation and addresses.

progress

Whether progress should be shown (TRUE) or not (FALSE).

verbose

Whether the function should be executed silently (FALSE) or not (TRUE).
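The column requirement on directory can be checked before calling the function. The following is a minimal sketch in base R; check_directory_columns is a hypothetical helper written for illustration and is not part of podcleaner.

```r
# Hypothetical helper, not part of podcleaner: checks the columns that
# the `directory` argument is documented to require.
check_directory_columns <- function(directory) {
  required <- c("forename", "surname", "occupation", "addresses")
  if (!is.data.frame(directory)) {
    stop("`directory` must inherit from data.frame.", call. = FALSE)
  }
  missing_columns <- setdiff(required, names(directory))
  if (length(missing_columns) > 0L) {
    stop(
      "Missing required column(s): ",
      paste(missing_columns, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Illustrative one-row directory with the four required columns.
directory <- data.frame(
  forename = "Wm.", surname = "ABOT",
  occupation = "Wine and spirit mercht", addresses = "18 London rd",
  stringsAsFactors = FALSE
)
check_directory_columns(directory)
```

Because tibbles inherit from data.frame, the same check should accept them as well.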

Value

A tibble whose columns include at least forename, surname, occupation, address.trade.number, address.trade.body, address.house.number and address.house.body. Any "house" suffix in the occupation column is moved to addresses, and occupation information found in addresses is moved back to the occupation column. The addresses column is split into trade and house address columns, and an additional record is created for each extra trade address identified. Entries are further cleaned of optical character recognition (OCR) errors and subjected to a number of standardisation operations.
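To give a feel for the trade/house split described above, here is a rough base R sketch. It is an illustration only and not podcleaner's implementation: split_addresses is a hypothetical function, and it assumes the house address is introduced by "; ho." or "; res", as in the example entries on this page.

```r
# Illustration only; podcleaner's own rules are more involved.
split_addresses <- function(addresses) {
  # Assumed marker: a semicolon followed by "ho" or "res" (optionally
  # with a full stop) introduces the house address.
  marker <- regexpr(";\\s*(ho|res)\\.?\\s+", addresses, ignore.case = TRUE)
  # With invert = TRUE, each element is split into the text before and
  # after the marker; elements without a marker are returned whole.
  parts <- regmatches(addresses, marker, invert = TRUE)
  data.frame(
    trade = vapply(parts, function(p) trimws(p[1L]), character(1L)),
    house = vapply(
      parts,
      function(p) if (length(p) > 1L) trimws(p[2L]) else NA_character_,
      character(1L)
    ),
    stringsAsFactors = FALSE
  )
}

split_addresses(c("18 London rd; ho. 13 Queen sq", "12 Dixon st"))
```

The sketch leaves number/body separation, multiple trade addresses and OCR repair aside; those steps are handled by general_clean_directory itself.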

Examples

pages <- rep("71", 2L)
surnames <- c("ABOT", "ABRCROMBIE")
forenames <- c("Wm.", "Alex")
occupations <- c("Wine and spirit mercht - See Advertisement in Appendix.", "")
addresses <- c(
  "1S20 Londn rd; ho. 13<J Queun sq",
  "Bkr; I2 Dixon Street, & 29 Auderstn Qu.; res 2G5 Argul st."
)
directory <- tibble::tibble(
  page = pages, surname = surnames, forename = forenames,
  occupation = occupations, addresses = addresses
)
general_clean_directory(directory, progress = TRUE, verbose = FALSE)

podcleaner documentation built on Jan. 12, 2022, 1:06 a.m.