standardize.countrynames: Standardize Country Names


View source: R/standardize.countrynames.R

Description

Takes a data frame containing a column of country names, or a vector of country names, and returns the same data structure with the names standardized.

Usage

standardize.countrynames(input, input.column = NULL, standard = "default", standard.column = NULL, only.names = FALSE, na.rm = FALSE, suggest = "prompt", print.changes = TRUE, verbose = FALSE)

Arguments

input

A data frame containing a column of country names, or a vector of country names

input.column

The column containing country names if input is a data frame, identified by name or number; ignored if input is a vector

standard

The name of an included name set (see Details), a data frame containing a column of standard names, or a vector of standard names

standard.column

The column containing standard names if standard is a data frame, identified by name or number; ignored if standard is a vector or an included name set

only.names

If TRUE, return only a vector of standardized names

na.rm

If TRUE, remove any countries not contained in the standard set

suggest

How to handle suggestions for inexact matches: "prompt" lets the user select which suggestions to apply, "auto" applies all of them, "none" applies none

print.changes

If TRUE, print which names were changed

verbose

If TRUE, print full output, including the names of unidentified countries
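The logical arguments above can be combined in a single call. The following is a minimal sketch, not taken from the package documentation: the data frame `trade` and its contents are made up for illustration, and the exact form of the returned objects is not asserted here. Setting suggest to something other than "prompt" is an assumption about what is convenient in a non-interactive script, since "prompt" asks the user to choose.

```r
# Hypothetical input, made up for illustration; "Atlantis" is deliberately
# not a real country, so it should not match any included name set.
trade <- data.frame(country = c("Ivory Coast", "The Gambia", "Atlantis"),
                    value = c(10, 20, 30),
                    stringsAsFactors = FALSE)

# Guarded so the script still runs where the package is not installed.
if (requireNamespace("StandardizeText", quietly = TRUE)) {
  library(StandardizeText)

  # only.names = TRUE: request just the standardized names.
  # suggest = "none" avoids the interactive prompt of the default "prompt".
  std.names <- standardize.countrynames(trade, input.column = "country",
                                        standard = "iso", only.names = TRUE,
                                        suggest = "none", print.changes = FALSE)
  print(std.names)

  # na.rm = TRUE: remove countries not contained in the standard set.
  # verbose = TRUE additionally reports names that were not identified.
  trade.clean <- standardize.countrynames(trade, input.column = "country",
                                          standard = "iso", na.rm = TRUE,
                                          suggest = "none", verbose = TRUE)
  print(trade.clean)
}
```

Printing both results side by side is a quick way to see what na.rm does to unmatched entries on your own data before relying on it.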

Details

Included name sets:

"default": Naming convention based on the ISO names
"imf": International Monetary Fund names
"iso": International Organization for Standardization names
"pwt": Penn World Tables names
"wb": World Bank names
"who": World Health Organization names

Value

If input is a data frame, returns the same data frame with the country names column standardized; if input is a vector of country names, returns the standardized vector
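As a quick check of the return shapes described above, the sketch below passes the same names once inside a data frame and once as a bare vector, then inspects each result with `str()`. The object names `visits`, `out.df` and `out.vec` are hypothetical, and no particular output is claimed.

```r
# Hypothetical input, made up for illustration.
visits <- data.frame(id = 1:2,
                     country = c("Ivory Coast", "The Gambia"),
                     stringsAsFactors = FALSE)

# Guarded so the script still runs where the package is not installed.
if (requireNamespace("StandardizeText", quietly = TRUE)) {
  library(StandardizeText)

  # Data frame in: documented to return the data frame with the
  # country names column standardized.
  out.df <- standardize.countrynames(visits, input.column = "country",
                                     standard = "iso", suggest = "none",
                                     print.changes = FALSE)
  str(out.df)

  # Vector in: documented to return the standardized vector.
  out.vec <- standardize.countrynames(visits$country, standard = "iso",
                                      suggest = "none", print.changes = FALSE)
  str(out.vec)
}
```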

Examples

library(StandardizeText)
sample.names <- c("Aland Is.","Brunei Daru.","Ivory Coast","The Gambia")
sample.std <- c("brunei","aland is","gambia, the","cote divoire")
sample.df <- data.frame(foo=2:5,bar=sample.names, baz=7:4, qux=sample.std)

#Standardize vector using iso names
out.a <- standardize.countrynames(sample.names,standard="iso",suggest="auto")
#Standardize vector using provided names
out.b <- standardize.countrynames(sample.names,standard=sample.std,suggest="auto")
#Standardize dataframe using wb names
out.c <- standardize.countrynames(sample.df,2,standard="wb",suggest="auto",verbose=TRUE)
#Standardize dataframe using provided names without suggestions
out.d <- standardize.countrynames(sample.df,"bar",sample.df,"qux",suggest="none",verbose=TRUE)

Example output

The following names were changed:
     Original      Modified
1   Aland Is. Aland Islands
2 Ivory Coast Cote D'Ivoire
3  The Gambia        Gambia

The following suggested changes were applied:
      Original         Suggested
1 Brunei Daru. Brunei Darussalam

The following names were changed:
     Original     Modified
1   Aland Is.     aland is
2 Ivory Coast cote divoire
3  The Gambia  gambia, the

The following suggested changes were applied:
      Original Suggested
1 Brunei Daru.    brunei

The following names were not in the standard set and left unchanged:
[1] "Aland Is."

The following names were changed:
     Original      Modified
1 Ivory Coast Cote d'Ivoire
2  The Gambia   Gambia, The

The following suggested changes were applied:
      Original         Suggested
1 Brunei Daru. Brunei Darussalam

The following names were not recoginized and left unchanged:
[1] "Brunei Daru."

The following names were changed:
     Original     Modified
1   Aland Is.     aland is
2 Ivory Coast cote divoire
3  The Gambia  gambia, the

StandardizeText documentation built on May 1, 2019, 9:31 p.m.