Standardize Country Names

Description

Takes a dataframe containing a column of country names, or a vector of country names, and returns the same data structure with the names standardized.

Usage

standardize.countrynames(input, input.column = NULL, standard = "default", standard.column = NULL, only.names = FALSE, na.rm = FALSE, suggest = "prompt", print.changes = TRUE, verbose = FALSE)

Arguments

input

A dataframe containing a column of country names, or a vector of country names

input.column

The column containing country names if input is a dataframe, identified by name or number; ignored if input is a vector

standard

The name of an included name set (see details), or a dataframe or vector containing a column of standard names

standard.column

The column containing standard names if standard is a dataframe, identified by name or number; ignored if standard is a vector or an included name set

only.names

Only return a vector of standardized names

na.rm

Remove any countries not contained in the standard set

suggest

Suggestions for inexact matches; "prompt" allows user to select desired suggestions, "auto" applies all, "none" applies none

print.changes

Print which names changed

verbose

Print full output, including names of nonidentified countries
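
The difference between suggest = "none" and suggest = "auto" can be illustrated with a small base R sketch. This is not the package's algorithm: toy_match is a hypothetical helper, and the choice of edit distance (via adist) for inexact matches is an assumption made only for illustration. The interactive "prompt" mode is left out.

```r
# Hypothetical helper, NOT part of StandardizeText: match names against a
# standard set, exactly first, then optionally by smallest edit distance.
toy_match <- function(x, standard, suggest = c("none", "auto")) {
  suggest <- match.arg(suggest)
  # Normalize: lower case, then drop everything except letters
  key <- function(s) gsub("[^a-z]", "", tolower(s))
  idx <- match(key(x), key(standard))       # exact match on normalized keys
  if (suggest == "auto") {
    miss <- which(is.na(idx))
    if (length(miss) > 0) {
      # One row of edit distances per unmatched name (assumed similarity measure)
      d <- adist(key(x[miss]), key(standard))
      idx[miss] <- apply(d, 1, which.min)   # accept the nearest standard name
    }
  }
  out <- x
  out[!is.na(idx)] <- standard[idx[!is.na(idx)]]
  out
}

std <- c("Gambia", "Brunei Darussalam", "Cote d'Ivoire")
res.none <- toy_match(c("gambia", "Brunei Daru."), std, suggest = "none")
res.auto <- toy_match(c("gambia", "Brunei Daru."), std, suggest = "auto")
```

In this sketch, "none" leaves the inexact name "Brunei Daru." unchanged, while "auto" replaces it with the nearest standard name without asking.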

Details

Included name sets:

"default": Naming convention based on the ISO

"imf": International Monetary Fund names

"iso": International Organization for Standardization names

"pwt": Penn World Tables names

"wb": World Bank names

"who": World Health Organization names

Value

If input is a dataframe, returns the identical dataframe with the country names column standardized; if input is a vector of country names, returns the standardized vector
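
The return shape described above can be sketched in base R. The lookup table and toy_standardize below are hypothetical stand-ins used only to show that a vector comes back as a vector and a dataframe comes back with a single column changed; they do not reproduce the package's matching.

```r
# Hypothetical lookup table and helper, for illustration only.
lookup <- c("ivory coast" = "Cote d'Ivoire", "the gambia" = "Gambia")

toy_standardize <- function(input, input.column = NULL) {
  fix <- function(x) {
    x <- as.character(x)
    hit <- unname(lookup[tolower(x)])  # NA where the name is not in the lookup
    ifelse(is.na(hit), x, hit)         # keep unknown names unchanged
  }
  if (is.data.frame(input)) {
    # Only the named (or numbered) column is rewritten
    input[[input.column]] <- fix(input[[input.column]])
    input
  } else {
    fix(input)
  }
}

v <- c("Ivory Coast", "The Gambia", "Chad")
df <- data.frame(id = 1:3, country = v, stringsAsFactors = FALSE)

out.vec <- toy_standardize(v)             # character vector in, character vector out
out.df  <- toy_standardize(df, "country") # dataframe in, same dataframe shape out
```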

Examples

library(StandardizeText)
sample.names <- c("Aland Is.", "Brunei Daru.", "Ivory Coast", "The Gambia")
sample.std <- c("brunei", "aland is", "gambia, the", "cote divoire")
sample.df <- data.frame(foo = 2:5, bar = sample.names, baz = 7:4, qux = sample.std)

# Standardize vector using iso names
out.a <- standardize.countrynames(sample.names, standard = "iso", suggest = "auto")
# Standardize vector using provided names
out.b <- standardize.countrynames(sample.names, standard = sample.std, suggest = "auto")
# Standardize dataframe (country names in column 2) using wb names
out.c <- standardize.countrynames(sample.df, 2, standard = "wb", suggest = "auto", verbose = TRUE)
# Standardize dataframe using provided names (column "qux") without suggestions
out.d <- standardize.countrynames(sample.df, "bar", sample.df, "qux", suggest = "none", verbose = TRUE)