splchk_addr: Spell check a data frame of NYC addresses.


View source: R/splchk_addr.R

Description

The splchk_addr function performs a spell check on a data frame of NYC addresses with a street name dictionary built from NYC Department of City Planning's (DCP) PAD (Property Address Directory) and SND (Street Name Dictionary) files.
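To make the idea of dictionary-based spell checking concrete, here is a minimal sketch in base R. It is not the package's implementation: the three-entry `street_dict` and the helper `closest_street` are hypothetical, and edit distance via `utils::adist` is simply one plausible matching technique, swapped in for illustration.

```r
# Hypothetical mini dictionary; the real one is built from DCP's PAD and SND files.
street_dict <- c("BROADWAY", "AMSTERDAM AVENUE", "LAFAYETTE AVENUE")

# Return the dictionary entry with the smallest edit distance to `street`.
# Assumption: plain Levenshtein distance; the package may match differently.
closest_street <- function(street, dict) {
  d <- utils::adist(street, dict)  # 1 x length(dict) matrix of edit distances
  dict[which.min(d)]
}

closest_street("BROADWY", street_dict)
```

A real dictionary would also need to be restricted by borough or zip code, since the same street name can occur in more than one part of the city.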

Usage

splchk_addr(in_df, new_addr_col_name, addr_col_name,
    third_col_name, third_col_type)

Arguments

in_df

a data frame containing NYC addresses. Required.

new_addr_col_name

the name of the output addresses column, as a string. Required.

addr_col_name

the name of the input addresses column as string. Required.

third_col_name

the name of either the borough code or zip code column as string. Required.

third_col_type

either "boro_code" or "zip_code", as a string, indicating which kind of value the third_col_name column holds. Required.
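The arguments above also allow a borough code column in place of a zip code. The sketch below shows what such a call might look like. The column name `BORO_CODE` is made up for this example, and storing the codes as character values is an assumption; the page does not say which type the function expects. The codes follow the standard NYC numbering (1 = Manhattan, 2 = Bronx, 3 = Brooklyn, 4 = Queens, 5 = Staten Island).

```r
# Same misspelled addresses as in the Examples section.
ADDR <- c("1212 AMESTERDAM AVEN", "253 BROADWY",
          "250 BREDFORD PORK BLVD W", "30 LAFAYET AVE")

# Borough codes for Manhattan, Manhattan, Bronx, Brooklyn.
# Assumption: character codes are accepted; adjust if the function needs numeric.
BORO_CODE <- c("1", "1", "2", "3")
df_boro <- data.frame(ADDR, BORO_CODE)

# Only attempt the call when the package is installed.
if (requireNamespace("rNYCclean", quietly = TRUE)) {
  df_boro_chk <- rNYCclean::splchk_addr(df_boro, "ADDR.splchk", "ADDR",
                                        "BORO_CODE", "boro_code")
  head(df_boro_chk)
}
```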

Value

The input data frame with one additional column, named by new_addr_col_name, containing the spell-checked addresses.
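Because the result keeps the original column next to the corrected one, a quick way to review the output is to flag the rows where the two differ. The helper `flag_changed` below is a hypothetical convenience function, not part of rNYCclean, and the `demo` data frame contains hand-typed stand-in values rather than actual output of `splchk_addr`.

```r
# Add a logical column marking rows whose address text differs between
# the original and the spell-checked column (NA if either value is missing).
flag_changed <- function(df, orig_col, new_col) {
  orig <- as.character(df[[orig_col]])
  new <- as.character(df[[new_col]])
  df$changed <- orig != new
  df
}

# Stand-in data for illustration only; not real function output.
demo <- data.frame(
  ADDR = c("253 BROADWY", "30 LAFAYETTE AVE"),
  ADDR.splchk = c("253 BROADWAY", "30 LAFAYETTE AVE"),
  stringsAsFactors = FALSE
)

flag_changed(demo, "ADDR", "ADDR.splchk")
```

How the function reports addresses it cannot match (unchanged text, NA, or something else) is not stated here, so it is worth checking the flagged and unflagged rows separately.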

Examples

# load the package (assumes rNYCclean is installed)
library(rNYCclean)

# create a data frame of addresses
ADDR <- c("1212 AMESTERDAM AVEN", "253 BROADWY",
    "250 BREDFORD PORK BLVD W", "30 LAFAYET AVE")
CITY <- c("NEW YORK", "NEW YORK", "BRONX", "BROOKLYN")
STATE <- rep("NY", length(ADDR))
ZIP_CODE <- c("10027", "10007", "10468", "11217")
u_id <- seq_along(ADDR)
df <- data.frame(u_id, ADDR, CITY, STATE, ZIP_CODE)

# get the version of DCP PAD used to build the package data
rNYCclean::pad_version

# spell check the address column using zip code
df1 <- splchk_addr(df, "ADDR.splchk", "ADDR", "ZIP_CODE", "zip_code")

# preview records
head(df1)

gmculp/rNYCclean documentation built on July 14, 2020, 5:07 a.m.