pad_addr: NYC DCP PAD address match on a data frame of NYC addresses.


View source: R/pad_addr.R

Description

The pad_addr function performs a substring match between a data frame of NYC addresses and the NYC Department of City Planning's (DCP) Property Address Directory (PAD). It returns the matching PAD address for each row where one is available.
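To make the idea of a substring match within a borough or ZIP code concrete, here is a minimal base R sketch. It is not the package's implementation: `toy_pad` and `toy_match` are hypothetical names, the three-row lookup table stands in for the real PAD data, and the rule "return a match only when exactly one candidate contains the input" is an assumption made for illustration.

```r
# Toy stand-in for PAD (hypothetical; the real directory ships with the package).
toy_pad <- data.frame(
  zip  = c("10013", "10013", "10007"),
  addr = c("80 CENTRE STREET", "125 WORTH STREET", "253 BROADWAY"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look for `addr` as a literal substring of the
# directory addresses that share the same ZIP code.
toy_match <- function(addr, zip, pad) {
  # Restrict the search area to one ZIP code.
  candidates <- pad$addr[pad$zip == zip]
  # fixed = TRUE treats `addr` as plain text, not a regular expression.
  hits <- candidates[grepl(addr, candidates, fixed = TRUE)]
  # Assumption: only an unambiguous single hit counts as a match.
  if (length(hits) == 1) hits else NA_character_
}

toy_match("80 CENTRE", "10013", toy_pad)   # found within ZIP 10013
toy_match("253 BROADW", "10013", toy_pad)  # wrong ZIP, so no candidate matches
```

Restricting candidates to a single ZIP code keeps the search set small, which is consistent with the timing notes in the examples: a borough-wide search has many more candidates to scan than a ZIP-level one.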

Usage

pad_addr(in_df, new_addr_col_name, addr_col_name, third_col_name, 
    third_col_type, return_type = "all")

Arguments

in_df

a data frame containing NYC addresses. Required.

new_addr_col_name

the name of the output addresses column as string. Required.

addr_col_name

the name of the input addresses column as string. Required.

third_col_name

the name of either the borough code or zip code column as string. Required.

third_col_type

either "boro_code" or "zip_code" as string. Required.

return_type

option controlling whether addresses that failed to match are excluded from the output, as string. Optional; defaults to "all".

Value

A data frame containing the input data frame plus the PAD address column.
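As a rough illustration of working with a result of this shape, the sketch below splits rows by whether the new column is filled in. The data frame `out` is a mock with made-up values, not real package output, and the use of `NA` to mark unmatched rows is an assumption that should be checked against the actual return value.

```r
# Mock of a returned data frame; the values are invented for illustration.
out <- data.frame(
  u_id     = 1:3,
  ADDR     = c("80 CENTRE", "125 WORTH S", "NO SUCH PLACE"),
  ADDR.pad = c("80 CENTRE STREET", "125 WORTH STREET", NA),
  stringsAsFactors = FALSE
)

# Assumption: an unmatched address leaves NA in the new address column.
matched   <- out[!is.na(out$ADDR.pad), ]
unmatched <- out[is.na(out$ADDR.pad), ]

nrow(matched)    # rows with a PAD address
nrow(unmatched)  # rows without one
```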

Examples

# load the package (assumed to be installed)
library(rNYCclean)

# create a data frame of addresses
ADDR <- c("80 CENTRE","125 WORTH S","42 09 28 S","253 BROADW",
    "620 ATLANT","125 WOR","1 FRANKLIN","1 FRANKLIN",
    "1 1 1 AVE","1 1 1 AVE")
BORO_CODE <- c(1,1,4,1,3,1,3,3,1,1)
ZIP_CODE <- c('10013','10013','11101','10007','11217','10013',
    '11222','11249','10003','10014')
u_id <- seq_along(ADDR)
df <- data.frame(u_id, ADDR, BORO_CODE, ZIP_CODE)

# get version of DCP PAD used to build package data
rNYCclean::pad_version

# get PAD address using borough code
# NOTE: slow due to expansive search area (entire borough)
system.time({
    df1 <- pad_addr(df, "ADDR.pad", "ADDR", "BORO_CODE", "boro_code")
})

# preview records
head(df1)

# get PAD address using ZIP code
# NOTE: much faster due to localized search area (single ZIP code)
system.time({
    df2 <- pad_addr(df, "ADDR.pad", "ADDR", "ZIP_CODE", "zip_code")
})

# preview records
head(df2)

gmculp/rNYCclean documentation built on July 14, 2020, 5:07 a.m.