The regex_addr function cleans a data frame of NYC addresses by string replacement against a look-up dataset of locations. The look-up dataset was constructed from the NYC Department of City Planning's (DCP) Property Address Directory (PAD) and Street Name Dictionary (SND). The function also attempts to reconcile addresses that contain post office box information or indicators of a missing address (e.g., "UNKNOWN", "HOMELESS").
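The general idea of look-up-driven string replacement can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the `lookup` table, its patterns, and the helper `clean_addr_sketch` are hypothetical, and setting post office box or missing-address entries to `NA` is an assumed way of handling them.

```r
# Hypothetical look-up table mapping truncated street names to full names.
# The real package data is derived from DCP's PAD and SND; these rows are
# made up for illustration.
lookup <- data.frame(
  pattern     = c("\\bCENTRE S$", "\\bWORTH S$", "\\bLAFAYETTE A$"),
  replacement = c("CENTRE STREET", "WORTH STREET", "LAFAYETTE AVENUE"),
  stringsAsFactors = FALSE
)

# Hypothetical helper; not part of rNYCclean.
clean_addr_sketch <- function(addr, lookup) {
  # Normalize case and surrounding whitespace before matching.
  out <- toupper(trimws(addr))

  # Assumption: PO boxes and missing-address indicators become NA.
  bad <- grepl("^(P\\.?\\s*O\\.?\\s*BOX|UNKNOWN|HOMELESS)\\b", out, perl = TRUE)
  out[bad] <- NA_character_

  # Apply each look-up pattern in turn; NA entries pass through unchanged.
  for (i in seq_len(nrow(lookup))) {
    out <- sub(lookup$pattern[i], lookup$replacement[i], out, perl = TRUE)
  }
  out
}

res <- clean_addr_sketch(
  c("80 Centre S", "PO BOX 12", "unknown", "125 WORTH S"),
  lookup
)
print(res)
```

A table-driven loop keeps the replacement rules as data rather than code, so the rule set can grow without touching the function body.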
regex_addr(in_df, new_addr_col_name, addr1_col_name,
           addr2_col_name = NULL)
in_df
    A data frame containing NYC addresses. Required.
new_addr_col_name
    The name of the output address column, as a string. Required.
addr1_col_name
    The name of the input column holding address line one, as a string. Required.
addr2_col_name
    The name of the input column holding address line two, as a string. Optional.
A data frame containing the input data frame plus the cleaned address column.
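The shape of the return value can be illustrated with a stand-in that mirrors the argument list but performs no cleaning. The helper `append_addr_col` is hypothetical, and joining line one and line two with a single space is an assumption made only for this sketch; the source does not say how regex_addr combines the two columns.

```r
# Hypothetical stand-in for the interface only; no address cleaning is done.
append_addr_col <- function(in_df, new_addr_col_name, addr1_col_name,
                            addr2_col_name = NULL) {
  addr <- as.character(in_df[[addr1_col_name]])
  if (!is.null(addr2_col_name)) {
    # Assumption: line two is appended to line one, separated by a space.
    addr <- paste(addr, as.character(in_df[[addr2_col_name]]))
  }
  # Column names are passed as strings, so [[<- is used instead of $<-.
  in_df[[new_addr_col_name]] <- trimws(addr)
  in_df
}

toy <- data.frame(
  ADDR1 = c("80 CENTRE S", "125"),
  ADDR2 = c("", "WORTH STREET"),
  stringsAsFactors = FALSE
)
out <- append_addr_col(toy, "regex.ADDR", "ADDR1", "ADDR2")
print(names(out))
```

The input columns are left untouched and the new column is appended last, which matches the description of the value as the input data frame plus one cleaned address column.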
library(rNYCclean)

# create a data frame of addresses
ADDR1 <- c("80 CENTRE S", "125 WORTH S", "42-09 28 ST",
           "250 BEDFORD PARK BLV", "30 LAFAYETTE A", "125", "1545 ATLANTIC")
ADDR2 <- c("", "UNIT 329", "1st FLR", "SUITE 212B", "ROOM 3", "WORTH STREET", "")
BORO_CODE <- c(rep(1, length(ADDR1) - 1), 3)
u_id <- seq_along(ADDR1)
df <- data.frame(u_id, ADDR1, ADDR2, BORO_CODE)

# get the version of DCP PAD used to build the package data
rNYCclean::pad_version

# one address input column
df1 <- regex_addr(in_df = df, new_addr_col_name = "regex.ADDR",
                  addr1_col_name = "ADDR1")

# preview records
head(df1)

# two address input columns
df2 <- regex_addr(in_df = df, new_addr_col_name = "regex.ADDR",
                  addr1_col_name = "ADDR1", addr2_col_name = "ADDR2")

# preview records
head(df2)