An R package to assist with downloading, importing, and manipulating the (Australian) Geocoded National Address File (G-NAF).
Addresses are a cultural artefact, created from language rather than rules and legislation *
G-NAF is Australia's most trusted authoritative (g)eocoded - (n)ational (a)ddress (f)ile.
More from: https://psma.com.au/product/gnaf/
PSMA's G-NAF dataset contains all physical addresses in Australia. It's the most trusted source of geocoded addresses for Australian businesses and governments.
Before use, users should read the G-NAF End User Licence Agreement.
G-NAF is released on a quarterly basis and is available from here.
Please note, the package is not on CRAN.
Installing from GitHub:
# Install `remotes` if it isn't already installed.
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
# Install the `gnaf.r` package.
remotes::install_github("KyleHaynes/gnaf.r")
The following three steps can be completed manually or with a call to get_gnaf() (see the example below).
# Load the package.
library("gnaf.r")
# Steps 1-3 in the `Prerequisite steps` section above can be completed from within R.
# Note: If G-NAF is already downloaded, you can skip this function call.
# Download and unpack G-NAF to the "c:/temp/" folder.
get_gnaf(dest_folder = "c:/temp")
# Verbose output example:
# ------------------
# The download is approximately 1.5Gb, depending on your internet speed, the
# following may take a while.
# The G-NAF zip file is currently being downloaded to: C:\temp\feb20_gnaf_pipeseparatedvalue.zip
# ------------------
# G-NAF has been download and is now uncompressing.
# ------------------
# You can now call the `setup()` to begin the initial setup of G-NAF. Be sure to toggle the
# `states` argument to only import relevant jurisdictions.
# Example setup call: setup(dir = "C:\\temp\\G-NAF\\G-NAF NOVEMBER 2020", states = "qld")
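The `dir` argument of `setup()` expects the unpacked release folder (./G-NAF <MONTH> <YEAR>). A minimal sketch of locating that folder with base R follows. `find_gnaf_root()` is a hypothetical helper, not part of `gnaf.r`, and it assumes the folder name follows the `G-NAF <MONTH> <YEAR>` pattern shown in the verbose output.

```r
# Hypothetical helper (not part of gnaf.r): find unpacked G-NAF release folders
# under `dest_folder`, assuming they are named "G-NAF <MONTH> <YEAR>".
find_gnaf_root <- function(dest_folder) {
    dirs <- list.dirs(dest_folder, full.names = TRUE, recursive = TRUE)
    dirs[grepl("^G-NAF [A-Z]+ [0-9]{4}$", basename(dirs))]
}

# Demonstration on a throwaway folder that mimics the unpacked layout.
demo <- file.path(tempdir(), "gnaf_demo")
dir.create(file.path(demo, "G-NAF", "G-NAF FEBRUARY 2020"),
           recursive = TRUE, showWarnings = FALSE)
root <- find_gnaf_root(demo)
print(root)
```

If exactly one folder is found, it could be passed on as `setup(dir = root, states = "qld")`.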
# Set up the session before importing G-NAF. This step has two primary purposes.
# 1. Define the location of the G-NAF (month year) root path (./G-NAF <MONTH> <YEAR>).
# 2. Define which jurisdictions to import (case insensitive regex on State abbreviations).
setup(dir = "C:/temp/G-NAF/G-NAF NOVEMBER 2020", states = "qld")
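The `states` argument is described as a case-insensitive regex on State abbreviations. The sketch below illustrates how such a pattern could select jurisdictions using base R. The vector of abbreviations and the `match_states()` helper are assumptions for illustration only; how `setup()` applies the pattern internally is not shown here.

```r
# Assumed state/territory abbreviations ("OT" = Other Territories).
abbreviations <- c("ACT", "NSW", "NT", "OT", "QLD", "SA", "TAS", "VIC", "WA")

# Hypothetical helper: keep the abbreviations matching a case-insensitive regex.
match_states <- function(pattern) {
    abbreviations[grepl(pattern, abbreviations, ignore.case = TRUE)]
}

print(match_states("qld"))        # a single jurisdiction
print(match_states("qld|nsw"))    # alternation selects several
print(match_states("^(nt|sa)$"))  # anchored alternation
print(match_states("t"))          # a loose pattern matches more than intended
print(match_states(""))           # an empty pattern matches every abbreviation
```

Under this reading, a short unanchored pattern such as `"t"` would pull in several jurisdictions, so spelling out the full abbreviation is the safer habit.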
# Import G-NAF for Queensland.
gnaf <- build_gnaf()
# Import again, defining `simple = TRUE` to drop variables that are not directly
# address related (i.e., reduce the output to just address information).
gnaf_simple <- build_gnaf(simple = TRUE)
# Inspect the structure of each object.
str(gnaf)
# Classes ‘data.table’ and 'data.frame': 590395 obs. of 48 variables:
# $ ADDRESS_DETAIL_PID : chr "GAACT714845933" "GAACT714845934" "GAACT714845935" "GAACT714845936" ...
# $ BUILDING_NAME : chr "" "" "" "" ...
# $ LOT_NUMBER : int NA NA NA NA NA NA NA NA NA NA ...
# $ FLAT_NUMBER_PREFIX : chr "" "" "" "" ...
# $ FLAT_TYPE : chr NA NA NA NA ...
# $ FLAT_NUMBER : int NA NA NA NA NA NA NA NA NA NA ...
# $ FLAT_NUMBER_SUFFIX : chr "" "" "" "" ...
# $ LEVEL_TYPE : chr NA NA NA NA ...
# $ LEVEL_NUMBER_PREFIX : chr "" "" "" "" ...
# $ LEVEL_NUMBER : int NA NA NA NA NA NA NA NA NA NA ...
# $ NUMBER_FIRST_PREFIX : chr NA NA NA NA ...
# $ NUMBER_FIRST : int 6 3 26 17 5 24 7 5 22 9 ...
# $ NUMBER_FIRST_SUFFIX : chr "" "" "" "" ...
# $ NUMBER_LAST : int NA NA NA NA NA NA NA NA NA NA ...
# $ NUMBER_LAST_SUFFIX : chr NA NA NA NA ...
# $ STREET_NAME : chr "PACKHAM" "BUNKER" "JAUNCEY" "GEEVES" ...
# $ STREET_TYPE : chr "PLACE" "PLACE" "COURT" "COURT" ...
# $ STREET_SUFFIX : chr NA NA NA NA ...
# $ LOCALITY_NAME : chr "CHARNWOOD" "CHARNWOOD" "CHARNWOOD" "CHARNWOOD" ...
# $ STATE_NAME : chr "AUSTRALIAN CAPITAL TERRITORY" "AUSTRALIAN CAPITAL TERRITORY" "AUSTRALIAN CAPITAL TERRITORY" "AUSTRALIAN CAPITAL TERRITORY" ...
# $ POSTCODE : int 2615 2615 2615 2615 2902 2615 2902 2615 2615 2902 ...
# $ LONGITUDE : num 149 149 149 149 149 ...
# $ LATITUDE : num -35.2 -35.2 -35.2 -35.2 -35.4 ...
# $ MB_2011_CODE : chr "80006300000" "80006310000" "80006380000" "80006280000" ...
# $ MB_2016_CODE : chr "80006300000" "80006310000" "80006380000" "80006280000" ...
# $ STREET_LOCALITY_PID : chr "ACT3857" "ACT3807" "ACT3833" "ACT3826" ...
# $ LOCALITY_PID : chr "ACT570" "ACT570" "ACT570" "ACT570" ...
# $ ALIAS_PRINCIPAL : chr "P" "P" "P" "P" ...
# $ LEGAL_PARCEL_ID : chr "BELC/CHAR/15/16/" "BELC/CHAR/17/2/" "BELC/CHAR/83/3/" "BELC/CHAR/29/9/" ...
# $ CONFIDENCE : int 2 2 2 2 2 2 2 2 2 2 ...
# $ ADDRESS_SITE_PID : int 710446419 710446420 710446421 710446422 710446424 710446425 710446427 710446428 710446429 710446430 ...
# $ LEVEL_GEOCODED_CODE : int 7 7 7 7 7 7 7 7 7 7 ...
# $ GNAF_PROPERTY_PID : chr "1026280" "1026283" "351430" "343650" ...
# $ PRIMARY_SECONDARY : chr "" "" "" "" ...
# $ PRIMARY_POSTCODE : int NA NA NA NA NA NA NA NA NA NA ...
# $ GNAF_LOCALITY_PID : int 500219587 500219587 500219587 500219587 500219628 500219587 500219628 500219587 500219587 500219628 ...
# $ GNAF_RELIABILITY_CODE : int 5 5 5 5 5 5 5 5 5 5 ...
# $ GNAF_STREET_PID : int 502493439 502490407 502492206 502491587 502492926 502492206 502492926 502490407 502492206 502492926 ...
# $ GNAF_STREET_CONFIDENCE : int 2 2 2 -1 2 2 2 2 2 2 ...
# $ GNAF_RELIABILITY_CODE_street_locality: int 4 4 4 4 4 4 4 4 4 4 ...
# $ ADDRESS_DEFAULT_GEOCODE_PID :integer64 3006501997 3006502410 3006610521 3006506877 3006499300 3006448778 3006616267 3006485909 ...
# $ GEOCODE_TYPE_CODE : chr "FCS" "FCS" "FCS" "FCS" ...
# $ ADDRESS_MESH_BLOCK_2011_PID : chr "ACT43994755" "ACT43994756" "ACT43994757" "ACT43994758" ...
# $ MB_MATCH_CODE : int 1 1 1 1 1 1 1 1 1 1 ...
# $ ADDRESS_MESH_BLOCK_2016_PID : chr "ACT1547490736" "ACT1547490737" "ACT1547490738" "ACT1547490739" ...
# $ MB_MATCH_CODE_locality : int 1 1 1 1 1 1 1 1 1 1 ...
# $ LOCALITY_CLASS : chr "GAZETTED LOCALITY" "GAZETTED LOCALITY" "GAZETTED LOCALITY" "GAZETTED LOCALITY" ...
# $ STREET_CLASS : chr "CONFIRMED" "CONFIRMED" "CONFIRMED" "CONFIRMED" ...
# - attr(*, ".internal.selfref")=<externalptr>
# - attr(*, "sorted")= chr "ADDRESS_DETAIL_PID"
str(gnaf_simple)
# Classes ‘data.table’ and 'data.frame': 590395 obs. of 25 variables:
# $ ADDRESS_DETAIL_PID : chr "GAACT714845933" "GAACT714845934" "GAACT714845935" "GAACT714845936" ...
# $ BUILDING_NAME : chr "" "" "" "" ...
# $ LOT_NUMBER : int NA NA NA NA NA NA NA NA NA NA ...
# $ FLAT_NUMBER_PREFIX : chr "" "" "" "" ...
# $ FLAT_TYPE : chr NA NA NA NA ...
# $ FLAT_NUMBER : int NA NA NA NA NA NA NA NA NA NA ...
# $ FLAT_NUMBER_SUFFIX : chr "" "" "" "" ...
# $ LEVEL_TYPE : chr NA NA NA NA ...
# $ LEVEL_NUMBER_PREFIX: chr "" "" "" "" ...
# $ LEVEL_NUMBER : int NA NA NA NA NA NA NA NA NA NA ...
# $ NUMBER_FIRST_PREFIX: chr NA NA NA NA ...
# $ NUMBER_FIRST : int 6 3 26 17 5 24 7 5 22 9 ...
# $ NUMBER_FIRST_SUFFIX: chr "" "" "" "" ...
# $ NUMBER_LAST : int NA NA NA NA NA NA NA NA NA NA ...
# $ NUMBER_LAST_SUFFIX : chr NA NA NA NA ...
# $ STREET_NAME : chr "PACKHAM" "BUNKER" "JAUNCEY" "GEEVES" ...
# $ STREET_TYPE : chr "PLACE" "PLACE" "COURT" "COURT" ...
# $ STREET_SUFFIX : chr NA NA NA NA ...
# $ LOCALITY_NAME : chr "CHARNWOOD" "CHARNWOOD" "CHARNWOOD" "CHARNWOOD" ...
# $ STATE_NAME : chr "AUSTRALIAN CAPITAL TERRITORY" "AUSTRALIAN CAPITAL TERRITORY" "AUSTRALIAN CAPITAL TERRITORY" "AUSTRALIAN CAPITAL TERRITORY" ...
# $ POSTCODE : int 2615 2615 2615 2615 2902 2615 2902 2615 2615 2902 ...
# $ LONGITUDE : num 149 149 149 149 149 ...
# $ LATITUDE : num -35.2 -35.2 -35.2 -35.2 -35.4 ...
# $ MB_2011_CODE : chr "80006300000" "80006310000" "80006380000" "80006280000" ...
# $ MB_2016_CODE : chr "80006300000" "80006310000" "80006380000" "80006280000" ...
# - attr(*, ".internal.selfref")=<externalptr>
# - attr(*, "sorted")= chr "ADDRESS_DETAIL_PID"
# Size of each object (gigabytes).
format(object.size(gnaf), units = "Gb")
# [1] "0.3 Gb"
format(object.size(gnaf_simple), units = "Gb")
# [1] "0.1 Gb"
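With the simple output in hand, a common manipulation is to compose a single-line address from its components. The sketch below uses a small base R data frame holding the first four rows from the `str()` output rather than the real object (which is a data.table). `format_address()` is a hypothetical helper, not a `gnaf.r` function, and it only handles the street number, street name, street type, locality and postcode.

```r
# Toy rows copied from the str() output; the real object is a data.table.
toy <- data.frame(
    NUMBER_FIRST  = c(6L, 3L, 26L, 17L),
    STREET_NAME   = c("PACKHAM", "BUNKER", "JAUNCEY", "GEEVES"),
    STREET_TYPE   = c("PLACE", "PLACE", "COURT", "COURT"),
    LOCALITY_NAME = rep("CHARNWOOD", 4),
    POSTCODE      = rep(2615L, 4),
    stringsAsFactors = FALSE
)

# Hypothetical helper: paste the components together, skipping missing or
# empty parts. trimws() removes the padding added when numbers become text.
format_address <- function(d) {
    cols <- c("NUMBER_FIRST", "STREET_NAME", "STREET_TYPE", "LOCALITY_NAME", "POSTCODE")
    unname(apply(d[, cols], 1, function(row) {
        row <- trimws(row)
        paste(row[!is.na(row) & nzchar(row)], collapse = " ")
    }))
}

toy$ADDRESS <- format_address(toy)
print(toy$ADDRESS)
```

Flat, level and lot details are left out to keep the sketch short; a fuller version would need to decide how to order and punctuate them.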
# Attempt to build the entire country (including Other Territories: "OT").
setup(dir = "C:/temp/G-NAF/G-NAF FEBRUARY 2020", states = "")
# Import all jurisdictions.
gnaf <- build_gnaf()
# Dimensions of output.
dim(gnaf)
# [1] 15271641 52
# Object size.
format(object.size(gnaf), units = "Gb")
# [1] "8.9 Gb"
# Frequency table by State.
gnaf[, .N, STATE_NAME]
#                      STATE_NAME       N
# 1: AUSTRALIAN CAPITAL TERRITORY  242999
# 2:              NEW SOUTH WALES 4749707
# 3:           NORTHERN TERRITORY  113221
# 4:            OTHER TERRITORIES    4362
# 5:                   QUEENSLAND 3219900
# 6:              SOUTH AUSTRALIA 1163320
# 7:                     TASMANIA  347396
# 8:                     VICTORIA 3886769
# 9:            WESTERN AUSTRALIA 1543967
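The counts above can be turned into shares of the national total. This sketch re-enters the printed counts in a base R vector so that it runs without G-NAF loaded; with the real data.table, the same arithmetic would apply to the `N` column.

```r
# Address counts by state, copied from the frequency table.
counts <- c(
    "AUSTRALIAN CAPITAL TERRITORY" = 242999,
    "NEW SOUTH WALES"              = 4749707,
    "NORTHERN TERRITORY"           = 113221,
    "OTHER TERRITORIES"            = 4362,
    "QUEENSLAND"                   = 3219900,
    "SOUTH AUSTRALIA"              = 1163320,
    "TASMANIA"                     = 347396,
    "VICTORIA"                     = 3886769,
    "WESTERN AUSTRALIA"            = 1543967
)

# The total should agree with the row count reported by dim(gnaf).
total <- sum(counts)
print(total)

# Percentage share of each jurisdiction, largest first.
share <- round(100 * counts / total, 1)
print(sort(share, decreasing = TRUE))
```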
Issues / Bugs / Suggestions: https://github.com/KyleHaynes/gnaf.r/issues
G-NAF ©PSMA Australia Limited licensed by the Commonwealth of Australia under the Open Geo-coded National Address File (G-NAF) End User Licence Agreement.
Incorporates or developed using G-NAF ©PSMA Australia Limited licensed by the Commonwealth of Australia under the Open Geo-coded National Address File (G-NAF) End User Licence Agreement.
Special thanks to the Turnbull Government for the innovative and invaluable step in making this data open to all Australians.