knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Under Construction

Using ABS structures

Loading the package will lazily load a number of structures, a full list is available in the reference

# use this to make sure README runs with current build
# rather than installed version
devtools::load_all()
library(dplyr, warn.conflicts = FALSE)
library(ggplot2, warn.conflicts = FALSE)
library(strayr)
library(dplyr)
library(ggplot2)
glimpse(anzsco2009)
glimpse(anzsic2006)
glimpse(anzsic_isic)
glimpse(asced_foe2001)
glimpse(asced_qual2001)

Objects stored in the absmapsdata package can be accessed with the read_absmap function:

library(sf) # loaded to handle sf objects
read_absmap("sa42016")

Converting state names and abbreviations

The clean_state() function makes it easy to wrangle vectors of State names and abbreviations - which might be in different forms and possibly misspelled.

Let's start with a character vector that includes some misspelled state names, some correctly spelled state names, as well as some abbreviations both malformed and correctly formed.

x <- c("western Straya", "w. A ", "new soth wailes", "SA", "tazz", "Victoria",
       "northn territy")

To convert this character vector to a vector of abbreviations for State names, use clean_state():

clean_state(x)

If you want full names for the states rather than abbreviations:

clean_state(x, to = "state_name")

By default, clean_state() uses fuzzy or approximate string matching to match the elements in your character vector to state names/abbreviations. If you only want to permit exact matching, you can disable fuzzy matching. This means you will never get false matches, but you will also fail to match misspelled state names or malformed abbreviations; you'll get an NA if no match can be found.

 clean_state(x, fuzzy_match = FALSE)

If your data is in a data frame, clean_state() works well within a dplyr::mutate() call:

 x_df <- data.frame(state = x, stringsAsFactors = FALSE)

library(dplyr)
 x_df %>% 
   mutate(state_abbr = clean_state(state))

The function clean_state can also return an 'unofficial' state/territory colour for use in charts.

clean_state("Queensland", to = "colour")

The palette palette_state_name_2016 can be used in ggplot2 for the unofficial colours of states.

read_absmap("state2016") %>% 
    ggplot() + 
    geom_sf(aes(fill = state_name_2016), colour = NA) +
    scale_fill_manual(values = palette_state_name_2016) +
    theme_void()

Australian public holidays

This package includes the auholidays dataset from the Australian Public Holidays Dates Machine Readable Dataset as well as a helper function is_holiday:

str(auholidays)


is_holiday('2020-01-01')
is_holiday('2019-05-27', jurisdictions = c('ACT', 'TAS'))

h_df <- data.frame(dates = c('2020-01-01', '2020-01-10'))

h_df %>%
  mutate(IsHoliday = is_holiday(dates))

Parsing income ranges

The parse_income_range function provides some tools for extracting numbers from income ranges commonly used in Australian data. For example:

parse_income_range("$1-$199 ($1-$10,399)", limit = "lower")
parse_income_range("$1-$199 ($1-$10,399)", limit = "upper")
parse_income_range("$1-$199 ($1-$10,399)", limit = "mid")

parse_income_range("e. $180,001 or more", limit = "upper")
parse_income_range("e. $180,001 or more", limit = "upper", max_income = 300e3)


parse_income_range("Nil income")
parse_income_range("Negative income")
parse_income_range("Negative income", negative_as_zero = FALSE)


tibble(income_range = c("Negative income",
                        "Nil income",
                        "$1,500-$1,749 ($78,000-$90,999)",
                        "$1,750-$1,999 ($91,000-$103,999)",
                        "$2,000-$2,999 ($104,000-$155,999)",
                        "$3,000 or more ($156,000 or more)")) %>% 
  mutate(lower = parse_income_range(income_range),
         mid   = parse_income_range(income_range, limit = "mid"),
         upper = parse_income_range(income_range, limit = "upper"))


runapp-aus/abscorr documentation built on Oct. 15, 2024, 2:18 p.m.