knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(nzcensr)
library(sf)
library(knitr)
suppressPackageStartupMessages(library(tidyverse))

Introduction

The nzcensr package is a data package which makes it easy to import the New Zealand Census data as either normal or spatial dataframes without having to download the data for each project and perform different joins. The package contains the following data sets:

All of these data sets are provided at the meshblock, area unit, local board, territorial authority ("tas") and regional spatial level. They all follow the same regular naming convention of data set name with the spatial area following e.g.

dwelling_area_units

or

individual_part_3b

All of the data sets are lazily loaded which means that they are only brought into memory when called; not when the package is just loaded.

Exploring the census -- the data available.

A key idea behind this package is the idea of making it easier to access and know what data is available and then to transform it. The function nz_census_tables makes it easy to explore the data. Simply calling it without any arguments returns a table with all of the data frames available, and a small note explaining what they are:

kable(nz_census_tables())

This function also accepts the input of a table which returns all of the unique topics in the data set, and whether to include the variables or not.

kable(nz_census_tables("dwelling_area_units"))

or (table shows just the first ten for presentation)

nz_census_tables("dwelling_area_units", variables = TRUE) %>% 
  slice(1:10) %>% 
  kable()

Religious affiliation in Auckland, New Zealand

Let's have a look at religious association in Auckland, New Zealand. Firstly, let's select the topic using the keyword religion, and then transform into the long format that is 'cleaned' and replace the confidential values with 1. By cleaned, I mean that the census columns are split up into year, topic and variable columns.

religious_association_nz_region <- 
  select_by_topic(individual_part_2_regions, "religious") %>% 
  filter_by_census_area("regions", "regions", "Auckland") %>% 
  transform_census(gis = FALSE, long = TRUE, 
                   clean = TRUE)

The above workflow essentially encapsulates the workflow brought to the user by the nzcensr package which tries to make it as easy to access the data, select the topics wanted, filter by regions and transform in a 'tidy' table. The output of this is the following table (top five rows only):

religious_association_nz_region %>% 
  head(5) %>% 
  kable()

Cool, that is all tidy.

Let's make some plots!

For interests sake, let us create a variable of percentage of people as opposed to absolute numbers. To do this, we create two tables of the religions and the totals and then join them back to each other based on the region. Then we can simply divide the religion number by the total and create a percentage.

religious_association_nz_region_wanted_variables <- 
  filter(religious_association_nz_region,
         !(variable %in% c("Not Elsewhere Included(3)", 
                           "Total people",
                           "Total people stated",
                           "Object to Answering",
                           "Other Religions",
                           "Spiritualism and New Age Religions"))) %>% 
  rename(number_people = value)

religious_association_nz_region_total_stated <- religious_association_nz_region %>% 
  filter(variable == "Total people stated") %>% 
  select(Area_Code_and_Description, year, total_number_people = value)

religious_association_nz_region_percent <- 
  left_join(religious_association_nz_region_wanted_variables,
            religious_association_nz_region_total_stated) %>% 
  mutate(percentage_people = round(number_people / total_number_people, 4) * 100,
         Description = str_replace(Description, " Region", "")) %>% 
  select(-topic)

Now to create a line plot to investigate the trends the three time periods.

ggplot(religious_association_nz_region_percent) +
  geom_line(aes(x = year, y = percentage_people, colour = variable, group = variable)) +
  scale_colour_discrete(name = "Religion") +
  ggtitle("Religion as a percentage of population in Auckland from\n2001 to 2006 to 2013") +
  ylab("Percent") +
  xlab("Year") + 
  theme_minimal() +
  theme(legend.position = "top")

Interesting, in these plots you can clearly see the 'decline' of Christianity occuring and the increasing 'lack of religion'. Clearly, these two categories dwarf the others, and it would be interesting to drop these out to see more of the minor religions.

religious_association_nz_region_percent_minors <- 
  filter(religious_association_nz_region_percent,
         !(variable %in% c("Christian", "No Religion")))

ggplot(religious_association_nz_region_percent_minors) +
  geom_line(aes(x = year, y = percentage_people, colour = variable, group = variable)) +
  facet_wrap(~Description, scales = "free_y") +
  scale_colour_discrete(name = "Religion") +
  ggtitle("Religion as a percentage of population in Auckland from\n2001 to 2006 to 2013") +
  ylab("Percent") +
  xlab("Year") + 
  theme_minimal() +
  theme(legend.position = "top")

Some interesting things going on here! Anyway, I hope I have demonstrated how easy it is to analyse NZ census data with the nzcensr package. It's brand new, and my first package, so please let me know if you find any issues or bugs.



phildonovan/nzcensr documentation built on May 12, 2019, 7:20 p.m.