knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

The z11 package provides an R-interface to the geospatial raster data of the German Census 2011. z11 can be used to list all available attributes from the Census (1 km² attributes and 1-hectare attribute) and import them either as simple features spatial points object or as raster object with the resolution of the corresponding attribute. I felt that such a package may be of interest for other userRs since the Census data 'only' exists as CSV data dump on the internet. However, to use them in a Geographic Information System (GIS), the data must be prepared, which is time-consuming and computationally demanding. As such, the package aims to fill the gap of making the data easily accessible by providing a straightforward method.

Generally, the German Census 2011 data are available under a Data licence Germany – attribution – version 2.0 and can be manipulated and openly shared. Yet, as part of this package, use them at your own risk and do not take the results of the functions for granted.

Installing and Loading the Package

remotes::install_github("StefanJuenger/z11")

The functions in the package rely on data prepared in another repository. If you want to work locally, please download them first and follow the instructions of the manual.

After installing, the package can be loaded using R's standard method:

library(z11)

Working with 1 km² German Census 2011 Data

1 km² data were the first data published in 2015. While they comprise fewer attributes than the 1 hectare ones (see below), the data are in a way more easy to use format (see also the z11 package's vignette about the initial data preparation [still in the works]). You can browse all available attributes by using the following function.

z11::z11_list_1km_attributes()

Thus, in sum we can use r length(z11::z11_list_1km_attributes()) different census attributes of the size of 1 km². Please note that some of the attributes are also duplicated, denoted by the _cat-suffix in the attribute name. These attributes may be less precise, but they comprise fewer missing values than the continuous attributes. In any case, for more details about the actual attributes, refer to the official documentation of the German Census 2011 at https://www.zensus2011.de.

Now, if we want to load one of the attributes as a raster layer, we can use the z11::z11_get_1km_attribute() function. For example, for importing information on immigrant rates on a 1 km² grid level, we could use the following command:

immigrants_1km <-
  z11::z11_get_1km_attribute(Auslaender_A)

Voilà, we got this information as a standard raster layer (class("Raster")) as specified in the raster package:

immigrants_1km

Therefore, we can also use the raster package's standard plotting procedures:

raster::plot(immigrants_1km)

Suppose we do not want to work with raster data. Instead, we aim to use the raster grid cell's centroid coordinates. In that case, we can also load the data as a simple features data frame with point geometries as specified in the sf package by simply using the option as_raster:

immigrants_1km_sf <-
  z11::z11_get_1km_attribute(Auslaender_A, as_raster = FALSE)

Here we go:

immigrants_1km_sf

Working with 1 hectare German Census 2011 Data

In 2018, destatis also published Census data on a 1-hectare grid level. This 100m $\times$ 100m data provides a more fine-grained level of information for Germany's population. Moreover, the data comprise way more attributes than the 1 km² one, including details on buildings and even heater types. Again, we can list all available attributes using a specialized function:

z11::z11_list_100m_attributes()

As we can see, the number of attributes is quite high, in sum r length(z11::z11_list_100m_attributes()). The naming convention is different as the names depict a concatenated string between each attribute name and its category. In contrast to the 1 km² data, the attribute values are the number of persons within each 1-hectare grid cell sharing the attribute's characteristic. Again, please refer to the official German Census 2011 documentation since it details how to handle the data and interpret its values. The data here are just vanilla Census data. However, they were converted from a long data format to a wide one for rasterizing.

Speaking of rasterizing: the 1-hectare data can again be imported by using a specialized function. For example, to import the number of immigrants attribute, we can use this command:

immigrants_100m <-
  z11::z11_get_100m_attribute(STAATSANGE_KURZ_2)

The operation may take a bit longer, as the 1-hectare data are bigger and therefore computationally also a bit more expensive. Fortunately, through their flat data structure, raster data files are usually a bit more comfortable to process. Plotting 1-hectare data is, therefore, also straightforward:

raster::plot(immigrants_100m)

If you wish to work with the data in a vector format, you can also use the as_raster option in the z11::z11_get_100m_attribute() function:

immigrants_100m_sf <-
  z11::z11_get_100m_attribute(STAATSANGE_KURZ_2, as_raster = FALSE)

See:

immigrants_1km_sf

That's it. There's not yet much more to tell at the moment.

Future

The package itself should not provide many more functionalities, rather than listing and importing the German Census 2011 data. However, few things are still a bit rough around the edges and hopefully should be improved in the future:

Speed

In some cases (especially in the 1-hectare data), the functions of the package could be a bit quicker in computing. I did my best to reduce the input data amount, but a simple database format would still have some benefits.

Metadata

What is missing in the package is the documentation of the census. It would indeed be nice to have a more comprehensive overview of all attributes and their descriptions. Also, the variable names are sometimes not helpful. At the moment, I am still undecided if I really want the change the latter since they resemble the names from the original input data. The goal of the package is not necessarily data harmonization of the German Census 2011.

Sources



StefanJuenger/z11 documentation built on July 7, 2022, 2:39 p.m.