library(ggplot2)
library(knitr)
library(datakindr)

opts_chunk$set(fig.width = 6, fig.height = 4)
opts_chunk$set(echo = FALSE, warning = FALSE, error = FALSE, message = FALSE)

What Is It For?

Datakindr provides tools, utilities and data for use by the DataKind Dublin team. It includes an interface to CSO.ie's Statbank, plus a ggplot2 theme and colour palettes for consistent visualisations.

The package is in the early stages of development, but the intention is for it to become a kind of "Swiss Army Knife" for DataKind volunteers, making commonly encountered problems (e.g., querying the CSO's Statbank) easy to solve.

What Does It Do?

Currently there are two main tasks that datakindr makes easier:

  1. Querying the CSO's Statbank API.
  2. Making consistent & beautiful visualisations.

CSO and Statbank

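There are two functions available for this task, both shown in the Statbank Demo section: search_statbank_datasets(), which searches for datasets by keyword, and get_cso_dataset(), which retrieves a dataset by its code.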

Visualisations

Two sets of data objects help volunteers create consistent plots and other visualisations: a ggplot2 theme (dk_theme) and colour palettes.

ggplot(data.frame( y = runif(100, min = 0, max = 20) +
                     seq(1, 100, 1),
                   x = seq(1, 100, 1),
                   z = rep_len(c("a", "b", "c", "d"), 100)),
       aes(x, y, colour = z, fill = z)) +
  geom_point() +
  facet_wrap(~z, nrow = 1) +
  geom_smooth(se = TRUE) +
  labs(title = "Some Data (2011)",
       x = "Range", y = "Value") +
  dk_theme
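For readers curious how a reusable theme and palette of this kind can be put together, here is a minimal sketch using only ggplot2. The names sketch_theme and sketch_palette, the colour values and the toy data are all hypothetical; this is not datakindr's actual definition of dk_theme or its palettes, just one plausible way to build similar objects.

```r
library(ggplot2)

# Hypothetical palette: a named character vector of hex colours.
sketch_palette <- c(a = "#1B9E77", b = "#D95F02", c = "#7570B3", d = "#E7298A")

# Hypothetical theme: start from a built-in theme and override a few elements.
sketch_theme <- theme_minimal(base_size = 12) +
  theme(legend.position = "bottom",
        plot.title = element_text(face = "bold"),
        panel.grid.minor = element_blank())

# Made-up data, only there to give the plot something to draw.
toy <- data.frame(x = 1:8, y = (1:8)^2, z = rep(c("a", "b", "c", "d"), 2))

# Apply the palette through a manual colour scale, then add the theme last.
p <- ggplot(toy, aes(x, y, colour = z)) +
  geom_point() +
  scale_colour_manual(values = sketch_palette) +
  sketch_theme
```

Because the theme is an ordinary object, it can be added to any plot with +, which is the same way dk_theme is used throughout this document.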

How Do I Use It?

I'll let the code do the talking here.

Theme Demo

# install.packages("devtools")
# devtools::install_github("DataKind-DUB/datakindr")
# Development version: devtools::install_github("cormac85/datakindr")
library(datakindr)
library(ggplot2)  # provides ggplot(), geom_bar() and labs() used below

data_example <- data.frame( x = c('Primary 1', 'Primary 2',
                                  'Secondary 1', 'Secondary 2'),
                            y = runif(4, 10, 100))

ggplot(data_example, aes(x, y, fill = x)) +
  geom_bar(stat = "identity") +
  labs(title = "Some Data (2011)",
       x = "Range", y = "Value") +
  dk_theme # <- Here's the theme!

Statbank Demo

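library(dplyr)

# Search Statbank for datasets matching a keyword
dataset_names <- search_statbank_datasets("garda")
# Inspect the first two fields of the first match
dataset_names[1, 1]
dataset_names[1, 2]
# Download the first matching dataset using its code
industry_population <- get_cso_dataset(dataset_names$dataset_code[1])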

industry_population %>% 
  filter(grepl("Both", Sex)) %>%
  filter(grepl("1991", `Census Year`)) %>% 
  filter(grepl("services", `Detailed Industrial Group`)) %>% 
  select(`Detailed Industrial Group`, `Census Year`, value) %>% 
  arrange(desc(value))
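The same filtering pattern can be tried offline on a small made-up data frame. This is only a sketch: toy_census and its values are invented to mimic the column layout used above, and labels such as "Both sexes" are assumptions about how the real data is coded. Note that grepl() does partial, regular-expression matching, so the pattern "Both" matches any value containing that text.

```r
library(dplyr)

# Made-up rows mimicking the column layout above; these are not CSO figures.
toy_census <- data.frame(
  Sex = c("Both sexes", "Male", "Both sexes", "Both sexes"),
  `Census Year` = c("1991", "1991", "1996", "1991"),
  `Detailed Industrial Group` = c("Legal services", "Legal services",
                                  "Legal services", "Mining"),
  value = c(10, 6, 12, 3),
  check.names = FALSE  # keep the column names with spaces as they are
)

# Keep rows matching all three patterns, then select and sort as above.
toy_result <- toy_census %>%
  filter(grepl("Both", Sex),
         grepl("1991", `Census Year`),
         grepl("services", `Detailed Industrial Group`)) %>%
  select(`Detailed Industrial Group`, `Census Year`, value) %>%
  arrange(desc(value))
```

Passing several conditions to a single filter() call combines them with a logical AND, which is equivalent to the chained filter() calls in the demo.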

Bringing It All Together

ggplot(industry_population %>% 
         filter(grepl("Garda|Defence", `Detailed Industrial Group`)) %>% 
         filter(grepl("Male|Female", Sex)) %>% 
         filter(!grepl("2016", `Census Year`)),
       aes(x = `Census Year`, y = value, fill = Sex)) + 
  facet_wrap(~`Detailed Industrial Group`, nrow = 1) +
  geom_bar(stat = "identity") +
  labs(title = "Population Working in Industrial Group", y = "Population") +
  dk_theme


cormac85/datakindr documentation built on May 13, 2019, 1:36 a.m.