
statfitools

statfitools is a collection of functions that help when working with data from Statistics Finland. I wrote the functions for my own use, but I am happy if someone else finds them useful. Some of the functions are specific to data from Statistics Finland; others are of more general use.

To download data from Statistics Finland, use the pxweb package.

Installation

To install the package:

install.packages("devtools")
devtools::install_github("jhuovari/statfitools")

Usage

library("statfitools")
library("dplyr")

Get data

Get formatted data from Statistics Finland:

dat <- statfi_get_data(
  "https://statfin.stat.fi/PXWeb/api/v1/fi/StatFin/tyti/statfin_tyti_pxt_11pk.px/",
  list(
    Vuosi = c("*"),
    Sukupuoli = c("SSS"),
    Tiedot = c("*")
  )
)

str(dat)

Preprocess data

Make legal names

clean_names() tries to produce valid names that are more readable than those from make.names().

# install.packages("pxweb")

dat <- pxweb::pxweb_get_data(
  url = "https://statfin.stat.fi/PXWeb/api/v1/fi/StatFin/tyokay/statfin_tyokay_pxt_115b.px",
  query = list(
    Alue = c('SSS'),
    "Pääasiallinen toiminta" = c('*'),
    Sukupuoli = c('SSS'),
    "Ikä" = c('SSS'),
    Vuosi = c('*'),
    Tiedot = c('*')))

names(dat)

dat <- clean_names(dat)

names(dat)
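
To illustrate the idea without network access, here is a minimal base R sketch of this kind of name cleaning. `sketch_clean_names()` is a hypothetical helper written for this README, not the implementation of `clean_names()`; it only assumes that the goal is ASCII names with underscores, like the `Paaasiallinen_toiminta` column used in the aggregation example of this README.

```r
# Hypothetical helper, NOT the package's clean_names() implementation.
sketch_clean_names <- function(x) {
  # Transliterate the Nordic letters to ASCII (written as escapes for portability)
  x <- chartr("\u00e4\u00f6\u00e5\u00c4\u00d6\u00c5", "aoaAOA", x)
  # Collapse every run of remaining non-alphanumeric characters into one underscore
  x <- gsub("[^A-Za-z0-9]+", "_", x)
  # Drop underscores left at the start or end
  x <- gsub("^_+|_+$", "", x)
  # Let make.names() handle what is left, e.g. a leading digit or duplicates
  make.names(x, unique = TRUE)
}

raw <- c("P\u00e4\u00e4asiallinen toiminta", "Ik\u00e4", "Vuosi")
sketch_clean_names(raw)
make.names(raw)  # base R result, for comparison
```

The real `clean_names()` takes a data frame and may handle more cases than this sketch does.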

Extract code or name from a code-name string

extract_code("508 Mantta-Vilppula")
extract_name("508 Mantta-Vilppula")
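
For readers who want to see the idea in plain R, the split can be sketched with regular expressions. The helpers `sketch_code()` and `sketch_name()` below are hypothetical and assume that the code is the first whitespace-delimited token and the name is everything after it; the package functions may apply different rules.

```r
# Hypothetical helpers, NOT the package's extract_code() / extract_name().
# Assumption: the string has the form "<code><whitespace><name>".
sketch_code <- function(x) {
  # Keep only the first run of non-whitespace characters
  sub("^\\s*(\\S+)\\s+.*$", "\\1", x)
}

sketch_name <- function(x) {
  # Drop the first token and the whitespace that follows it
  sub("^\\s*\\S+\\s+", "", x)
}

region <- "508 Mantta-Vilppula"
sketch_code(region)
sketch_name(region)
```

Both helpers return character vectors, so a code such as "005" would keep its leading zeros.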

Work with classifications

Work with Statistics Finland regional data

Clean regional names

Statistics Finland uses different formats to present regional names. Make them uniform.

TODO

Recode and aggregate regional data

List the municipality-based regional classifications available from Statistics Finland:

names(sf_get_reg_keytable(NULL))

Aggregate to Tilastollinen kuntaryhmitys (the statistical grouping of municipalities).

Download the classification key and the data, then join and aggregate:

# Classification key
key_kuntar <- sf_get_reg_keytable("Kuntaryhmä")

# Data
dat_ku <- pxweb::get_pxweb_data(
  url = "http://pxnet2.stat.fi/PXWeb/api/v1/fi/StatFin/vrm/tyokay/010_tyokay_tau_101.px",
  dims = list(
    Alue = c('*'),
    "Pääasiallinen toiminta" = c('11'),  # Työlliset (employed persons)
    Sukupuoli = c('S'),
    "Ikä" = c('SSS'),
    Vuosi = c('*')),
  clean = TRUE) %>% 
  clean_names() %>% 
  clean_times()

# Join and aggregate
dat_kuntar <- dat_ku %>% 
  # safer to use codes
  mutate(ku_code = sf_name2code(Alue, class = "kunta", year = 2016)) %>%   
  left_join(key_kuntar, by = c(ku_code = "Knro")) %>% 
  group_by(Kuntaryhma, time, Paaasiallinen_toiminta) %>% 
  summarise(values = sum(values, na.rm = TRUE)) %>% 
  ungroup()
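
The join-and-aggregate step can also be tried offline. The sketch below uses base R and small made-up tables: the municipality codes, the groups "A" and "B", and all values are invented for illustration and are not Statistics Finland data. Only the column names `Knro`, `Kuntaryhma`, `ku_code`, `time` and `values` are taken from the example above.

```r
# Toy data: all codes, groups and values are invented for illustration.
dat_toy <- data.frame(
  ku_code = c("001", "002", "003", "001", "002", "003"),
  time    = c(2015, 2015, 2015, 2016, 2016, 2016),
  values  = c(10, 20, 30, 11, 21, 31),
  stringsAsFactors = FALSE
)

# Toy classification key: municipality code -> municipality group
key_toy <- data.frame(
  Knro       = c("001", "002", "003"),
  Kuntaryhma = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Left join on the municipality code (all.x = TRUE keeps unmatched data rows)
joined <- merge(dat_toy, key_toy, by.x = "ku_code", by.y = "Knro", all.x = TRUE)

# Sum the values within each group and year.
# Note: the formula interface of aggregate() drops rows with NA in the
# grouping columns, so unmatched municipalities would disappear here.
agg <- aggregate(values ~ Kuntaryhma + time, data = joined, FUN = sum)
agg
```

Checking that `sum(agg$values)` equals `sum(dat_toy$values)` is a quick way to confirm that no municipality was lost in the join.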


pttry/statfitools documentation built on Feb. 2, 2025, 1:50 a.m.