README.md

statfitools

statfitools is a collection of functions for working with data from Statistics Finland. I wrote the functions for my own use, but I am happy if someone else finds them useful. Some of the functions are specific to Statistics Finland data; others are of more general use.

To download data from Statistics Finland, use the pxweb package.

Installation

To install the package:

install.packages("devtools")
devtools::install_github("jhuovari/statfitools")

Usage

library("statfitools")
library("dplyr")

Preprocess data

Make legal names

clean_names() tries to produce valid names that are more readable than those from make.names().


# install.packages("pxweb")

dat <- pxweb::get_pxweb_data(
  url = "http://pxnet2.stat.fi/PXWeb/api/v1/fi/StatFin/vrm/tyokay/010_tyokay_tau_101.px",
  dims = list(
    Alue = c('SSS'),
    "Pääasiallinen toiminta" = c('*'),
    Sukupuoli = c('S'),
    "Ikä" = c('SSS'),
    Vuosi = c('*')),
  clean = TRUE)

names(dat)
#> [1] "Alue"                   "Pääasiallinen toiminta"
#> [3] "Vuosi"                  "Ikä"                   
#> [5] "Sukupuoli"              "values"

dat <- clean_names(dat)

names(dat)
#> [1] "Alue"                   "Paaasiallinen_toiminta"
#> [3] "Vuosi"                  "Ika"                   
#> [5] "Sukupuoli"              "values"
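
The idea behind the renaming above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name to_plain_name() is hypothetical, and the assumption is that Nordic letters are mapped to plain ASCII letters and any other run of non-alphanumeric characters becomes a single underscore.

```r
# Hypothetical helper, not statfitools::clean_names(): a base R sketch of the idea.
to_plain_name <- function(x) {
  # Nordic letters and their ASCII replacements (escapes keep this file ASCII-only).
  from <- c("\u00e4", "\u00f6", "\u00e5", "\u00c4", "\u00d6", "\u00c5")
  to   <- c("a", "o", "a", "A", "O", "A")
  for (i in seq_along(from)) {
    x <- gsub(from[i], to[i], x, fixed = TRUE)
  }
  # Collapse every remaining run of non-alphanumeric characters to one underscore.
  x <- gsub("[^A-Za-z0-9]+", "_", x)
  # Drop underscores left at the start or end of a name.
  gsub("^_+|_+$", "", x)
}

to_plain_name(c("P\u00e4\u00e4asiallinen toiminta", "Ik\u00e4"))
#> [1] "Paaasiallinen_toiminta" "Ika"
```

Unlike make.names(), this sketch does not guarantee syntactically valid names in every case (for example, a name starting with a digit is left as it is).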

Extract code or name from a code-name string


extract_code("508 Mantta-Vilppula")
#> [1] 508
extract_name("508 Mantta-Vilppula")
#> [1] "Mantta-Vilppula"
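
For readers who want to see the splitting logic itself, here is a minimal base R sketch. It assumes the code is everything before the first whitespace and the name is everything after it; split_code_name() is a hypothetical helper, not a function in statfitools.

```r
# Hypothetical helper: split "code name" strings at the first run of whitespace.
split_code_name <- function(x) {
  x <- trimws(x)
  data.frame(
    code = sub("\\s.*$", "", x),     # everything before the first whitespace
    name = sub("^\\S+\\s+", "", x),  # everything after the first whitespace run
    stringsAsFactors = FALSE
  )
}

split_code_name(c("508 Mantta-Vilppula", "091 Helsinki"))
```

The sketch keeps the code as a character string so that leading zeros, as in "091", are not lost. A string without any whitespace is returned unchanged in both columns.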

Work with classifications

Work with statfi regional data

Clean regional names

Statistics Finland presents regional names in several different formats. The aim is to make them uniform.

TODO

Recode and aggregate regional data

List the municipality-based regional classifications available from Statistics Finland:


names(sf_get_reg_keytable(NULL))
#>  [1] "Knro"                  "Kunta"                
#>  [3] "Kommun"                "Mkkoodi"              
#>  [5] "Maakunta"              "Landskap"             
#>  [7] "Region"                "Avi_koodi"            
#>  [9] "AVI"                   "RFV"                  
#> [11] "AVI.1"                 "Ely_koodi"            
#> [13] "ELY_keskus"            "ELY_central"          
#> [15] "ELY_Centre"            "Seutukuntakoodi"      
#> [17] "Seutukunta"            "Ekonomisk_region"     
#> [19] "Suuraluekoodi"         "Suuralue"             
#> [21] "Storomrade"            "Major_region"         
#> [23] "Kuntaryhmakoodi"       "Kuntaryhma"           
#> [25] "Kommungrup"            "Municipal_group"      
#> [27] "Kielisuhdekoodi"       "Kielisuhde"           
#> [29] "Spraklig_indelning"    "Language_distribution"

Aggregate to the statistical grouping of municipalities (Tilastollinen kuntaryhmitys).

Download the classification key and the data, then join and aggregate.


# Classification key
key_kuntar <- sf_get_reg_keytable("Kuntaryhmä")

# Data
dat_ku <- pxweb::get_pxweb_data(
  url = "http://pxnet2.stat.fi/PXWeb/api/v1/fi/StatFin/vrm/tyokay/010_tyokay_tau_101.px",
  dims = list(
    Alue = c('*'),
    "Pääasiallinen toiminta" = c('11'),  # Työlliset (employed)
    Sukupuoli = c('S'),
    "Ikä" = c('SSS'),
    Vuosi = c('*')),
  clean = TRUE) %>% 
  clean_names() %>% 
  clean_times()

# Join and aggregate
dat_kuntar <- dat_ku %>% 
  # joining on municipality codes is safer than joining on names
  mutate(ku_code = sf_name2code(Alue, class = "kunta", year = 2016)) %>%   
  left_join(key_kuntar, by = c(ku_code = "Knro")) %>% 
  group_by(Kuntaryhma, time, Paaasiallinen_toiminta) %>% 
  summarise(values = sum(values, na.rm = TRUE)) %>% 
  ungroup()
#> Warning in left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y): joining
#> factors with different levels, coercing to character vector
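
The warning above says that the join keys are factors with different levels, which are coerced to character. The join-then-aggregate pattern, with the keys converted to character beforehand, can be sketched on toy data. The values and group labels below are invented for illustration and are not Statistics Finland data, and base R (merge() and aggregate()) is used in place of dplyr so the example runs without extra packages.

```r
# Toy values invented for illustration; not Statistics Finland data.
dat_toy <- data.frame(
  ku_code = factor(c("001", "002", "003", "001", "002", "003")),
  time    = c(2015, 2015, 2015, 2016, 2016, 2016),
  values  = c(10, 20, 30, 11, 21, 31)
)
key_toy <- data.frame(
  Knro       = factor(c("001", "002", "003")),
  Kuntaryhma = c("A", "A", "B")
)

# Convert the join keys to character so both sides share one type.
dat_toy$ku_code <- as.character(dat_toy$ku_code)
key_toy$Knro    <- as.character(key_toy$Knro)

# Left join on the code, then sum the values within each group and year.
joined <- merge(dat_toy, key_toy, by.x = "ku_code", by.y = "Knro", all.x = TRUE)
agg <- aggregate(values ~ Kuntaryhma + time, data = joined, FUN = sum)
agg
```

In the dplyr pipeline the same conversion could be done with mutate() before left_join(), assuming both key columns hold the same codes.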


pttry/statfitools documentation built on Feb. 2, 2025, 1:50 a.m.