knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
px_micro()
exists to support a specific use case for Statistics Greenland.
They use it to create small PX-files to showcase and present metadata from a lager data set which cannot be made publicly available. See an example on Statistic Greenland's microdata for Research and Analysis.
px_micro()
Apart from px_save()
, px_micro()
is the only other function that can save px objects as PX-files.
px_micro()
turns a px object into many smaller PX-files, each containing a subset of the variables in the original px object.
The basis of micro files are usually a data set which doesn't have a count variable (like most PX-files). px_micro()
will instead create a count of each individual variable.
In this example we will use the built-in data data set greenlanders
.
set.seed(0) micro_dir <- file.path("micro_files") unlink(micro_dir, recursive = TRUE)
library(pxmake) greenlanders |> dplyr::sample_n(10) |> dplyr::arrange_all()
Create a px object with px()
, and pass it to px_micro()
.
# Create px object x <- px(greenlanders) # Create folder for micro files micro_dir <- file.path("micro_files") dir.create(micro_dir) # Write micro files to folder px_micro(x, out_dir = micro_dir)
The folder now contains three PX-files, one for each variable except 'age'.
list.files(micro_dir)
The reason 'age' didn't get a PX-file is because it is the HEADING variable in x
, and px_micro()
creates a file for each non-HEADING variable. Instead the HEADNING variable is used in all the created PX-files.
# Print HEADING variables px_heading(x) # Print non-HEADING variables c(px_stub(x), px_figures(x))
In this case, we want 'cohort' to be heading, and to create a PX-file for 'gender', 'age' and 'municipality'.
x2 <- x |> px_stub('age') |> # Change age to STUB px_heading('cohort') # Change cohort to HEADING
# Clear folder unlink(file.path(micro_dir, "*.px")) px_micro(x2, out_dir = micro_dir)
The folder now contains the files we wanted.
list.files(micro_dir)
Each file contains one of the three variables as STUB, 'cohort' as HEADING, and a variable 'n' which is the count of each combination of the variables.
px(file.path(micro_dir, 'age.px'))$data px(file.path(micro_dir, 'gender.px'))$data px(file.path(micro_dir, 'municipality.px'))$data
In general the keyword values from the px object are carried over to the micro files. This is the case for keywords like 'MATRIX', 'SUBJECT-CODE', 'CONTACT', 'LANGUAGE', 'CODEPAGE', etc.
To change keywords across all the micro files, the easiest is to change them in the px object before calling px_micro()
.
# Change CONTACT in all micro files x2 |> px_contact("Johan Ejstrud") |> px_micro(out_dir = micro_dir)
However, some keywords need to be changed individually for each micro file. To do so, create a data frame with the column 'variable' and a column for each px keyword to change.
individual_keywords <- tibble::tribble(~variable , ~px_description, "age" , "Age count 18-99", "gender" , "Gender count", "municipality", "Municipality 2024" )
Supply this dataframe to the keyword_values
argument of px_micro()
.
px_micro(x2, out_dir = micro_dir, keyword_values = individual_keywords)
DESCRIPTION is changed in the micro files:
px(file.path(micro_dir, 'age.px')) %>% px_description() px(file.path(micro_dir, 'gender.px')) %>% px_description() px(file.path(micro_dir, 'municipality.px')) %>% px_description()
For multilingual files add a 'language' column to keyword_values
.
x3 <- x2 |> px_language("en") |> px_languages(c("en", "kl")) individual_keywords_ml <- tibble::tribble( ~variable, ~language, ~px_description, ~px_matrix, "age", "en", "Age count 18-99", "AGE", "age", "kl", "Ukiut 18-99", NA, "gender", "en", "Gender count", "GEN", "gender", "kl", " Suiaassuseq", NA, "municipality", "en", "Municipality 2024", "MUN", "municipality", "kl", "Kommuni 2024", NA ) px_micro(x3, out_dir = micro_dir, keyword_values = individual_keywords_ml)
Here 'px_description' varies for each language, and 'px_matrix' is only set for one of the languages, since it is not a language dependent keywords. For language independant keywords it doesn't matter which language the value is set for.
The filenames of the micro files are by default the name of the variable, however these can also be changed by passing a 'filename' column to 'keyword_values'
individual_keywords2 <- individual_keywords |> dplyr::mutate(filename = paste0(variable, "_2024", ".px")) # Clear folder unlink(file.path(micro_dir, "*.px")) px_micro(x2, out_dir = micro_dir, keyword_values = individual_keywords2) list.files(micro_dir)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.