```r
knitr::opts_chunk$set(echo = TRUE, message = FALSE, fig.width=7, fig.height=5)
```

## Introduction

The Introduction to `synthACS` briefly mentions the `split` and `combine_smsm` functionality in Sections 3.2 and 3.4 respectively. There, we note that deriving the sample synthetic micro data is a memory-intensive process and advise using `synthACS` on a high-performance machine. Of course, such a machine is not always available, which is when `split` and `combine_smsm` are needed. A brief illustration of these two functions is provided in this vignette.

The same example data is used as in the introductory vignette:

```r
library(data.table)
library(acs)
library(synthACS)
library(retry)

ca_geo <- geo.make(state = "CA", county = "*")
ca_dat_SMSM <- pull_synth_data(2014, 5, ca_geo)
```

## `split()` and `combine_smsm()`

The `split` and `combine_smsm` functions are used, respectively, to break a large spatial microsimulation task into a set of smaller tasks with lower computational requirements and to recombine the results. They enable the well-known "split-apply-combine" strategy for data analysis [@plyr]. In this case, the "apply" step is intentionally performed sequentially, and not inside another function, in order to minimize RAM usage and allow a garbage-collection step between intensive in-memory function calls.

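
The pattern described above can be sketched in base R alone, without `synthACS`. This is a minimal illustration under stated assumptions: the toy data `geogs`, the helper `summarise_geog`, and the explicit `gc()` call are all hypothetical stand-ins and are not part of the package.

```r
# Toy stand-in for a large task: 100 "geographies", each a numeric vector.
# (Hypothetical data; not produced by synthACS.)
set.seed(1)
geogs <- replicate(100, rnorm(50), simplify = FALSE)

# Hypothetical per-geography workload.
summarise_geog <- function(x) c(mean = mean(x), sd = sd(x))

# split: assign the geographies to n_splits contiguous groups
n_splits <- 4
group_id <- cut(seq_along(geogs), breaks = n_splits, labels = FALSE)
chunks <- base::split(geogs, group_id)

# apply: work through one chunk at a time in a plain loop,
# requesting garbage collection before moving to the next chunk
results <- vector("list", length = n_splits)
for (i in seq_len(n_splits)) {
  results[[i]] <- lapply(chunks[[i]], summarise_geog)
  gc()
}

# combine: concatenate the per-chunk results into one list
combined <- do.call(c, results)
length(combined)
```

Because the groups are contiguous and processed in order, `combined` holds the results in the same order as the original `geogs` list.
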
The syntax for both is straightforward:

```r
split(<object>, n_splits = N)
combine_smsm(<object1>, <object2>, ..., <objectk>)
```

`split` takes a larger `macroACS` class object and splits it into `n_splits` smaller `macroACS` objects. Similarly, `combine_smsm` takes several smaller `smsm_set` objects and combines them into a single, larger `smsm_set` class object.

An example of this is provided below:

```r
# split()
n_splits <- 20
split_ca_dat <- split(ca_dat_SMSM, n_splits = n_splits)

tmp_opts <- vector("list", length= n_splits)

for (i in 1:n_splits) {
  # Section 3.3 of introduction: SMSM via simulated annealing
  # derive synthetic datasets
  tmp_synth <- derive_synth_datasets(split_ca_dat[[i]], leave_cores = 0)

  # create constraints for simulated annealing
  a <- all_geog_constraint_age(tmp_synth, method = "macro.table")
  g <- all_geog_constraint_gender(tmp_synth, method = "macro.table")
  m <- all_geog_constraint_marital_status(tmp_synth, method = "macro.table")
  r <- all_geog_constraint_race(tmp_synth, method = "synthetic")
  e <- all_geog_constraint_edu(tmp_synth, method = "synthetic")

  cll <- all_geogs_add_constraint(attr_name = "age", attr_total_list = a,
                                  macro_micro = tmp_synth)
  cll <- all_geogs_add_constraint(attr_name = "gender", attr_total_list = g,
                                  macro_micro = tmp_synth, constraint_list_list = cll)
  cll <- all_geogs_add_constraint(attr_name = "marital_status", attr_total_list = m,
                                  macro_micro = tmp_synth, constraint_list_list = cll)
  cll <- all_geogs_add_constraint(attr_name = "race", attr_total_list = r,
                                  macro_micro = tmp_synth, constraint_list_list = cll)
  cll <- all_geogs_add_constraint(attr_name = "edu_attain", attr_total_list = e,
                                  macro_micro = tmp_synth, constraint_list_list = cll)

  # anneal
  tmp_opts[[i]] <- all_geog_optimize_microdata(tmp_synth, seed = 6550L, verbose = TRUE,
                                               constraint_list_list = cll,
                                               p_accept = 0.4, max_iter = 10000L)
}

# create the string needed for combine_smsm().
paste0("tmp_opts[[", 1:n_splits, "]]", sep= ", ", collapse= "")
# [1] "tmp_opts[[1]], tmp_opts[[2]], tmp_opts[[3]], tmp_opts[[4]], tmp_opts[[5]],
# tmp_opts[[6]], tmp_opts[[7]], tmp_opts[[8]], tmp_opts[[9]], tmp_opts[[10]],
# tmp_opts[[11]], tmp_opts[[12]], tmp_opts[[13]], tmp_opts[[14]], tmp_opts[[15]],
# tmp_opts[[16]], tmp_opts[[17]], tmp_opts[[18]], tmp_opts[[19]], tmp_opts[[20]], "

# copy and paste the resulting string, excluding the final trailing comma
opt_ca <- combine_smsm(tmp_opts[[1]], tmp_opts[[2]], tmp_opts[[3]], tmp_opts[[4]],
                       tmp_opts[[5]], tmp_opts[[6]], tmp_opts[[7]], tmp_opts[[8]],
                       tmp_opts[[9]], tmp_opts[[10]], tmp_opts[[11]], tmp_opts[[12]],
                       tmp_opts[[13]], tmp_opts[[14]], tmp_opts[[15]], tmp_opts[[16]],
                       tmp_opts[[17]], tmp_opts[[18]], tmp_opts[[19]], tmp_opts[[20]])
```
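
The copy-and-paste step in the example can, in principle, be avoided with base R's `do.call`, which passes the elements of a list as separate arguments to a function. The sketch below demonstrates the idiom with a hypothetical stand-in, `combine_parts`, rather than with `combine_smsm` itself. Whether `combine_smsm` behaves identically under `do.call` has not been verified here and should be treated as an assumption.

```r
# Hypothetical stand-in for a function that takes its inputs through `...`.
# This is NOT the synthACS implementation of combine_smsm().
combine_parts <- function(...) {
  parts <- list(...)
  unlist(parts, recursive = FALSE)
}

# Toy list of partial results, analogous in shape to tmp_opts.
tmp <- list(list(a = 1), list(b = 2), list(c = 3))

# Spelling out each element by hand ...
by_hand <- combine_parts(tmp[[1]], tmp[[2]], tmp[[3]])

# ... versus letting do.call() supply the list elements as arguments.
by_call <- do.call(combine_parts, tmp)

identical(by_hand, by_call)
```

If the same idiom applies to `combine_smsm`, the final call in the example would reduce to `do.call(combine_smsm, tmp_opts)`, which also removes the need to retype the argument list when `n_splits` changes.
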