Derive synthetic micro datasets for a given geography.
Derive synthetic micro datasets for each sub-geography covered by a given set of macro-level
constraining tabulations. See Details. By default, micro dataset generation is run
in parallel with load balancing. Macro data is assumed to have been pulled from the US Census API.
A macro dataset list: the result of
Logical, defaults to
How many cores do you wish to leave open to other processing?
A list of the input macro datasets produced by pull_synth_data and a
list of synthetic micro datasets, one for each geographical
subset within the specified macro geography.
In the absence of true micro-level datasets for a given geographic area, synthetic datasets
can be used. This function uses conditional and marginal probability distributions (at the
aggregate level) to generate synthetic micro population datasets, which are built one constraint
at a time. Taking as input the macro-level data (class "macroACS"), this function builds
synthetic micro datasets for each lower-level geographical area within the area of study.
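Building a dataset one constraint at a time can be written as a chain of conditional probabilities. The notation below is an assumption introduced for illustration only: a_1, ..., a_k are the attributes in the order they are assigned, and S_i is whichever subset of earlier attributes the available tabulation for a_i is broken down by. The first line is the exact chain rule; the second is the kind of approximation that becomes necessary when tabulations conditioned on all earlier attributes do not exist.

```latex
\begin{align}
P(a_1, \dots, a_k) &= P(a_1) \prod_{i=2}^{k} P(a_i \mid a_1, \dots, a_{i-1}) \\
                   &\approx P(a_1) \prod_{i=2}^{k} P(a_i \mid S_i),
                   \qquad S_i \subseteq \{a_1, \dots, a_{i-1}\}
\end{align}
```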
In simplest terms, the goal is to generate a joint probability distribution for an attribute vector and to create synthetic individuals from this distribution. However, information on the full joint distribution is typically not available, so we construct it as a product of conditional and marginal probabilities. This is done one attribute at a time, under the assumption that attributes can be ordered along a rough continuum of dependence. That is, some attributes (e.g. gender, age) are more important in 'determining' others (e.g. educational attainment, marital status). These more important attributes need to be assigned first, whereas less important attributes may be assigned later. Most of these distinctions are intuitive, but care must be taken in choosing the order in which attributes are constructed.
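The attribute-by-attribute construction can be illustrated with a minimal base R sketch. Everything in it is hypothetical: the probability tables are made-up toy values, the helper `add_attribute` is not part of the package, and conditioning each new attribute on age alone is a simplification chosen to keep the example short. It is not the function's actual implementation.

```r
# Toy macro-level tables; all values are invented for illustration.
p_age <- c(under_35 = 0.4, over_35 = 0.6)       # marginal P(age)
p_gender_by_age <- rbind(                       # conditional P(gender | age)
  under_35 = c(male = 0.50, female = 0.50),
  over_35  = c(male = 0.45, female = 0.55)
)
p_edu_by_age <- rbind(                          # conditional P(edu | age)
  under_35 = c(hs = 0.3, college = 0.7),
  over_35  = c(hs = 0.6, college = 0.4)
)

# Start from the first (most important) attribute and its marginal probability.
synth <- data.frame(age = names(p_age), p = unname(p_age),
                    stringsAsFactors = FALSE)

# Hypothetical helper: split every existing row by the levels of a new
# attribute, multiplying the row's probability by the conditional
# probability of each level given the row's value of `given`.
add_attribute <- function(synth, cond, given, new_name) {
  pieces <- lapply(seq_len(nrow(synth)), function(i) {
    row <- synth[i, , drop = FALSE]
    probs <- cond[row[[given]], ]               # P(new attribute | given)
    out <- row[rep(1, length(probs)), , drop = FALSE]
    out[[new_name]] <- names(probs)
    out$p <- row$p * unname(probs)
    out
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Add the remaining attributes one at a time, in order of importance.
synth <- add_attribute(synth, p_gender_by_age, "age", "gender")
synth <- add_attribute(synth, p_edu_by_age, "age", "edu")

# Each row is one attribute combination; p is its probability of inclusion.
# Synthetic individuals can then be drawn in proportion to p.
set.seed(1)
idx <- sample(nrow(synth), size = 5, replace = TRUE, prob = synth$p)
synth[idx, c("age", "gender", "edu")]
```

In this sketch the order of the `add_attribute` calls encodes the assumed dependence ordering: an attribute can only be conditioned on attributes that were added before it.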
This function provides a synthetic population with the following characteristics as well as each
synthetic individual's probability of inclusion. The included characteristics are: age, gender,
marital status, educational attainment, employment status, nativity, poverty status, geographic
mobility in the prior year, individual income, and race. **Note** that these are INDIVIDUAL attributes;
they are not at the HOUSEHOLD level. Additional attributes which interest the user may be added
in a similar manner via
Birkin, M., and M. Clarke. "SYNTHESIS-a synthetic spatial information system for urban and regional analysis: methods and examples." Environment and Planning A 20.12 (1988): 1645-1671.