Optimize the candidate micro dataset so that the lowest loss against the
macro dataset constraints is obtained. Loss is defined here as total absolute error (TAE),
and constraints are defined by the constraint_list. Optimization is done by
simulated annealing; see Details.
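As a small illustration of the loss, the sketch below computes TAE for a single constraint. The category names and counts are made up for the example and do not come from the package.

```r
# Hypothetical toy data: names and values are illustrative only.
target    <- c(male = 60, female = 40)  # macro constraint counts
candidate <- c(male = 55, female = 45)  # counts tallied from a candidate micro dataset

# Total absolute error: sum of absolute differences across categories.
tae <- sum(abs(candidate[names(target)] - target))
tae
```

With several constraints, the same quantity would be summed over every constraint in the list.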
optimize_microdata(
  micro_data,
  prob_name = "p",
  constraint_list,
  tolerance = round(sum(constraint_list[[1]])/2000 * length(constraint_list), 0),
  resample_size = min(sum(constraint_list[[1]]), max(500,
    round(sum(constraint_list[[1]]) * 0.005, 0))),
  p_accept = 0.4,
  max_iter = 10000L,
  seed = sample.int(10000L, size = 1, replace = FALSE),
  verbose = TRUE
)

micro_data 
A 
prob_name 
It is assumed that observations are weighted and do not have an equal probability
of occurrence. This string specifies the variable within 
constraint_list 
A 
tolerance 
An integer giving the maximum acceptable loss (TAE), enabling early stopping. Defaults to a misclassification rate of 1 individual per 1,000 per constraint. 
resample_size 
An integer controlling the rate of movement about the candidate space.
Specifically, it specifies the number of observations to change between iterations. Defaults to
min(sum(constraint_list[[1]]), max(500, round(sum(constraint_list[[1]]) * 0.005, 0))).
p_accept 
The acceptance probability for the Metropolis acceptance criterion. 
max_iter 
The maximum number of allowable iterations. Defaults to 10000L.
seed 
A seed for reproducibility. See 
verbose 
Logical. Do you wish to see verbose output? Defaults to TRUE.
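To make the default expressions for tolerance and resample_size concrete, here is a worked sketch that evaluates them by hand. The structure of c_list (a list of named count vectors) is an assumption made for illustration; the formulas themselves are copied from the usage signature.

```r
# Hypothetical constraint list: two constraints over a population of 10,000.
# The list-of-named-count-vectors layout is assumed, not confirmed by the package.
c_list <- list(
  sex = c(male = 4900, female = 5100),
  age = c(young = 3000, middle = 4500, old = 2500)
)

n <- sum(c_list[[1]])  # population size taken from the first constraint

# Default tolerance: n / 2000 per constraint, rounded.
tolerance <- round(n / 2000 * length(c_list), 0)

# Default resample_size: 0.5% of n, at least 500, never more than n.
resample_size <- min(n, max(500, round(n * 0.005, 0)))

c(tolerance = tolerance, resample_size = resample_size)
```

Here 0.5% of the population is only 50, so the floor of 500 applies.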
Spatial microsimulation involves the study of individual-level phenomena within a specified set of geographies in which these individuals act. It involves the creation of synthetic data to model, via simulation, these phenomena. As a first step to simulation, an appropriate micro-level (i.e., individual) dataset must be generated. This function creates such micro-level datasets given a set of candidate observations and macro-level constraints.
Optimization is done via simulated annealing, where we wish to minimize the total absolute error
(TAE) between the microdata and the macro-constraints. The annealing procedure is controlled by
the parameters tolerance, resample_size, p_accept, and max_iter. Specifically, tolerance
indicates the maximum allowable TAE between the output microdata and the macro-constraints, and
max_iter caps the number of iterations allowed to reach it. resample_size and p_accept
control movement about the candidate space. Specifically, resample_size controls the jump size
between neighboring candidates and p_accept controls the hill-climbing rate for exiting local minima.
Please see the references for a more detailed discussion of the simulated annealing procedure.
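The procedure described above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's implementation: the pool and target objects, the tae() helper, and the geometric cooling rate are all hypothetical, and how optimize_microdata actually uses p_accept internally is not documented here.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical candidate pool with a single attribute, and one macro constraint.
pool   <- data.frame(sex = sample(c("male", "female"), 200, replace = TRUE))
target <- c(male = 30, female = 20)

# Total absolute error of the selected rows `idx` against the constraint.
tae <- function(idx) {
  counts <- table(factor(pool$sex[idx], levels = names(target)))
  sum(abs(as.numeric(counts) - target))
}

resample_size <- 5; p_accept <- 0.4; tolerance <- 0; max_iter <- 500
cooling <- 0.99  # assumed geometric cooling rate, not a package parameter

current <- sample(nrow(pool), sum(target))  # initial candidate rows
loss <- initial_loss <- tae(current)
best <- current; best_loss <- loss

for (iter in seq_len(max_iter)) {
  if (best_loss <= tolerance) break  # early stopping once TAE is acceptable

  # Jump to a neighboring candidate by replacing resample_size rows.
  proposal <- current
  swap <- sample(length(current), resample_size)
  proposal[swap] <- sample(nrow(pool), resample_size, replace = TRUE)
  new_loss <- tae(proposal)

  # Keep non-worsening moves; accept a worse move with a probability that
  # starts near p_accept and decays, which lets the search leave local minima.
  if (new_loss <= loss || runif(1) < p_accept * cooling^iter) {
    current <- proposal; loss <- new_loss
  }
  if (loss < best_loss) { best <- current; best_loss <- loss }
}

c(initial = initial_loss, final = best_loss)
```

A larger resample_size makes each proposal differ more from the current candidate, while a larger p_accept makes the search more willing to accept temporarily worse candidates.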
Ingber, Lester. "Very fast simulated re-annealing." Mathematical and computer modelling 12.8 (1989): 967-973.
Metropolis, Nicholas, et al. "Equation of state calculations by fast computing machines." The journal of chemical physics 21.6 (1953): 1087-1092.
Szu, Harold, and Ralph Hartley. "Fast simulated annealing." Physics letters A 122.3 (1987): 157-162.
## Not run:
## assumes you have a micro_synthetic object named test_micro and a constraint_list named c_list
opt_data <- optimize_microdata(test_micro, "p", c_list, max_iter = 10, resample_size = 500,
                               p_accept = 0.01, verbose = FALSE)
## End(Not run)

