prepareData: prepareData


Description

Prepares the census and survey data for simulation.

Usage

prepareData(census, survey, census_area_id = 1, survey_id = 1,
  convert = TRUE, use_base = TRUE, census_categories = FALSE,
  survey_weights = FALSE, survey_categories = FALSE,
  reference_col = FALSE, group = FALSE, na.rm = FALSE, breaks = FALSE,
  pop_benchmark = FALSE, du_benchmark = FALSE, building_benchmark = FALSE,
  align = FALSE, pop_total_col = FALSE, verbose = FALSE)

Arguments

census

Census data of small areas.

survey

A survey of individual records (microdata).

census_area_id

(optional, default=1) Column name or column index of the area id in the census data. Set to 'FALSE' if the area code should be generated.

survey_id

(optional, default=1) Ids of the individual records. Set to 'FALSE' to generate an id.

convert

(optional, default=TRUE) Converts data to binary format.

use_base

(optional, default=TRUE) Use the model.matrix function from base R.

census_categories

(optional, default=FALSE) Column names or column indices of the census categories to be used in the simulation.

survey_weights

(optional, default=FALSE) Column name or column index of the initial weights in the survey data. 'FALSE' uses the last column.

survey_categories

(optional, default=FALSE) survey categories to be used in the simulation.

reference_col

(optional, default=FALSE) Category used as reference.

group

(optional, default=FALSE) Variable used to run an integrated re-weighting simulation.

na.rm

(optional, default=FALSE) Remove records with missing (NA) values.

breaks

(optional, default=FALSE) Breaks used to calculate population totals. If FALSE, population totals are not computed.

pop_benchmark

(optional, default=FALSE) Benchmark used to compute the total population, passed as a vector containing the breaks of the benchmark (e.g. pop_benchmark=c(1,5)). If FALSE, the function computes the total population as the mean of all benchmarks.

align

(optional, default=FALSE) align values to population totals

pop_total_col

(optional, default=FALSE) Column containing the population totals.

verbose

(optional, default=FALSE) be verbose

du_benchmark

(optional, default=FALSE) Benchmark used to compute the total number of dwelling units. Analogous to pop_benchmark.

building_benchmark

(optional, default=FALSE) Benchmark used to compute the total number of building units. Analogous to pop_benchmark.
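
The convert and use_base arguments refer to converting categorical survey variables to a binary (0/1) format with model.matrix from base R. The following is a minimal sketch of that kind of conversion on a made-up survey; the data frame, its column names, and the choice of full indicator coding are illustrative assumptions and not the package's internal code.

```r
# Toy survey with two categorical variables (hypothetical data).
survey <- data.frame(
  sex = factor(c("f", "m", "f", "m")),
  age = factor(c("young", "old", "old", "young"))
)

# Full indicator coding: one 0/1 column per category level.
# contrasts = FALSE keeps every level instead of dropping a reference level.
X <- model.matrix(
  ~ sex + age - 1,
  data = survey,
  contrasts.arg = list(
    sex = contrasts(survey$sex, contrasts = FALSE),
    age = contrasts(survey$age, contrasts = FALSE)
  )
)

# Each row now has exactly one 1 per original variable.
print(X)
```

With the default treatment contrasts, model.matrix would drop one level of the second variable as a reference category; whether prepareData does the same is not stated on this page.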

Value

X Prepared survey matrix.

Tx Marginal totals for simulation area.

dx Survey design weights.

area_id Small area ID.

total_pop Mean population totals for each area.

X_complete Binary formatted survey with all categories.

Tx_complete Marginal sums with all categories.
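
The idea behind total_pop, a population total taken as the mean over several census benchmarks, can be illustrated on a toy table. This is only a sketch under stated assumptions: the census values, the column names, and the way the columns are split into benchmark blocks are invented for illustration and are not taken from the GREGWT source.

```r
# Toy census: area id in column 1, then two benchmark blocks (hypothetical).
census <- data.frame(
  area_id   = c("a1", "a2"),
  male      = c(40, 55),   # block 1: sex
  female    = c(60, 45),
  age_0_17  = c(20, 30),   # block 2: age
  age_18_64 = c(60, 50),
  age_65    = c(22, 20)
)

# Assumed block layout: columns 2-3 form one benchmark, columns 4-6 the other.
blocks <- list(sex = 2:3, age = 4:6)

# Population total implied by each benchmark, per area.
block_totals <- sapply(blocks, function(cols) rowSums(census[, cols]))

# The mean over the benchmark totals gives one population figure per area.
total_pop <- data.frame(area_id = census$area_id,
                        pop = rowMeans(block_totals))
print(total_pop)
```

In this toy table the two benchmarks disagree for the first area (100 versus 102), which is the situation where averaging, or picking a single benchmark as with pop_benchmark, makes a difference.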

Author(s)

M. Esteban Munoz H.

Examples

data("GREGWT.census")
data("GREGWT.survey")

simulation_data <- prepareData(GREGWT.census, GREGWT.survey,
                               census_categories=seq(2,24),
                               survey_categories=seq(1,3))

simulation_data1 <- prepareData(GREGWT.census, GREGWT.survey,
                                census_categories=seq(2,24),
                                survey_categories=seq(1,3),
                                pop_benchmark=c(2,12),
                                verbose=TRUE)

# Compute the total population as the mean of all benchmarks. The breaks
# parameter needs to be defined. In this case the breaks are displaced by one
# because the area code is in the first column.
simulation_data2 <- prepareData(GREGWT.census, GREGWT.survey,
                                census_categories=seq(2,24),
                                survey_categories=seq(1,3),
                                breaks=c(11, 17),
                                verbose=TRUE)

total_pop1 <- simulation_data1$total_pop
plot(total_pop1$pop)
total_pop2 <- simulation_data2$total_pop
points(total_pop2$pop, col="red", pch="+")

emunozh/GREGWT documentation built on May 16, 2019, 5:11 a.m.