geoweight: Split weights across geographies


View source: R/geoweight.r

Description

geoweight calculates state weights for each household in a microdata file. For each household, the state weights add up to the household's total weight, and the weighted state totals for selected characteristics hit or come close to desired targets.
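
The two conditions in the description can be illustrated with a minimal base R sketch. All numbers are made up, and the `shares` matrix is an illustrative assumption: it is not how geoweight chooses the weights, only a way to show what a valid set of state weights looks like.

```r
# Illustrative data: 4 households, 2 states, 2 characteristics (values made up)
wh <- c(10, 20, 30, 40)                      # household total weights, length h
xmat <- matrix(c(1, 0, 1, 1,                 # characteristic x1
                 5, 3, 2, 8),                # characteristic x2
               nrow = 4, dimnames = list(NULL, c("x1", "x2")))

# Hypothetical shares of each household's weight assigned to each state;
# each row sums to 1
shares <- matrix(c(0.25, 0.50, 0.75, 0.50,
                   0.75, 0.50, 0.25, 0.50),
                 nrow = 4, dimnames = list(NULL, c("geo1", "geo2")))

whs <- wh * shares        # h x s matrix of state weights

# Condition 1: each household's state weights add up to its total weight
stopifnot(isTRUE(all.equal(unname(rowSums(whs)), wh)))

# Condition 2: weighted state totals, an s x k matrix to compare with targets
state_totals <- t(whs) %*% xmat
state_totals
```

geoweight searches for state weights of this form whose `state_totals` hit or come close to the supplied targets.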

Usage

geoweight(
  wh,
  xmat,
  targets,
  dweights = get_dweights(targets),
  betavec = rep(0, length(targets)),
  method = "LM",
  maxiter = NULL,
  optlist = NULL,
  quiet = TRUE
)

Arguments

wh

Household weights, 1 per household, numeric vector length h. Each household's geography weights must sum to its household weight.

xmat

Data for households. Matrix with 1 row per household and 1 column per characteristic (h x k matrix). Columns can be named.

targets

Targeted values. Matrix with 1 row per geographic area and 1 column per characteristic (s x k matrix). If columns are named, names must match column names of xmat. Rownames can be used to identify geographic areas. If unnamed, rows will be named geo1, geo2, ..., geo_s.

dweights

Difference weights: weighting factors for the targets, one per target (s * k values, matching targets). Defaults to get_dweights(targets).

betavec

optional vector of initial guesses for the parameters, length s * k; default is zero for all

method

optional parameter for approach to use; must be one of c('LM', 'Broyden', 'Newton'); default is 'LM'

maxiter

maximum number of iterations; integer; defaults vary by method: LM (the default method): 200; Broyden: 2000; Newton: 200

optlist

list of options that will update the nleqslv or nls.lm options, depending on the method chosen

quiet

logical, TRUE or FALSE; TRUE is the default shown in the usage above; FALSE provides nleqslv or nls.lm output
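
The dimensions described for wh, xmat, and targets can be checked before calling geoweight. The sketch below uses base R only. The helper `check_inputs` is hypothetical and not part of the package, the data are made up, and treating "names must match" as "identical, in the same order" is an assumption.

```r
# Hypothetical helper (not part of the package): check that the inputs conform
check_inputs <- function(wh, xmat, targets) {
  stopifnot(length(wh) == nrow(xmat))       # one weight per household (h)
  stopifnot(ncol(xmat) == ncol(targets))    # same k characteristics
  # Assumption: when both are named, column names must be identical and in order
  if (!is.null(colnames(xmat)) && !is.null(colnames(targets))) {
    stopifnot(identical(colnames(xmat), colnames(targets)))
  }
  invisible(TRUE)
}

h <- 5; s <- 2
wh <- rep(100, h)                           # household weights, length h
xmat <- matrix(1:10, nrow = h,              # h x k household data
               dimnames = list(NULL, c("income", "wages")))
targets <- matrix(c(800, 700, 2100, 1900),  # s x k targets, filled by column
                  nrow = s,
                  dimnames = list(c("NY", "CA"), c("income", "wages")))

check_inputs(wh, xmat, targets)
```

In this made-up example the targets for each characteristic add up across states to the weighted household total (1500 for income, 4000 for wages), so targets of this kind can in principle be hit exactly.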

Details

geoweight uses the solver nleqslv or the solver nls.lm depending on user choice.

The default method, LM, uses nls.lm because it appears to be the most robust of the methods: it rarely fails and often produces a better optimum than Broyden or Newton. However, in some circumstances one of the latter may work better, and it is hard to define guidelines for when a particular method will be better. The Broyden method can be faster or more robust than the Newton method. It generally requires many more iterations than Newton, but each iteration is faster.
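
Because it is hard to say in advance which method will work best, one practical approach is to try each method and compare how closely the results hit the targets. The following is a sketch only. The helper `target_sse` is hypothetical and not part of the package; it uses only the documented whs element of the result. The sketch also assumes that geoweight and make_problem are exported by the microweight package, and it runs the comparison only if that package is installed.

```r
# Hypothetical helper: unweighted sum of squared differences between the
# state totals implied by whs (h x s) and the targets (s x k)
target_sse <- function(whs, xmat, targets) {
  sum((t(whs) %*% xmat - targets)^2)
}

# The comparison runs only if the microweight package is installed
if (requireNamespace("microweight", quietly = TRUE)) {
  p <- microweight::make_problem(h = 10, s = 3, k = 2)
  methods <- c("LM", "Broyden", "Newton")
  sse <- sapply(methods, function(m) {
    res <- microweight::geoweight(wh = p$wh, xmat = p$xmat,
                                  targets = p$targets, method = m)
    target_sse(res$whs, p$xmat, p$targets)
  })
  sse   # one value per method; smaller means closer to the targets
}
```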

Value

A list with the following elements:

h

number of households (or individuals, records, tax returns, etc.)

s

number of states (or other geographies or subgroups)

k

number of characteristics each household has

solver_message

message from the solver that was used

etime

elapsed time

beta_opt_mat

s x k matrix of optimal parameters

whs

h x s matrix of state weights for each household, computed using the optimal parameters

wh

the input vector of household total weights, length h

xmat

matrix of data for households, h x k

dweights

optional vector of weighting factors for targets, length s * k

output

list of output from the solver that was used

Examples

# Example 1: Determine state weights for a simple problem with random data
p <- make_problem(h=10, s=3, k=2)
dw <- get_dweights(p$targets)

res1 <- geoweight(wh = p$wh, xmat = p$xmat, targets = p$targets,
  dweights = dw)

res2 <- geoweight(wh = p$wh, xmat = p$xmat, targets = p$targets,
  dweights = dw, method = 'Newton')

res3 <- geoweight(wh = p$wh, xmat = p$xmat, targets = p$targets,
  dweights = dw, method = 'Broyden')

res1
res2
res3
c(res1$sse_unweighted, res2$sse_unweighted, res3$sse_unweighted)

# verify that the state weights produce the desired targets
t(res2$whs) %*% p$xmat
p$targets

donboyd5/microweight documentation built on Aug. 17, 2020, 4:48 p.m.