iterake: Iterative raking procedure


View source: R/iterake.R

Description

This function creates row-level weights using an iterative raking algorithm based on targets from a known population (established with universe()). The weights are appended as a new column in the data. If iterake() converges, the weighted marginal proportions of the sample will match those set in universe(). Summary statistics of the weighting procedure are presented by default.
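To make the idea of raking concrete, the following is a minimal base-R sketch of iterative proportional fitting on a toy data set. It is not the code used by iterake(): the helper names (weighted_share, rake_sketch), the toy data, and the targets are all hypothetical, and weight capping, the stuck limit, and category permutation are left out.

```r
# Minimal raking sketch in base R. NOT iterake's implementation;
# every name below is hypothetical and chosen for illustration only.

# Weighted share of each level of x, returned as a named numeric vector
weighted_share <- function(w, x) {
  vapply(split(w, x), sum, numeric(1)) / sum(w)
}

rake_sketch <- function(data, targets, threshold = 1e-10, max.iter = 50) {
  w <- rep(1, nrow(data))
  for (iter in seq_len(max.iter)) {
    for (v in names(targets)) {
      lvl <- as.character(data[[v]])
      current <- weighted_share(w, lvl)
      # Scale each row so the weighted share of its level matches the target
      w <- w * unname(targets[[v]][lvl] / current[lvl])
    }
    # Summed absolute difference between weighted and target proportions
    total_diff <- sum(vapply(names(targets), function(v) {
      current <- weighted_share(w, as.character(data[[v]]))
      sum(abs(current[names(targets[[v]])] - targets[[v]]))
    }, numeric(1)))
    if (total_diff < threshold) break
  }
  # Rescale so the weights average 1
  w / mean(w)
}

toy <- data.frame(
  sex = c("F", "F", "F", "M", "M", "M", "M", "M"),
  own = c("yes", "no", "yes", "yes", "no", "no", "no", "yes"),
  stringsAsFactors = FALSE
)
toy_targets <- list(
  sex = c(F = 0.5, M = 0.5),
  own = c(yes = 0.6, no = 0.4)
)

toy$weight <- rake_sketch(toy, toy_targets)

# Weighted marginal proportions after raking
weighted_share(toy$weight, toy$sex)
weighted_share(toy$weight, toy$own)
```

After the loop stops, the weighted shares of sex and own in the toy data should agree with toy_targets to within the threshold, which is the property the description above attributes to a converged run.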

Usage

iterake(universe, wgt.name = "weight", max.wgt = 3,
  threshold = 1e-10, max.iter = 50, stuck.limit = 5,
  permute = FALSE, summary = TRUE)

Arguments

universe

Output object created with universe() function.

wgt.name

Name given to column of weights to be added to data, default is "weight", optional.

max.wgt

Maximum value weights can take on, default is 3, optional. The capping takes place prior to applying the expansion factor (if N is set in universe()).

threshold

Convergence threshold: the algorithm stops once the summed difference between the weighted marginal proportions of the sample and those of the universe falls below this value, default is 1e-10, optional.

max.iter

Value capping number of iterations for the procedure, default is 50, optional.

stuck.limit

Value capping the number of times summed differences between sample and universe can oscillate between increasing and decreasing, default is 5, optional.

permute

Boolean indicating whether to test all possible orders of categories in universe and keep the most efficient (TRUE) or to test categories in the order listed in universe only (default, FALSE), optional. Note that setting this to TRUE increases runtime by a factor of (number of categories)!, that is, the factorial of the number of categories.

summary

Whether or not to display summary output of the procedure, default is TRUE, optional.
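Two of the arguments above can be illustrated with plain arithmetic. The sketch below uses base R only and does not call iterake(). The weights, the population size N, and the assumption that the expansion factor equals N divided by the sample size are hypothetical and serve only to show the order of operations described for max.wgt.

```r
# permute = TRUE: one raking run per ordering of the categories.
# The example on this page uses four categories
# (Sex, BirthYear, EyeColor, HomeOwner).
n_categories <- 4
n_orderings <- factorial(n_categories)
n_orderings
# 24

# max.wgt: weights are capped first and expanded afterwards.
# ASSUMPTION: the expansion factor is taken to be N / n for illustration.
raw_wgt <- c(0.4, 0.9, 1.2, 3.8, 5.1)  # hypothetical uncapped weights
max.wgt <- 3
capped <- pmin(raw_wgt, max.wgt)        # cap applied on the unexpanded scale
N <- 1000                               # hypothetical population size
expanded <- capped * (N / length(capped))
```

With five categories the count rises to factorial(5) = 120 orderings, so permute = TRUE is best reserved for universes with few categories.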

Value

Data frame with the resulting weight variable appended to it.

Examples

data(demo_data)

iterake(
    universe = universe(
        data = demo_data,
        
        category(
            name = "Sex",
            buckets = factor(
                x = levels(demo_data[["Sex"]]),
                levels = levels(demo_data[["Sex"]])
            ),
            # these targets sum to 0.9; sum.1 = TRUE asks category()
            # to rescale them so they sum to 1
            targets = c(0.4, 0.5),
            sum.1 = TRUE
        ),

        category(
            name = "BirthYear",
            buckets = c(1986:1990),
            targets = rep(0.2, times = 5)
        ),
    
        category(
            name = "EyeColor",
            buckets = c("brown", "green", "blue"),
            targets = c(0.8, 0.1, 0.1)
        ),
    
        category(
            name = "HomeOwner",
            buckets = c(TRUE, FALSE),
            targets = c(3/4, 1/4)
        )
    )
)

ttrodrigz/iterake documentation built on July 1, 2020, 7:46 a.m.