sample_balance: Sample balancing processing with a k-factor.

Description

Sample balancing processing with a k-factor.

Usage

sample_balance(data = data, ID = ID, targstr = targstr,
  dweights = rep(1, nrow(data)), cap = T, typeofcap = 1, capval = 6,
  floor = F, floorval = -Inf, eps = 0.01, rounding = F, klimit = F,
  klimitval = Inf, maxiter = 20000, random = FALSE, out = NULL)

Arguments

data

Data to be sample balanced.

ID

Unique ID of the dataset.

targstr

Target structure returned by the function sample_balance_init.

dweights

Starting weights, one per row of data. Default: rep(1, nrow(data)).

cap

Cap flag. If TRUE, a cap is applied during the sample balancing process. Default: TRUE.

typeofcap

Type of cap. If 1, cap at n*mean; if 2, cap at mean +/- n*sd, where n is given by capval. Default: 1 (mean cap).

capval

The cap value. If the mean cap is used, then during each iteration any new weight larger than capval*mean is adjusted back to capval*mean. If the sd cap is used, then during each iteration any new weight larger than mean+capval*sd(weight) is adjusted back to mean+capval*sd(weight), and any new weight smaller than mean-capval*sd(weight) is adjusted up to mean-capval*sd(weight). Default: 6.
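The two cap rules can be sketched in base R as follows. This is an illustration only, not the package's code: apply_cap is a hypothetical helper, and it assumes the mean and standard deviation are taken from the current weight vector within an iteration.

```r
# Hypothetical helper (not part of the package): clamp a weight vector
# following the two cap types described above.
apply_cap <- function(w, typeofcap = 1, capval = 6) {
  m <- mean(w)  # assumption: mean of the current weights
  if (typeofcap == 1) {
    # Mean cap: upper bound only, at capval * mean.
    pmin(w, capval * m)
  } else {
    # SD cap: clamp to the interval mean +/- capval * sd.
    s <- sd(w)
    pmax(pmin(w, m + capval * s), m - capval * s)
  }
}

w <- c(1, 1, 1, 1, 16)                    # mean is 4
apply_cap(w, typeofcap = 1, capval = 2)   # the weight 16 is cut back to 2 * 4 = 8
apply_cap(w, typeofcap = 2, capval = 1)   # the weight 16 is cut back to 4 + sd(w)
```

Note that the mean cap only bounds weights from above, while the sd cap bounds them on both sides.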

floor

Floor flag. Indicates whether a floor value is applied during the sample balancing process. Setting this to TRUE is recommended when no cap is used. Default: FALSE.

floorval

Floor value. During each iteration, any new weight smaller than floorval is adjusted up to floorval. Default: -Inf.
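As a minimal base R sketch of the floor rule (the weight values here are made up for illustration and the variable names are not taken from the package):

```r
# Illustrative weights after some iteration; values are hypothetical.
neww     <- c(20, 75, 300)
floorval <- 50

# Raise every weight below the floor up to the floor.
floored <- pmax(neww, floorval)      # 50 75 300

# With the default floorval = -Inf the floor has no effect.
unchanged <- pmax(neww, -Inf)        # 20 75 300
```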

eps

Epsilon. The convergence threshold of the iterative algorithm. The algorithm converges when the difference is less than eps. Default: 0.01.

rounding

Rounding flag. Indicates whether the weights are rounded during the sample balancing process. Default: FALSE.

klimit

Klimit flag. Indicates whether a klimit value is used during the sample balancing process. Default: FALSE.

klimitval

Klimit value. When a klimit value is applied, two new vectors are created before iteration: maxneww=dweights*klimitval and minneww=dweights/klimitval. During each iteration, any new weight larger than maxneww is adjusted back to maxneww, and any new weight lower than minneww is adjusted up to minneww. This option works similarly to the cap option. Default: Inf.
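The klimit bounds can be sketched in base R like this. The names maxneww and minneww follow the description above; the input values and the name neww are made up for illustration, and this is not the package's implementation.

```r
dweights  <- c(1, 2, 4)   # starting weights (illustrative values)
klimitval <- 2

# Per-row bounds, computed once before iterating.
maxneww <- dweights * klimitval   # 2 4 8
minneww <- dweights / klimitval   # 0.5 1 2

# Weights produced by some iteration (illustrative values).
neww <- c(3, 0.5, 5)

# Clamp each weight into its own [minneww, maxneww] interval.
clamped <- pmax(pmin(neww, maxneww), minneww)   # 2 1 5
```

Unlike the cap, which uses one bound for the whole vector, each row here gets its own interval, scaled from its starting weight.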

maxiter

Maximum number of iterations in the sample balancing process. Default: 20000.

random

Randomization flag. If TRUE, the data are randomized before the sample balancing process. Default: FALSE.

out

The filename of the spreadsheet that receives the sample balancing output. The file is saved in the current working directory. If NULL, no output file is generated.

Value

Returns a list with 2 elements. The first element contains sample balancing information such as convergence, iterations, DEFF, capval, statistical efficiency, etc. The second element is a data frame with two columns: ID and the new balanced (capped) weights.

Examples

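library(haven)     # provides read_sav()
library(openxlsx)  # assumption: read.xlsx() comes from openxlsx (the xlsx package has a function of the same name)

data <- read_sav("OnlineWgts.sav")
targ <- read.xlsx("targ.xlsx")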

targstr <- sample_balance_init(data=data, targ=targ)

# Cap at 6 * mean
results.sb <- sample_balance(data = data,
                            ID = "BOOK_ID",
                            targstr = targstr,
                            dweights = wgt,
                            cap = T,
                            typeofcap = 1,
                            capval = 6,
                            floor = T,
                            floorval = 50,
                            eps = .01,
                            rounding = F,
                            klimit = F,
                            klimitval = Inf,
                            out = "out.xlsx")

yangx227/SimmonsResearchR documentation built on Oct. 5, 2017, 4:09 p.m.