sample_balance | R Documentation |
Sample balancing processing with a k-factor.
sample_balance(data = data, ID = ID, targstr = targstr,
  dweights = rep(1, nrow(data)), cap = T, typeofcap = 1, capval = 6,
  floor = F, floorval = -Inf, eps = 0.01, rounding = F, klimit = F,
  klimitval = Inf, maxiter = 20000, random = FALSE, out = NULL)
data |
Data to be sample balanced. |
ID |
Unique ID of the dataset. |
targstr |
Targ structure returned by the function |
dweights |
Weights in the data, one per row. Default: rep(1, nrow(data)) |
cap |
Cap flag. If TRUE, a cap is applied during the sample balancing process. Default: TRUE. |
typeofcap |
Type of cap. If 1, cap by n*mean; if 2, cap by mean +/- n*sd, where n is given by capval. Default: 1 (mean cap). |
capval |
The cap value. If mean cap is used, during each iteration, when new weights are larger than |
floor |
Floor flag. Indicates whether a floor value is used in the sample balancing process. Default: FALSE. When no cap is used, setting this option to TRUE is recommended. |
floorval |
Floor value. During each iteration, new weights smaller than floorval are adjusted up to floorval. Default: -Inf. |
eps |
Epsilon. The convergence threshold of the iterative algorithm: the algorithm converges when the difference is less than eps. Default: 0.01 |
rounding |
Rounding flag. Indicates whether the weights are rounded in the sample balancing process. Default: FALSE. |
klimit |
Klimit flag. Indicates whether a klimit value is used in the sample balancing process. Default: FALSE. |
klimitval |
Klimit value. When a klimit value is applied, two new vectors are created before iteration: |
maxiter |
Maximum number of iterations in the sample balancing process. Default: 20000. |
random |
Randomization flag. If TRUE, the data are randomized before the sample balancing process. Default: FALSE. |
out |
The filename of the spreadsheet that contains the output information of sample balancing. The file is saved in the current working directory. If NULL, no output file is generated. |
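The cap and floor arguments above can be pictured with a small sketch. The helper apply_cap_floor() is hypothetical and not part of this package; it assumes that weights beyond a bound are simply truncated to that bound, which the argument descriptions suggest but do not fully spell out.

```r
# Hypothetical helper, not part of the package: applies the cap and floor
# rules described in the argument list once to a weight vector.
# Assumption: weights outside a bound are truncated to that bound.
apply_cap_floor <- function(w, typeofcap = 1, capval = 6, floorval = -Inf) {
  if (typeofcap == 1) {
    # Mean cap: upper bound is capval * mean of the weights
    upper <- capval * mean(w)
    lower <- -Inf
  } else {
    # SD cap: bounds are mean +/- capval * sd of the weights
    upper <- mean(w) + capval * sd(w)
    lower <- mean(w) - capval * sd(w)
  }
  w <- pmin(pmax(w, lower), upper)
  # Floor: raise anything below floorval up to floorval
  pmax(w, floorval)
}

w <- c(0.2, 0.5, 1, 1, 1.3, 20)          # mean(w) is 4
capped <- apply_cap_floor(w, typeofcap = 1, capval = 2, floorval = 0.5)
print(capped)                             # 0.5 0.5 1.0 1.0 1.3 8.0
```

With capval = 2 the upper bound is 2 * 4 = 8, so the weight 20 is pulled down to 8, and the floor raises 0.2 to 0.5.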
Returns a list with 2 elements. The first element contains sample balancing information such as convergence, iterations, DEFF, capval, statistical efficiency, etc. The second element is a data frame with two columns: ID and the new balanced (capped) weights.
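This page does not define how DEFF and statistical efficiency are computed. A common choice is Kish's approximation, sketched below as an assumption rather than as this package's formula; kish_deff() is a hypothetical helper name.

```r
# Hypothetical helper, not part of the package.
# Kish's approximate design effect from weighting: n * sum(w^2) / sum(w)^2.
kish_deff <- function(w) {
  length(w) * sum(w^2) / sum(w)^2
}

w <- c(1, 3)
deff <- kish_deff(w)      # 2 * 10 / 16 = 1.25
efficiency <- 1 / deff    # 0.8, i.e. 80% statistical efficiency
print(c(deff = deff, efficiency = efficiency))
```

Equal weights give a DEFF of 1; the more the weights vary, the larger the DEFF and the lower the efficiency.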
library(haven)  # provides read_sav()
# read.xlsx() comes from a spreadsheet package such as openxlsx
data <- read_sav("OnlineWgts.sav")
targ <- read.xlsx("targ.xlsx")
targstr <- sample_balance_init(data = data, targ = targ)

# Cap using 6*means
results.sb <- sample_balance(data = data, ID = "BOOK_ID", targstr = targstr,
                             dweights = wgt, cap = T, typeofcap = 1, capval = 6,
                             floor = T, floorval = 50, eps = .01, rounding = F,
                             klimit = F, klimitval = Inf, out = "out.xlsx")
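The eps and maxiter arguments describe an iterative fit. As a rough illustration only, the sketch below runs plain raking (iterative proportional fitting) without caps, floors, or k-limits. The function rake_sketch(), the toy data, and the choice of "largest gap between weighted totals and targets" as the convergence measure are all assumptions; sample_balance() may differ in detail.

```r
# Hypothetical illustration, not the package's implementation.
# targets: named list, one named numeric vector of target totals per variable.
rake_sketch <- function(data, targets, w = rep(1, nrow(data)),
                        eps = 0.01, maxiter = 20000) {
  for (iter in seq_len(maxiter)) {
    max_diff <- 0
    for (v in names(targets)) {
      current <- tapply(w, data[[v]], sum)   # weighted totals by level
      lev <- names(current)
      cur <- as.numeric(current)
      tgt <- as.numeric(targets[[v]][lev])
      max_diff <- max(max_diff, abs(cur - tgt))
      # Scale each row so this variable's totals match its targets
      ratio <- setNames(tgt / cur, lev)
      w <- w * unname(ratio[as.character(data[[v]])])
    }
    # Assumed stopping rule: every margin was within eps during this pass
    if (max_diff < eps) {
      return(list(weights = w, iterations = iter, converged = TRUE))
    }
  }
  list(weights = w, iterations = maxiter, converged = FALSE)
}

toy <- data.frame(sex = c("M", "M", "F", "F"),
                  age = c("young", "old", "young", "old"),
                  stringsAsFactors = FALSE)
toy_targets <- list(sex = c(M = 40, F = 60), age = c(young = 50, old = 50))
res <- rake_sketch(toy, toy_targets)
print(res$weights)   # 20 20 30 30
```

A cap or floor step like the ones described in the arguments would be applied to the weights inside the loop, which is why capped runs can need more iterations to converge.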