sample_balance: Sample balancing processing with a k-factor.
In yangx227/SimmonsResearchR: An implementation of R functions used in Stat team.

sample_balance

R Documentation

Sample balancing processing with a k-factor.

Description

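Runs the iterative sample balancing (weighting) process with optional weight caps, a floor, and a k-factor limit, and returns the balanced weights together with diagnostic information.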

Usage

sample_balance(data = data, ID = ID, targstr = targstr,
  dweights = rep(1, nrow(data)), cap = T, typeofcap = 1, capval = 6,
  floor = F, floorval = -Inf, eps = 0.01, rounding = F, klimit = F,
  klimitval = Inf, maxiter = 20000, random = FALSE, out = NULL)

Arguments

`data`	Data to be sample balanced.
`ID`	Unique ID of the dataset.
`targstr`	Targ structure returned by the function `sample_balance_init`.
`dweights`	Weight in the data. Default: rep(1,nrow(data))
`cap`	Cap flag. If TRUE, a cap is applied during the sample balancing process. Default: TRUE.
`typeofcap`	Type of cap. If 1, cap by `n*mean`; if 2, cap by `mean +/- n*sd`, where `n` is set by `capval`. Default: 1 (mean cap).
`capval`	The cap value. If the mean cap is used, then during each iteration new weights larger than `capval*mean` are adjusted back to `capval*mean`. If the std cap is used, then during each iteration new weights larger than `mean + capval*sd(weight)` are adjusted back to `mean + capval*sd(weight)`, and new weights smaller than `mean - capval*sd(weight)` are adjusted up to `mean - capval*sd(weight)`. Default: 6.
`floor`	Floor flag. Indicates whether a floor value is used in the sample balancing process. Default: FALSE. When no cap is used, setting this option to TRUE is recommended.
`floorval`	Floor value. During each iteration, new weights smaller than `floorval` are adjusted up to `floorval`. Default: -Inf.
`eps`	Epsilon. The convergence threshold of the iterative algorithm. The algorithm converges when the difference is less than `eps`. Default: 0.01.
`rounding`	Round flag. Indicate if the weights will be rounded in sample balancing process. Default: FALSE.
`klimit`	Klimit flag. Indicate if a klimit value will be used in sample balancing process. Default: FALSE.
`klimitval`	Klimit value. When a klimit value is applied, two new vectors are created before iteration: `maxneww = dweights*klimitval` and `minneww = dweights/klimitval`. During each iteration, new weights larger than `maxneww` are adjusted back to `maxneww`, and new weights lower than `minneww` are adjusted up to `minneww`. This option works similarly to the cap option. Default: Inf.
`maxiter`	Maximum number of iterations in the sample balancing process. Default: 20000.
`random`	Randomization flag. If TRUE, the data are randomized before the sample balancing process. Default: FALSE.
`out`	The filename of the spreadsheet which contains the output information of sample balancing. The file is saved in current working directory. If NULL, no output file is generated.
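
The bounding rules described for `cap`, `floor`, and `klimit` can be illustrated with a small standalone sketch. This is not the package's code: `bound_weights` is a hypothetical helper, and both the order in which the bounds are applied and the use of the mean and standard deviation of the new weights (rather than the design weights) are assumptions made for illustration.

```r
# Hypothetical helper (not part of SimmonsResearchR): bound a vector of new
# weights once, following the argument descriptions above.
bound_weights <- function(neww, dweights, cap = TRUE, typeofcap = 1, capval = 6,
                          floor = FALSE, floorval = -Inf,
                          klimit = FALSE, klimitval = Inf) {
  if (cap) {
    m <- mean(neww)  # assumption: mean of the current new weights
    if (typeofcap == 1) {
      # Mean cap: no weight may exceed capval * mean
      neww <- pmin(neww, capval * m)
    } else {
      # Std cap: keep weights within mean +/- capval * sd
      s <- sd(neww)
      neww <- pmin(pmax(neww, m - capval * s), m + capval * s)
    }
  }
  if (floor) {
    # Floor: raise any weight below floorval up to floorval
    neww <- pmax(neww, floorval)
  }
  if (klimit) {
    # K-limit: keep each weight within dweights / k and dweights * k
    minneww <- dweights / klimitval
    maxneww <- dweights * klimitval
    neww <- pmin(pmax(neww, minneww), maxneww)
  }
  neww
}

dweights <- rep(1, 5)
neww <- c(0.2, 0.5, 1, 2, 20)

capped   <- bound_weights(neww, dweights, cap = TRUE, capval = 2)
floored  <- bound_weights(neww, dweights, cap = FALSE, floor = TRUE, floorval = 0.4)
klimited <- bound_weights(neww, dweights, cap = FALSE, klimit = TRUE, klimitval = 4)
```

With the defaults `floorval = -Inf` and `klimitval = Inf`, the floor and k-limit steps leave the weights unchanged even when their flags are switched on, which matches the defaults shown in Usage.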

Value

Returns a list with two elements. The first element contains sample balancing information such as convergence status, number of iterations, DEFF, capval, and statistical efficiency. The second element is a data frame with two columns: ID and the new balanced (capped) weights.
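
As a rough guide to reading the DEFF and statistical efficiency figures, the sketch below computes Kish's approximate design effect from a weight vector. Whether `sample_balance` uses exactly these formulas is an assumption; `kish_deff` and `stat_efficiency` are hypothetical names introduced only for this illustration.

```r
# Kish's approximate design effect: n * sum(w^2) / (sum(w))^2
kish_deff <- function(w) {
  length(w) * sum(w^2) / sum(w)^2
}

# Statistical efficiency, taken here as 1 / DEFF
# (equivalently, effective sample size divided by actual sample size)
stat_efficiency <- function(w) {
  1 / kish_deff(w)
}

w <- c(1, 1, 2, 4)
deff <- kish_deff(w)        # 4 * 22 / 64 = 1.375
eff  <- stat_efficiency(w)  # 1 / 1.375

# Equal weights give a design effect of exactly 1
deff_equal <- kish_deff(rep(3, 10))
```

Under this definition, more variable weights give a larger DEFF and lower efficiency, which is why caps, floors, and k-limits trade some balancing accuracy for more stable weights.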

Examples

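library(SimmonsResearchR)
library(haven)     # read_sav()
library(openxlsx)  # read.xlsx(); assumed source of this function

data <- read_sav("OnlineWgts.sav")
targ <- read.xlsx("targ.xlsx")

# Build the target structure required by sample_balance()
targstr <- sample_balance_init(data = data, targ = targ)

# Cap weights at 6 times the mean weight and apply a floor of 50.
# `wgt` is assumed to be a design-weight vector defined beforehand.
results.sb <- sample_balance(data = data,
                             ID = "BOOK_ID",
                             targstr = targstr,
                             dweights = wgt,
                             cap = TRUE,
                             typeofcap = 1,
                             capval = 6,
                             floor = TRUE,
                             floorval = 50,
                             eps = .01,
                             rounding = FALSE,
                             klimit = FALSE,
                             klimitval = Inf,
                             out = "out.xlsx")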

yangx227/SimmonsResearchR documentation built on April 24, 2022, 6:44 a.m.