kw.gbm: Calculate KW-GBM pseudo-weights
In chkern/KWML: Boosted Kernel Weighting

View source: R/weighting_functions.R

kw.gbm

R Documentation

Description

This function computes kernel-weighting (KW) pseudo-weights, using gradient tree boosting (`gbm()` in the `gbm` package) to estimate the propensity scores.
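To make the kernel-weighting step concrete, the following is a minimal sketch of the general idea, assuming propensity scores are already available. It is not the package's implementation: `kw_sketch`, its arguments, and the toy data are hypothetical, and the use of raw propensity-score differences with a normal kernel is an assumption. Each survey unit's weight is split among the non-probability units in proportion to kernel similarity of their propensity scores.

```r
# Hypothetical helper: illustrates the general kernel-weighting idea only.
kw_sketch <- function(p_c, p_s, w_s, h, kernel = dnorm) {
  # Kernel similarity between every cohort unit (rows) and survey unit (columns)
  k <- kernel(outer(p_c, p_s, "-") / h)
  # Normalise each column so that one survey unit's shares sum to 1
  shares <- sweep(k, 2, colSums(k), "/")
  # Each cohort unit collects its share of every survey weight
  as.vector(shares %*% w_s)
}

set.seed(1)
p_c <- runif(50, 0.2, 0.8)   # toy propensity scores, non-probability sample
p_s <- runif(30, 0.1, 0.7)   # toy propensity scores, probability sample
w_s <- runif(30, 50, 150)    # toy survey weights
pw  <- kw_sketch(p_c, p_s, w_s, h = 0.1)

# With a strictly positive kernel, the pseudo-weights add up to the
# survey weight total
all.equal(sum(pw), sum(w_s))
```

In this sketch every survey unit is matched because the normal kernel is never zero; with a kernel of bounded support, some survey units may have no match, which is presumably what `rm.s` addresses.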

Usage

kw.gbm(
  psa_dat,
  wt,
  rsp_name,
  formula,
  tune_idepth,
  tune_ntree,
  covars,
  h = NULL,
  krn = "triang",
  large = F,
  rm.s = F
)

Arguments

`psa_dat`	Dataframe of the combined non-probability and probability sample
`wt`	Name of the weight variable in `psa_dat` (common weights of 1 for non-probability sample, and survey weights for probability sample)
`rsp_name`	Name of the non-probability sample membership indicator in `psa_dat` (1 for non-probability sample units, and 0 for probability sample units)
`formula`	Formula of the propensity model (see `gbm()` in `gbm` package)
`tune_idepth`	A vector of values for the tuning parameter `interaction.depth` (see `gbm()` in `gbm` package)
`tune_ntree`	A vector of values for the tuning parameter `n.trees` (see `gbm()` in `gbm` package)
`covars`	A vector of covariate names for standardized mean differences (SMD; covariate balance) calculation
`h`	Bandwidth parameter. If `NULL` (the default), a bandwidth corresponding to the chosen kernel function is calculated.
`krn`	Kernel function. "`triang`": triangular density on (-3, 3), "`dnorm`": standard normal density, "`dnorm_t`": truncated standard normal density on (-3, 3). Default is "`triang`".
`large`	Logical. Set to `TRUE` if the cohort is so large that it has to be divided into pieces. Default is `FALSE`.
`rm.s`	Logical. Whether to remove unmatched survey units. Default is `FALSE`.
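The three `krn` options can be written out as plain R functions to see how they differ. This is a sketch under assumptions: the function names are hypothetical and the formulas are the standard forms of these densities, not copied from the package source.

```r
# Illustrative kernel definitions (assumed forms, not taken from the package source)
triang_k  <- function(u) ifelse(abs(u) < 3, (1 - abs(u) / 3) / 3, 0)
dnorm_k   <- function(u) dnorm(u)
dnorm_t_k <- function(u) {
  ifelse(abs(u) < 3, dnorm(u) / (pnorm(3) - pnorm(-3)), 0)
}

# Each kernel is a density, so it should integrate to 1 over its support
areas <- c(
  triang  = integrate(triang_k, -3, 3)$value,
  dnorm   = integrate(dnorm_k, -Inf, Inf)$value,
  dnorm_t = integrate(dnorm_t_k, -3, 3)$value
)
print(round(areas, 4))
```

The triangular and truncated normal kernels are exactly zero outside (-3, 3), so units whose scaled propensity-score distance exceeds 3 receive no share of a survey weight; the untruncated normal kernel always gives a positive share.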

Value

A list with the following components:
`pswt`: A dataframe of KW pseudo-weights for each value of `tune_idepth`, using the best value of `tune_ntree`
`best`: Identifier of the set of KW pseudo-weights in `pswt` with the smallest SMD
`smds`: A vector of SMDs, one for each set of KW pseudo-weights
`p_score_c`: A dataframe of propensity scores for non-probability sample units under each tuning parameter setting
`p_score_s`: A dataframe of propensity scores for probability sample units under each tuning parameter setting
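The role of `smds` and `best` can be illustrated with a small self-contained sketch that scores candidate weight sets by covariate balance and picks the one with the smallest SMD. The exact SMD definition used by the package is not given here, so the formula below (absolute difference in weighted means divided by the weighted standard deviation in the probability sample, averaged over covariates) is an assumption, and `wtd_smd` and the toy data are hypothetical.

```r
# Hypothetical helper: weighted SMD averaged over covariates (assumed definition)
wtd_smd <- function(x_c, w_c, x_s, w_s) {
  smd_one <- function(v) {
    m_c <- weighted.mean(x_c[[v]], w_c)
    m_s <- weighted.mean(x_s[[v]], w_s)
    # Weighted standard deviation in the probability (reference) sample
    sd_s <- sqrt(weighted.mean((x_s[[v]] - m_s)^2, w_s))
    abs(m_c - m_s) / sd_s
  }
  mean(vapply(names(x_c), smd_one, numeric(1)))
}

set.seed(2)
x_s <- data.frame(x1 = rnorm(200), x2 = rnorm(200))              # probability sample
w_s <- rep(10, 200)                                              # toy survey weights
x_c <- data.frame(x1 = rnorm(100, 0.5), x2 = rnorm(100, -0.3))   # non-probability sample

# Two toy candidate pseudo-weight sets, one column each
cand <- data.frame(
  w_a = rep(1, 100),
  w_b = exp(-0.5 * x_c$x1 + 0.3 * x_c$x2)
)

# One SMD per candidate, then the name of the best-balanced candidate
smds <- vapply(cand, function(w) wtd_smd(x_c, w, x_s, w_s), numeric(1))
best <- names(which.min(smds))
best_w <- cand[, best]
```

The last line mirrors how `pswt[, best]` is used to extract the selected pseudo-weights from the returned list.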

Examples

# KW-GBM with example data
kwgbm <- kw.gbm(simu_dat, "wt", "trt",
                "trt ~ x1+x2+x3+x4+x5+x6+x7",
                tune_idepth = 1:3,
                tune_ntree = c(250, 500),
                covars = c("x1","x2","x3","x4","x5","x6","x7"))
# Select KW-GBM pseudo-weights with best covariate balance
kwgbm_w <- kwgbm$pswt[, kwgbm$best]
# Compute weighted mean of y in non-prob data
sum(simu_dat$y[simu_dat$trt == 1] * kwgbm_w) / sum(kwgbm_w)

chkern/KWML documentation built on Sept. 10, 2022, 9:49 p.m.