kw.gbm: Calculate KW-GBM pseudo-weights

View source: R/weighting_functions.R

kw.gbm    R Documentation

Calculate KW-GBM pseudo-weights

Description

This function computes kernel-weighting (KW) pseudo-weights, using gradient tree boosting (gbm() in the gbm package) to predict the propensity scores.

Usage

kw.gbm(
  psa_dat,
  wt,
  rsp_name,
  formula,
  tune_idepth,
  tune_ntree,
  covars,
  h = NULL,
  krn = "triang",
  large = F,
  rm.s = F
)

Arguments

psa_dat

Data frame containing the combined non-probability and probability samples

wt

Name of the weight variable in psa_dat (common weights of 1 for non-probability sample, and survey weights for probability sample)

rsp_name

Name of the non-probability sample membership indicator in psa_dat (1 for non-probability sample units, and 0 for probability sample units)
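As a rough illustration of the layout that psa_dat, wt and rsp_name describe, the sketch below stacks two toy samples with base R. The data frames cohort and survey, the column svy_wt and all values are hypothetical and are not part of KWML; only the conventions (weight 1 and indicator 1 for non-probability units, survey weights and indicator 0 for probability units) come from the argument descriptions above.

```r
# Hypothetical toy data: 'cohort' is the non-probability sample,
# 'survey' is the probability sample with design weights 'svy_wt'.
set.seed(1)
cohort <- data.frame(x1 = rnorm(5), y = rnorm(5))
survey <- data.frame(x1 = rnorm(4), y = NA_real_,
                     svy_wt = c(120, 80, 150, 100))

# Common weight of 1 and membership indicator 1 for non-probability units
cohort$wt  <- 1
cohort$trt <- 1

# Survey weights and membership indicator 0 for probability units
survey$wt  <- survey$svy_wt
survey$trt <- 0

# Stack the two samples on their shared columns
cols    <- c("x1", "y", "wt", "trt")
psa_dat <- rbind(cohort[, cols], survey[, cols])
```

With data in this shape, wt = "wt" and rsp_name = "trt" would be the values passed to kw.gbm().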

formula

Formula of the propensity model (see gbm() in the gbm package)

tune_idepth

A vector of values for the tuning parameter interaction.depth (see gbm() in the gbm package)

tune_ntree

A vector of values for the tuning parameter n.trees (see gbm() in the gbm package)

covars

A vector of covariate names for standardized mean differences (SMD; covariate balance) calculation
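For orientation, one common way to define a weighted SMD for a single covariate is the difference in weighted means between the two samples, divided by the weighted standard deviation in the reference sample. The helper weighted_smd below is a hypothetical sketch under that assumption; the exact definition used inside kw.gbm() may differ.

```r
# Hypothetical helper (not part of KWML): weighted SMD for one covariate.
# x: covariate values, w: weights, g: membership indicator (1/0).
weighted_smd <- function(x, w, g) {
  m1 <- weighted.mean(x[g == 1], w[g == 1])
  m0 <- weighted.mean(x[g == 0], w[g == 0])
  # Weighted variance in the reference (g == 0) sample
  v0 <- weighted.mean((x[g == 0] - m0)^2, w[g == 0])
  (m1 - m0) / sqrt(v0)
}

# Toy values, made up for illustration
x <- c(1, 2, 3, 4, 0, 2, 4, 6)
g <- c(1, 1, 1, 1, 0, 0, 0, 0)
w <- c(1, 1, 1, 1, 10, 10, 10, 10)
smd_x <- weighted_smd(x, w, g)
```

Values closer to zero indicate better balance on that covariate.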

h

Bandwidth parameter. If not specified (NULL), it is calculated according to the chosen kernel function.

krn

Kernel function. "triang": triangular density on (-3, 3), "dnorm": standard normal density, "dnorm_t": truncated standard normal density on (-3, 3).
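The three kernel shapes named above can be written down in a few lines of base R. This is a minimal sketch based only on the verbal descriptions; the function names k_triang, k_dnorm and k_dnorm_t are hypothetical, and the package's internal code (for example, how the bandwidth h rescales the argument) may be parameterized differently.

```r
# Triangular density on (-3, 3): peak of 1/3 at 0, falling linearly to 0 at +/-3
k_triang <- function(u) pmax(1 - abs(u) / 3, 0) / 3

# Standard normal density
k_dnorm <- function(u) dnorm(u)

# Standard normal density truncated to (-3, 3), rescaled to integrate to 1
k_dnorm_t <- function(u) {
  ifelse(abs(u) < 3, dnorm(u) / (pnorm(3) - pnorm(-3)), 0)
}

# Each truncated kernel should integrate to (approximately) 1 over (-3, 3)
area_triang  <- integrate(k_triang,  -3, 3)$value
area_dnorm_t <- integrate(k_dnorm_t, -3, 3)$value
```

The two kernels on (-3, 3) assign zero weight to units whose scaled propensity-score distance reaches 3 or more, whereas the untruncated normal kernel gives every unit some positive weight.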

large

Logical. Set to TRUE if the cohort is so large that it has to be divided into pieces for processing. Default is FALSE.

rm.s

Logical. Whether to remove unmatched survey units. Default is FALSE.

Value

A list with the following components:
pswt: A dataframe including KW pseudo-weights for each setting of tune_idepth with the best setting of tune_ntree
best: Identifier for the KW pseudo-weights in pswt with the smallest SMD
smds: A vector of SMDs, one for each set of KW pseudo-weights
p_score_c: A dataframe including propensity scores for non-probability sample units for each tuning parameter setting
p_score_s: A dataframe including propensity scores for probability sample units for each tuning parameter setting

Examples

# KW-GBM with example data
kwgbm <- kw.gbm(simu_dat, "wt", "trt",
                "trt ~ x1+x2+x3+x4+x5+x6+x7",
                tune_idepth = 1:3,
                tune_ntree = c(250, 500),
                covars = c("x1","x2","x3","x4","x5","x6","x7"))
# Select KW-GBM pseudo-weights with best covariate balance
kwgbm_w <- kwgbm$pswt[, kwgbm$best]
# Compute the weighted mean of y in the non-probability sample
sum(simu_dat$y[simu_dat$trt == 1] * kwgbm_w) / sum(kwgbm_w)

chkern/KWML documentation built on Sept. 10, 2022, 9:49 p.m.