ps | R Documentation |
ps calculates propensity scores using gradient boosted logistic regression and diagnoses the resulting propensity scores using a variety of methods.
ps(
formula = formula(data),
data,
n.trees = 10000,
interaction.depth = 3,
shrinkage = 0.01,
bag.fraction = 1,
n.minobsinnode = 10,
perm.test.iters = 0,
print.level = 2,
verbose = TRUE,
estimand = "ATE",
stop.method = c("ks.mean", "es.mean"),
sampw = NULL,
version = "gbm",
ks.exact = NULL,
n.keep = 1,
n.grid = 25,
keep.data = TRUE,
...
)
formula |
An object of class |
data |
A dataset that includes the treatment indicator as well as the potential confounding variables. |
n.trees |
Number of gbm iterations passed on to |
interaction.depth |
A positive integer denoting the tree depth used in gradient boosting. Default: 3. |
shrinkage |
A numeric value between 0 and 1 denoting the learning rate.
See |
bag.fraction |
A numeric value between 0 and 1 denoting the fraction of
the observations randomly selected in each iteration of the gradient
boosting algorithm to propose the next tree. See |
n.minobsinnode |
An integer specifying the minimum number of observations
in the terminal nodes of the trees used in the gradient boosting. See
|
perm.test.iters |
A non-negative integer giving the number of iterations
of the permutation test for the KS statistic. If |
print.level |
The amount of detail to print to the screen. Default: 2. |
verbose |
If |
estimand |
|
stop.method |
A method or methods of measuring and summarizing balance across pretreatment
variables. Current options are |
sampw |
Optional sampling weights. |
version |
Default: |
ks.exact |
|
n.keep |
A numeric variable indicating the algorithm should only
consider every |
n.grid |
A numeric variable that sets the grid size for an initial
search of the region most likely to minimize the |
keep.data |
A logical variable indicating whether or not the data is saved in
the resulting |
... |
Additional arguments that are passed to |
For users more comfortable with the options of xgboost::xgboost(), the options for ps controlling the behavior of the gradient boosting algorithm can be specified using the xgboost naming scheme. This includes nrounds, max_depth, eta, and subsample. In addition, the list of parameters passed to xgboost can be specified with params.
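As a rough illustration of the naming scheme, the sketch below rewrites a set of gbm-style arguments under xgboost-style names and builds, without evaluating it, a call to ps. The one-to-one pairing in xgb_names (n.trees with nrounds, interaction.depth with max_depth, shrinkage with eta, bag.fraction with subsample) is an assumption based on the usual meaning of these options, not something this page states. The formula, the data name lalonde and the helper names are placeholders, and version = "xgboost" is assumed to be an accepted value.

```r
# Assumed pairing of gbm-style names (from the usage section) with the
# xgboost-style names listed above; not confirmed by this help page.
xgb_names <- c(n.trees = "nrounds", interaction.depth = "max_depth",
               shrinkage = "eta", bag.fraction = "subsample")

# Tuning values written with the default (gbm-style) names.
gbm_style <- list(n.trees = 2000, interaction.depth = 3,
                  shrinkage = 0.01, bag.fraction = 1)

# The same values under xgboost-style names.
xgb_style <- setNames(gbm_style, unname(xgb_names[names(gbm_style)]))

# Build the call without running it. 'lalonde' and the formula are
# placeholders; version = "xgboost" is assumed to select the xgboost backend.
ps_call <- as.call(c(list(quote(ps),
                          formula = quote(treat ~ age + educ),
                          data = quote(lalonde),
                          version = "xgboost"),
                     xgb_style))
print(ps_call)
```

Calling eval(ps_call) would run the fit, provided twang and xgboost are installed and a data frame named lalonde is available.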
Note that unlike earlier versions of 'twang', the plotting functions are no longer included in the ps function. See plot for details of the plots.
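Because plotting is now a separate step, a typical session fits the model first and then passes the returned object to plot and bal.table. The sketch below assumes the twang package is installed and uses the lalonde data set shipped with it. The covariate list, n.trees = 2000, the seed and the choice plots = "optimize" are illustrative assumptions, not recommendations from this page.

```r
# Propensity score model: treatment indicator on the left, covariates on the right.
ps_formula <- treat ~ age + educ + black + hispan + nodegree + married + re74 + re75

# Sketch only: the fit is skipped when 'twang' is not installed.
if (requireNamespace("twang", quietly = TRUE)) {
  library(twang)
  data(lalonde)
  set.seed(1)
  fit <- ps(ps_formula,
            data = lalonde,
            n.trees = 2000,          # fewer than the default, for speed
            estimand = "ATT",
            stop.method = c("ks.mean", "es.mean"),
            verbose = FALSE)

  # Plotting is no longer done inside ps(); call plot() on the result.
  # plots = "optimize" is assumed to select the balance-versus-iteration plot.
  print(plot(fit, plots = "optimize"))

  # Balance tables, unweighted and for each stop.method.
  print(bal.table(fit))
}
```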
Returns an object of class ps, a list containing
gbm.obj
The returned gbm or xgboost object.
treat
The vector of treatment indicators.
treat.var
The treatment variable.
desc
A list containing balance tables for each method selected in stop.method. Includes a component for the unweighted analysis named “unw”. Each desc component includes a list with the following components
ess
The effective sample size of the control group.
n.treat
The number of subjects in the treatment group.
n.ctrl
The number of subjects in the control group.
max.es
The largest effect size across the covariates.
mean.es
The mean absolute effect size.
max.ks
The largest KS statistic across the covariates.
mean.ks
The average KS statistic across the covariates.
bal.tab
A (potentially large) table summarizing the quality of the weights for equalizing the distribution of features across the two groups. This table is best extracted using the bal.table method. See the help for bal.table for details on the table's contents.
n.trees
The estimated optimal number of gradient boosted iterations to optimize the loss function for the associated stop.method.
ps
A data frame containing the estimated propensity scores. Each column is associated with one of the methods selected in stop.method.
w
A data frame containing the propensity score weights. Each column is associated with one of the methods selected in stop.method. If sampling weights are given, they are incorporated into these weights.
estimand
The estimand of interest (ATT or ATE).
datestamp
Records the date of the analysis.
parameters
Saves the ps call.
alerts
Text containing any warnings accumulated during the estimation.
iters
A sequence of iterations used in the GBM fits, used by the plot function.
balance
The balance measures for the pretreatment covariates used in plotting, with a column for each stop.method.
balance.ks
The KS balance measures for the pretreatment covariates used in plotting, with a column for each covariate.
balance.es
The standard differences for the pretreatment covariates used in plotting, with a column for each covariate.
ks
The KS balance measures for the pretreatment covariates on a finer grid, with a column for each covariate.
es
The standard differences for the pretreatment covariates on a finer grid, with a column for each covariate.
n.trees
Maximum number of trees considered in GBM fit.
data
Data as specified in the data argument.
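To make the w and ess components more concrete, the sketch below shows one common way to turn estimated propensity scores into weights and to compute an effective sample size. It uses the standard inverse-probability formulas (1/p and 1/(1-p) for ATE; 1 and p/(1-p) for ATT) and the usual ESS formula (sum of weights squared, divided by the sum of squared weights). These are assumptions about what the components represent, not code taken from twang, and the helper names ps_weights and ess_of are hypothetical.

```r
# Hypothetical helper: propensity score weights from scores p and a 0/1
# treatment indicator. Standard formulas, assumed rather than taken from twang.
ps_weights <- function(p, treat, estimand = c("ATE", "ATT"), sampw = NULL) {
  estimand <- match.arg(estimand)
  if (is.null(sampw)) sampw <- rep(1, length(p))
  w <- if (estimand == "ATE") {
    ifelse(treat == 1, 1 / p, 1 / (1 - p))   # weight everyone to the full population
  } else {
    ifelse(treat == 1, 1, p / (1 - p))       # weight controls to resemble the treated
  }
  w * sampw                                  # fold in optional sampling weights
}

# Hypothetical helper: effective sample size of a set of weights.
ess_of <- function(w) sum(w)^2 / sum(w^2)

# Toy inputs: two treated and two control subjects.
p     <- c(0.8, 0.5, 0.2, 0.5)
treat <- c(1, 1, 0, 0)

w_att <- ps_weights(p, treat, "ATT")
w_ate <- ps_weights(p, treat, "ATE")

# Effective sample size of the control group under the ATT weights.
ess_ctrl <- ess_of(w_att[treat == 0])
print(w_att)
print(w_ate)
print(ess_ctrl)
```

With equal weights the effective sample size equals the number of subjects; the more the weights vary, the smaller it becomes.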
Dan McCaffrey, G. Ridgeway, Andrew Morral (2004). "Propensity Score Estimation with Boosted Regression for Evaluating Adolescent Substance Abuse Treatment", *Psychological Methods* 9(4):403-425.
gbm, xgboost, plot, bal.table