refine: Refine initial stratification

View source: R/refine.R

refine R Documentation

Refine initial stratification

Description

Refine an initial stratification by splitting each stratum or specified subset of strata into two refined strata. If no initial stratification is provided, one is first generated using prop_strat().

Usage

refine(object = NULL, z = NULL, X = NULL, strata = NULL, options = list())

Arguments

object

an optional object of class strat, typically created using strat() or as a result of a call to prop_strat(). If not provided, z and X must be specified

z

vector of treatment assignments; only used if object is not supplied

X

covariate matrix/data.frame; only used if object is not supplied

strata

vector of initial strata assignments; only used if object is not supplied. Can be NULL, in which case an initial stratification using the quintiles of the propensity score is generated using prop_strat() and the generated propensity score is also added to the X matrix as an extra covariate

options

list containing various options described in the Details below

Details

The options argument can contain any of the following elements:

  • solver: character specifying the optimization software to use. Options are "Rglpk" or "gurobi". The default is "Rglpk" unless a gurobi installation is detected, in which case it is set to "gurobi". It is recommended to use "gurobi" if available.

  • standardize: boolean indicating whether to standardize the covariates in X. Default is TRUE

  • criterion: which optimization criterion to use. Options are "max", "sum", or "combo", referring to whether to optimize the maximum standardized mean difference (SMD), the sum of all SMDs, or a combination of the maximum and the sum. The default is "combo"

  • integer: boolean indicating whether to use integer programming instead of randomized rounding of linear programs. Note that setting this to TRUE may prevent the function from ever finishing, depending on the size of the data, and is not recommended except for tiny data sets

  • wMax: how much to weight the maximum standardized mean difference compared to the sum. Only used if criterion is set to "combo". Default is 5

  • ist: which strata to split. Should be a level from the specified strata or a vector of multiple levels. Default is to split all strata

  • minsplit: the minimum number of treated and control units to allow in a refined stratum. Default is 10

  • threads: number of threads for the optimization to use with the "gurobi" solver. Uses all available threads by default
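
To make the criterion and wMax options more concrete, the following base R sketch computes standardized mean differences on made-up data and combines them in the three ways named above. It is an illustration only: the pooled-standard-deviation form of the SMD and the combination wMax * max + sum are assumptions, and the package's exact scaling and its aggregation across strata may differ.

```r
# Toy data: 8 units, 2 covariates, binary treatment (all values made up)
z <- c(1, 1, 1, 1, 0, 0, 0, 0)
X <- cbind(age = c(50, 60, 55, 65, 40, 45, 50, 45),
           sbp = c(120, 130, 125, 135, 120, 130, 125, 135))

# Absolute standardized mean difference for one covariate,
# using the pooled standard deviation (an assumed convention)
smd <- function(x, z) {
  pooled_sd <- sqrt((var(x[z == 1]) + var(x[z == 0])) / 2)
  abs(mean(x[z == 1]) - mean(x[z == 0])) / pooled_sd
}

# One SMD per covariate (named vector)
smds <- apply(X, 2, smd, z = z)

wMax <- 5                                  # same value as the documented default
crit_max   <- max(smds)                    # what "max" would target
crit_sum   <- sum(smds)                    # what "sum" would target
crit_combo <- wMax * crit_max + crit_sum   # assumed form of "combo"
```

Under this assumed form, a larger wMax puts more emphasis on the single worst-balanced covariate relative to overall balance.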

Note that setting a seed before using this function will ensure that the results are reproducible on the same machine, but results may vary across machines due to how the optimization solvers work.
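
As a minimal sketch of how these options might be passed, the block below builds an options list from elements described above and, if the packages are installed, calls refine() on the same rhc_X sample used in the Examples. The particular values chosen are illustrative, not recommendations.

```r
# Options list using only elements described in the Details
opts <- list(
  solver    = "Rglpk",  # open-source solver; use "gurobi" if it is available
  criterion = "combo",  # combine the maximum SMD and the sum of SMDs
  wMax      = 5,        # weight on the maximum SMD (documented default)
  minsplit  = 10        # minimum treated and control units per refined stratum
)

# Run the refinement only if the required packages are installed
if (requireNamespace("optrefine", quietly = TRUE) &&
    requireNamespace("Rglpk", quietly = TRUE)) {
  library(optrefine)
  set.seed(15)  # makes results reproducible on the same machine
  samp <- sample(1:nrow(rhc_X), 400)
  cov_samp <- sample(1:26, 4)
  ref <- refine(X = rhc_X[samp, cov_samp], z = rhc_X[samp, "z"],
                options = opts)
}
```

Elements left out of the list fall back to the defaults described above.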

Value

Object of class "strat", which is a list object with the following components:

  • z: treatment vector

  • X: covariate matrix

  • base_strata: initial stratification

  • refined_strata: refined stratification

  • details: various details about the optimization that can be ignored in practice, but may be interesting:

    • valueIP, valueLP: scaled objective values of the integer solution (determined via randomized rounding, unless the integer option is set to TRUE) and of the linear programming solution, respectively

    • n_fracs: number of units with fractional LP solutions

    • rand_c_prop, rand_t_prop: proportions of the control and treated units in each stratum that were selected with randomness

    • pr: linear programming solution, with rows corresponding to the strata and columns to the units

    • criterion: criterion used in the optimization (see the details about the options for the optimization)

    • wMax: weight placed on the maximum standardized mean difference in the optimization (see the details about the options for the optimization)

    • X_std: standardized version of X
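
One way to inspect a returned object is to cross-tabulate refined_strata against z, for example to check stratum sizes against minsplit. The sketch below defines a small helper (count_by_stratum, a hypothetical name that is not part of the package) and applies it to a hand-made list that only mimics the documented component names; the stratum labels are made up and do not reflect the labels refine() produces.

```r
# Count treated and control units in each refined stratum.
# Works on any list with components z and refined_strata.
count_by_stratum <- function(obj) {
  table(stratum = obj$refined_strata, z = obj$z)
}

# Hand-made stand-in with the documented component names; NOT output of refine()
toy <- list(
  z              = c(1, 0, 1, 0, 1, 0),
  base_strata    = c(1, 1, 1, 1, 2, 2),
  refined_strata = c("a", "a", "b", "b", "c", "c")
)

# Rows are refined strata, columns are control (0) and treated (1) counts
counts <- count_by_stratum(toy)
```

With a real result, count_by_stratum(ref) would give the same kind of table, and min(counts) could then be compared with the minsplit option.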

Examples

# Choose 400 patients and 4 covariates to work with for the example
set.seed(15)
samp <- sample(1:nrow(rhc_X), 400)
cov_samp <- sample(1:26, 4)

# Let it create propensity score strata for you and then refine them
ref <- refine(X = rhc_X[samp, cov_samp], z = rhc_X[samp, "z"])

# Or, specify your own initial strata
ps <- prop_strat(z = rhc_X[samp, "z"],
                 X = rhc_X[samp, cov_samp], nstrata = 3)
ref <- refine(X = ps$X, z = ps$z, strata = ps$base_strata)

# Can just input the output of prop_strat() directly
ref <- refine(object = ps)


optrefine documentation built on April 19, 2023, 1:08 a.m.