hdps_screen: hdps_screen

View source: R/hdps_screen.R

Description

The hdps_screen function performs part of step 2 (identify_covariates), steps 3 (assess_recurrence) and 4 (prioritize_covariates) of the HDPS algorithm (Schneeweiss et al., 2009).

Usage

hdps_screen(outcome, treatment, covars, dimension_names = NULL,
  dimension_indexes = NULL, keep_n_per_dimension = 200,
  keep_k_total = 500, verbose = FALSE, debug = FALSE)

Arguments

outcome

binary vector of outcomes

treatment

binary vector of treatments

covars

matrix or data.frame of binary covariates.

dimension_names

A character vector of patterns to match against the column names of covars to split columns into dimension groups. See details.

dimension_indexes

A list of vectors of column indexes corresponding to dimension groups. See details. Cannot be specified with dimension_names.

keep_n_per_dimension

The maximum number of covariates to be kept per dimension by identify_covariates.

keep_k_total

Total number of covariates to keep after expanding by assess_recurrence and ordering by prioritize_covariates.

verbose

Should verbose output be printed?

debug

Enables some debugging checks, which slow things down but may yield useful warnings or errors.

Details

Step 2. Columns of covars are split by data dimension (as defined in Schneeweiss et al. (2009)) and filtered by identify_covariates.

Dimensions can be specified in two ways. If dimension_names is used, colnames(covars) is searched with grep for each value of dimension_names. If a column name matches more than one pattern, an error is thrown. If a column name is not matched by any pattern, a warning is issued and that column is ignored. For example, suppose the column names of covars are c("drug_1", "drug_2", "proc_1", "proc_2"). Then dimension_names <- c("drug", "proc") would split covars into two dimensions, one for the drug columns and one for the proc columns.
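The matching rules above can be sketched in base R. This is an illustration only, not the package's internal code; the extra column "lab_1" and the names matches, n_matched and dimension_groups are assumptions added to show the unmatched-column case.

```r
# Illustrative sketch of pattern-based dimension splitting (not package code).
col_names <- c("drug_1", "drug_2", "proc_1", "proc_2", "lab_1")
dimension_names <- c("drug", "proc")

# One logical column per pattern: TRUE where the column name matches.
matches <- sapply(dimension_names, function(pat) grepl(pat, col_names))
rownames(matches) <- col_names

# Count how many patterns each column name matched.
n_matched <- rowSums(matches)
if (any(n_matched > 1)) {
  stop("Some column names match more than one pattern")
}
if (any(n_matched == 0)) {
  warning("Ignoring unmatched columns: ",
          paste(col_names[n_matched == 0], collapse = ", "))
}

# Column indexes belonging to each dimension.
dimension_groups <- lapply(dimension_names, function(pat) which(matches[, pat]))
names(dimension_groups) <- dimension_names
```

Here "lab_1" matches neither pattern, so it triggers the warning branch and is left out of both groups.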

Dimensions can also be specified with dimension_indexes, which should be a list containing either the column indexes or the column names for each dimension.

If neither dimension_names nor dimension_indexes is specified, all covariates are treated as one dimension.
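As a rough illustration of the list shapes described above, the following base R sketch builds the same two dimensions by position and by name, plus the single-dimension default. The variable names are hypothetical, and the conversion with match() is only an assumption about how names relate to positions, not a description of what hdps_screen does internally.

```r
# Hypothetical column names for a small covariate matrix.
col_names <- c("drug_1", "drug_2", "proc_1", "proc_2")

# Dimensions given by column position.
by_index <- list(1:2, 3:4)

# The same dimensions given by column name.
by_name <- list(c("drug_1", "drug_2"), c("proc_1", "proc_2"))

# match() maps each name to its position in col_names.
as_index <- lapply(by_name, match, table = col_names)

# With no dimensions specified, every column falls in one group.
single_dimension <- list(seq_along(col_names))
```

A list such as by_index or by_name is what would be passed as the dimension_indexes argument.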

Step 3. After filtering, remaining covariates are expanded by assess_recurrence.

If, at this point, the number of expanded covariates is less than keep_k_total, all expanded covariates are returned.
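For intuition, recurrence expansion in Schneeweiss et al. (2009) turns each code's occurrence count into indicators for occurring at least once, at least the median number of times, and at least the 75th percentile number of times. The sketch below is a minimal base R version under the assumptions that the quantiles are taken over positive counts and that coinciding cutoffs are collapsed; expand_recurrence is a hypothetical helper, and assess_recurrence may differ in these details.

```r
# Hypothetical helper: expand one count-valued covariate into indicators.
expand_recurrence <- function(counts) {
  positive <- counts[counts > 0]
  if (length(positive) == 0) {
    # The code never occurs, so there is nothing to expand.
    return(matrix(0L, nrow = length(counts), ncol = 0))
  }
  # Cutoffs: once, median and 75th percentile of the positive counts
  # (assumption), with duplicated cutoffs dropped.
  cutoffs <- unique(c(1, quantile(positive, probs = c(0.5, 0.75), names = FALSE)))
  # One 0/1 column per cutoff: did the count reach the cutoff?
  out <- outer(counts, cutoffs, ">=") + 0L
  colnames(out) <- paste0("ge_", cutoffs)
  out
}

counts <- c(0, 0, 1, 1, 2, 3, 5, 0, 1, 4)
expanded <- expand_recurrence(counts)
```

Because one input column can yield up to three indicator columns, the number of expanded covariates can exceed the number kept by the filtering step.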

Step 4. Expanded covariates are ordered with prioritize_covariates.
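In Schneeweiss et al. (2009), covariates are ranked by the absolute log of the Bross bias multiplier, which combines the covariate's prevalence among treated and untreated subjects with its association with the outcome. The sketch below assumes prioritize_covariates follows that formula; bross_priority and the simulated data are hypothetical and are not part of the package.

```r
# Hypothetical helper: |log(bias multiplier)| for one binary covariate.
bross_priority <- function(outcome, treatment, covariate) {
  p_c1 <- mean(covariate[treatment == 1])  # prevalence among treated
  p_c0 <- mean(covariate[treatment == 0])  # prevalence among untreated
  # Relative risk of the outcome, covariate present vs. absent.
  rr_cd <- mean(outcome[covariate == 1]) / mean(outcome[covariate == 0])
  if (!is.finite(rr_cd) || rr_cd == 0) return(NA_real_)
  if (rr_cd < 1) rr_cd <- 1 / rr_cd        # the paper inverts RR below 1
  bias <- (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)
  abs(log(bias))
}

# Simulated data: one true confounder and one unrelated covariate.
set.seed(1)
n <- 2000
confounder <- rbinom(n, 1, 0.3)
noise <- rbinom(n, 1, 0.3)
treatment <- rbinom(n, 1, ifelse(confounder == 1, 0.7, 0.3))
outcome <- rbinom(n, 1, ifelse(confounder == 1, 0.3, 0.1))

candidates <- cbind(confounder = confounder, noise = noise)
scores <- apply(candidates, 2, bross_priority,
                outcome = outcome, treatment = treatment)

# Order by priority and keep the top keep_k_total covariates.
ranked <- sort(scores, decreasing = TRUE)
keep_k_total <- 1
kept <- names(ranked)[seq_len(min(keep_k_total, length(ranked)))]
```

Covariates that are imbalanced across treatment groups and associated with the outcome receive the highest scores, so they survive the keep_k_total cut first.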

Step 5. Step 5 can be performed with predict.hdps_covars.

Value

An object of class hdps_covars

Author(s)

Sam Lendle

References

Schneeweiss, S., Rassen, J. A., Glynn, R. J., Avorn, J., Mogun, H., & Brookhart, M. A. (2009). High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology (Cambridge, Mass.), 20(4), 512.

See Also

predict.hdps_covars

Examples

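# Simulate 1000 subjects with 10000 candidate covariates.
set.seed(123)
n <- 1000
p <- 10000
out <- rbinom(n, 1, 0.05)   # binary outcome
trt <- rbinom(n, 1, 0.5)    # binary treatment
# Each covariate takes values 0 to 3.
covars <- matrix(rbinom(n*p, 3, 0.05), n)
# First half of the columns are "drug" codes, second half "proc" codes.
colnames(covars) <- c(paste("drug", 1:(p/2), sep="_"),
                      paste("proc", 1:(p/2), sep="_"))

# Patterns used to split the columns into two dimensions.
dimension_names <- c("drug", "proc")

# Steps 2 to 4: filter, expand and prioritize covariates.
screened_covars_fit <- hdps_screen(out, trt, covars,
                                   dimension_names = dimension_names,
                                   keep_n_per_dimension = 400,
                                   keep_k_total = 200,
                                   verbose=TRUE)

# Step 5: obtain the screened covariates.
screened_covars <- predict(screened_covars_fit)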

lendle/hdps documentation built on Aug. 18, 2017, 12:11 a.m.