identify_covariates: identify_covariates
In lendle/hdps: High-dimensional propensity score algorithm


Given a matrix of covariates, identify_covariates returns the top keep_n_covars columns or the indexes of those columns.

identify_covariates(covars, keep_n_covars = 200, indexes = FALSE)

`covars`	a matrix of covariates, or an object that can be coerced to one with `as.matrix`
`keep_n_covars`	number of covariates to keep
`indexes`	If `TRUE`, column indexes are returned; if `FALSE`, a subset of `covars` is returned. Defaults to `FALSE`.

Columns are sorted in descending order of min(prevalence, 1 - prevalence), where prevalence is the proportion of non-zero values in a given column.
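The ranking rule can be illustrated with a few lines of base R. This is a minimal sketch of the computation as described here, not the package's actual source; the toy matrix and the variable names (prevalence, score, ranked) are assumptions made for illustration.

```r
# Toy binary covariate matrix: 10 rows, 4 columns (illustrative data only).
covars <- cbind(
  a = c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0),  # 1 of 10 non-zero
  b = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0),  # 5 of 10 non-zero
  c = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0),  # 8 of 10 non-zero
  d = c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)   # 3 of 10 non-zero
)

# Proportion of non-zero values in each column.
prevalence <- colMeans(covars != 0)

# Ranking score: how far the prevalence is from the nearer of 0 and 1.
score <- pmin(prevalence, 1 - prevalence)

# Column indexes in descending order of score.
ranked <- order(score, decreasing = TRUE)
```

In this toy example column b (prevalence 0.5) ranks first and column a (prevalence 0.1) ranks last; column c, although mostly non-zero, ranks below d because its score is 1 - 0.8.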

If indexes==TRUE, a vector of the top keep_n_covars column indexes is returned.

If indexes==FALSE, a matrix of covariates is returned whose columns are the top keep_n_covars columns of covars. Columns are in their original order. If, in addition, keep_n_covars >= ncol(covars), the function returns immediately without ranking columns by prevalence, as ranking is unnecessary.
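The two return modes can be sketched with a small stand-in function. identify_covariates_sketch is a hypothetical name and a reimplementation from the description on this page, not the code shipped in hdps. In particular, returning the indexes in rank order when indexes = TRUE is an assumption; the package may order them differently.

```r
# Hypothetical stand-in written from the documented behaviour; not the hdps source.
identify_covariates_sketch <- function(covars, keep_n_covars = 200, indexes = FALSE) {
  covars <- as.matrix(covars)

  # Shortcut: nothing to rank when a matrix is requested and every column is kept.
  if (!indexes && keep_n_covars >= ncol(covars)) return(covars)

  prevalence <- colMeans(covars != 0)
  score <- pmin(prevalence, 1 - prevalence)
  n_keep <- min(keep_n_covars, ncol(covars))
  top <- order(score, decreasing = TRUE)[seq_len(n_keep)]

  # Assumption: indexes are returned in rank order.
  if (indexes) return(top)

  # Sorting the indexes keeps the selected columns in their original order.
  covars[, sort(top), drop = FALSE]
}

# Illustrative data: non-zero counts of 1, 5, 8 and 3 out of 10 rows.
covars <- cbind(
  a = rep(c(1, 0), times = c(1, 9)),
  b = rep(c(1, 0), times = c(5, 5)),
  c = rep(c(1, 0), times = c(8, 2)),
  d = rep(c(1, 0), times = c(3, 7))
)

idx <- identify_covariates_sketch(covars, keep_n_covars = 3, indexes = TRUE)
sub <- identify_covariates_sketch(covars, keep_n_covars = 3)
all_cols <- identify_covariates_sketch(covars, keep_n_covars = 10)
```

With this data, sub contains columns b, c and d in that order, even though d outranks c, and all_cols is covars unchanged.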

Differences from Schneeweiss et al. (2009):

Covariates that have fewer than 100 non-zero values are not automatically dropped. In typical data, however, covariates with more than 100 non-zero values will be ranked higher than those with fewer than 100 anyway.
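If a hard cutoff at 100 non-zero values is wanted, it could be applied to the matrix before calling identify_covariates. The following is a minimal sketch in base R under that assumption; the data and the names n_nonzero and filtered are illustrative and not part of the package.

```r
# Illustrative data: 1000 rows, three binary covariates with
# 5, 150 and 400 non-zero values respectively.
n <- 1000
covars <- cbind(
  rare     = rep(c(1, 0), times = c(5, n - 5)),
  moderate = rep(c(1, 0), times = c(150, n - 150)),
  common   = rep(c(1, 0), times = c(400, n - 400))
)

# Count the non-zero values in each column.
n_nonzero <- colSums(covars != 0)

# Drop columns with fewer than 100 non-zero values.
filtered <- covars[, n_nonzero >= 100, drop = FALSE]
```

Here filtered keeps the moderate and common columns and could then be passed as the covars argument. Without the filter, the rare column would still rank last, since its score of 0.005 is the smallest of the three.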

Indexes of identified columns or a subset of covars

Sam Lendle

Schneeweiss, S., Rassen, J. A., Glynn, R. J., Avorn, J., Mogun, H., & Brookhart, M. A. (2009). High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology (Cambridge, Mass.), 20(4), 512.

lendle/hdps documentation built on May 9, 2019, 8:34 a.m.