split_pilot_set: Split data into pilot and analysis sets


View source: R/split_pilot_set.R

Description

Given a data set and some parameters about how to split it, this function partitions the data accordingly and returns the result as a list containing the analysis_set and the pilot_set. It is exported for the convenience of the user, although in practice this step is almost always done through auto_stratify.

Usage

split_pilot_set(data, treat, pilot_fraction = 0.1, pilot_sample = NULL,
  group_by_covariates = NULL)

Arguments

data

data.frame with observations as rows, features as columns

treat

string giving the name of column designating treatment assignment

pilot_fraction

numeric between 0 and 1 giving the proportion of controls to be allotted for building the prognostic score (default = 0.1)

pilot_sample

a data.frame of held-aside samples for building the prognostic score model (default = NULL)

group_by_covariates

character vector giving the names of covariates to be grouped by (optional). If specified, the pilot set will be sampled in a stratified manner, so that the composition of the pilot set reflects the composition of the whole data set in terms of these covariates. The specified covariates must be categorical.
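To illustrate what stratified sampling by group_by_covariates means, here is a minimal base R sketch. It is not the stratamatch implementation: the helper name stratified_pilot_split, the toy data, and the assumption that controls are coded 0 in the treatment column are all hypothetical. It also assumes the analysis set is everything not drawn into the pilot set.

```r
# Hypothetical helper, NOT part of stratamatch: draws a pilot set from the
# controls so that each covariate group contributes the same fraction.
stratified_pilot_split <- function(data, treat, pilot_fraction = 0.1,
                                   group_by_covariates) {
  # Row indices of controls (assumption: controls are coded 0)
  control_rows <- which(data[[treat]] == 0)

  # Group the control rows by every observed combination of the covariates
  groups <- split(control_rows,
                  data[control_rows, group_by_covariates, drop = FALSE],
                  drop = TRUE)

  # Sample the same fraction of rows within each group
  pilot_rows <- unlist(lapply(groups, function(rows) {
    n_pick <- round(length(rows) * pilot_fraction)
    rows[sample.int(length(rows), n_pick)]
  }), use.names = FALSE)

  # Everything not sampled (treated rows and remaining controls) is kept
  # for analysis
  analysis_rows <- setdiff(seq_len(nrow(data)), pilot_rows)
  list(analysis_set = data[analysis_rows, , drop = FALSE],
       pilot_set = data[pilot_rows, , drop = FALSE])
}

# Toy data: 200 controls and 50 treated, with a categorical covariate
set.seed(1)
toy <- data.frame(
  treat = rep(c(0, 1), times = c(200, 50)),
  sex = rep(c("F", "M"), length.out = 250),
  outcome = rnorm(250)
)

res <- stratified_pilot_split(toy, "treat", pilot_fraction = 0.2,
                              group_by_covariates = "sex")
table(res$pilot_set$sex)
```

Because the fraction is applied within each group, the pilot set keeps the same balance of the grouping covariate as the controls it was drawn from, which a simple random sample would only match approximately.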

Value

a list with elements analysis_set and pilot_set

Examples

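library(stratamatch)

# Simulate data, then set aside 20% of the controls as the pilot set
dat <- make_sample_data()
splt <- split_pilot_set(dat, "treat", 0.2)
# Stratify the analysis set, fitting the prognostic model on the pilot set
a.strat <- auto_stratify(splt$analysis_set, "treat", outcome ~ X1,
  pilot_sample = splt$pilot_set)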

raikens1/stratamatch documentation built on Aug. 6, 2020, 7:29 a.m.