setup_stratify: Setup function for stratified sampling

View source: R/sample-split.R

setup_stratify R Documentation

Setup function for stratified sampling

Description

This function controls whether stratified sample splitting is performed. To use ordinary random sampling (the default), do not pass any arguments to this function. To use stratified sampling, use this function to pass arguments to stratified() in the package "splitstackshape". In that case, the prop_aux argument of GenericML() has no effect, because the number of samples in the auxiliary set is determined by the size argument of stratified().
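To illustrate what a proportional size means for the auxiliary set, the following base-R sketch draws half of the observations from each combination of two group indicators. It does not call stratified() and is not the package's implementation; the helper name sample_per_stratum and the rounding rule are assumptions made for illustration only.

```r
set.seed(1)
n <- 500
groups <- data.frame(group1 = rbinom(n, 1, 0.2),
                     group2 = rbinom(n, 1, 0.3))

## hypothetical helper: sample a proportion `size` of the row indices in each stratum
sample_per_stratum <- function(data, group, size) {
  ## one stratum per observed combination of the grouping variables
  strata <- split(seq_len(nrow(data)), interaction(data[group], drop = TRUE))
  unlist(lapply(strata, function(idx) {
    k <- round(length(idx) * size)   # rounding rule is an assumption
    idx[sample.int(length(idx), k)]  # draw k indices without replacement
  }), use.names = FALSE)
}

## row indices that would form the auxiliary set in this sketch
aux_idx <- sample_per_stratum(groups, c("group1", "group2"), size = 0.5)
length(aux_idx)
```

The number of sampled rows follows from size and the stratum counts alone, which is why a separately specified proportion such as prop_aux would be redundant here.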

Usage

setup_stratify(...)

Arguments

...

Named objects that are passed as arguments to stratified(). If empty (default), ordinary random sampling is performed.

Details

The output of this setup function is intended to be used as the stratify argument of GenericML(). If arguments are passed to stratified() via this function, they must include the objects that stratified() in the "splitstackshape" package requires: indt, group, and size (see the documentation of stratified() for details). If any of these objects is missing, an error is thrown.
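The requirement above can be checked with a small sketch. It assumes the GenericML package is installed and relies only on the behavior described in this section; the exact wording of the error message is not assumed, and the toy data are illustrative.

```r
library(GenericML)

set.seed(1)
groups <- data.frame(group1 = rbinom(100, 1, 0.2),
                     group2 = rbinom(100, 1, 0.3))

## all three required arguments are supplied
ok <- setup_stratify(indt = groups, group = c("group1", "group2"), size = 0.5)
names(ok)

## 'size' is omitted: according to the Details, this should raise an error,
## which is caught here and stored as a character string
res <- tryCatch(
  setup_stratify(indt = groups, group = c("group1", "group2")),
  error = function(e) conditionMessage(e)
)
res
```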

Value

A list of named objects (possibly empty) specifying the stratified sampling strategy. If empty, ordinary random sampling is performed instead of stratified sampling.

See Also

stratified(), GenericML()

Examples

## sample data of group membership (with two groups)
set.seed(1)
n <- 500
groups <- data.frame(group1 = rbinom(n, 1, 0.2),
                     group2 = rbinom(n, 1, 0.3))

## suppose we want both groups to be present in each stratum...
group <- c("group1", "group2")

## ... and that the size of each stratum equals half of the observations per group
size <- 0.5

## obtain a list of arguments that will be passed to splitstackshape::stratified()
setup_stratify(indt = groups, group = group, size = size)

## if no stratified sampling shall be used, do not pass anything
setup_stratify()


GenericML documentation built on June 18, 2022, 9:09 a.m.