subsample: subsample

View source: R/subsample.R

Description

Internal function to execute the subsampling component of the stochastic stagewise approach. If a user provides a stochastic value between 0 and 1, it is assumed that some proportion of subsampling is desired. The samplingDistCalculation function calculates the distribution of the clusters and the subsample function uses that distribution to draw the actual subsample.

Usage

subsample(sampleDist, sampleSize, withReplacement, clusterIDs, clusterID)

Arguments

sampleDist

A vector whose length is equal to the number of clusters that indicates the probability of sampling each cluster

sampleSize

A scalar value indicating how large a subsample is being drawn

withReplacement

A logical value indicating whether the subsampling is being done with or without replacement

clusterIDs

A vector of all of the UNIQUE cluster IDs

clusterID

A vector of length equal to the number of observations indicating which cluster each observation is in

Value

A list with two elements: subSampleIndicator, which indicates which observations are in the current subsample, and clusterIDCurr, which indicates the cluster IDs for the subsample.

Note

Internal function.

While most of the subsample can be determined from subSampleIndicator, the clusterIDCurr value has to be constructed inside the subsample function, because the cluster IDs are handled differently depending on whether we are sampling with or without replacement.
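
The following is a minimal sketch of why the cluster IDs need different handling in the two cases. It is not the sgee source. The function name subsampleSketch is hypothetical, and the sketch assumes that subSampleIndicator holds row indices and that sampleSize counts clusters rather than observations; the package may do either differently. When sampling with replacement, the same cluster can be drawn more than once, so the sketch relabels each draw to keep the repeated copies distinct.

```r
# Hypothetical re-implementation for illustration only; not the sgee code.
subsampleSketch <- function(sampleDist, sampleSize, withReplacement,
                            clusterIDs, clusterID) {
    # Draw clusters according to the supplied sampling distribution.
    # Sampling positions (sample.int) avoids the special case in sample()
    # when clusterIDs is a single number.
    drawn <- clusterIDs[sample.int(length(clusterIDs), size = sampleSize,
                                   replace = withReplacement,
                                   prob = sampleDist)]

    if (withReplacement) {
        # A cluster may be drawn several times: include its rows once per
        # draw and give each draw its own label (assumed relabeling scheme).
        rowsPerDraw <- lapply(drawn, function(id) which(clusterID == id))
        subSampleIndicator <- unlist(rowsPerDraw)
        clusterIDCurr <- rep(seq_along(drawn), times = lengths(rowsPerDraw))
    } else {
        # Each cluster appears at most once: the original IDs can be reused.
        subSampleIndicator <- which(clusterID %in% drawn)
        clusterIDCurr <- clusterID[subSampleIndicator]
    }

    list(subSampleIndicator = subSampleIndicator,
         clusterIDCurr = clusterIDCurr)
}

# Toy data: six observations in three clusters.
set.seed(1)
clusterID <- c(1, 1, 2, 2, 2, 3)
clusterIDs <- unique(clusterID)

out <- subsampleSketch(rep(1 / 3, 3), sampleSize = 2,
                       withReplacement = FALSE,
                       clusterIDs = clusterIDs, clusterID = clusterID)

outR <- subsampleSketch(rep(1 / 3, 3), sampleSize = 2,
                        withReplacement = TRUE,
                        clusterIDs = clusterIDs, clusterID = clusterID)
```

Without replacement, clusterIDCurr in this sketch is just the original cluster ID of each selected row. With replacement, it is a fresh label per draw, so a cluster drawn twice contributes two separate clusters to the subsample.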

Author(s)

Gregory Vaughan


sgee documentation built on May 1, 2019, 7:10 p.m.