getSamples: A function to extract small samples that maintain important...


Description

This function returns a sample drawn from the supplied population data whose distribution is similar to that of the population. It is called by guessStartval() to estimate initial values for numerical optimization procedures, but it can also be used directly to reduce the sample size so that computationally intensive models can be estimated on a representative sample of the full dataset. The function uses var.test() to compute an F test for the ratio of sample to population variance, and t.test() to compare their means.
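The two tests named above can be tried on a single numeric column. What follows is a minimal sketch, not the package's implementation: the helper name is_similar, the per-column comparison, and the rule "keep the sample when neither p-value falls below 1 - confidence" are assumptions made for illustration.

```r
# Hypothetical helper (not part of AutoGLM): compare one numeric sample
# against the population it was drawn from.
is_similar <- function(sample_x, population_x, confidence = 0.9) {
  alpha  <- 1 - confidence                            # assumed p-value cut-off
  p_var  <- var.test(sample_x, population_x)$p.value  # F test on the variance ratio
  p_mean <- t.test(sample_x, population_x)$p.value    # Welch t-test on the means
  # Keep the sample only if neither test rejects equality
  p_var >= alpha && p_mean >= alpha
}

set.seed(1)
population <- rnorm(10000, mean = 5, sd = 2)
draw <- sample(population, size = 0.25 * length(population))
is_similar(draw, population)
```

A real dataset has several columns, so a full check would presumably repeat this comparison for each numeric variable.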

Usage

getSamples(data, share = 0.25, confidence.alternative = 0.9,
  max.iter = 50, tracelevel = 1, memorymanagement = TRUE)

Arguments

data

The population data from which a sample needs to be taken.

share

The size of the sample in terms of the share of the population data. Defaults to .25.

confidence.alternative

The confidence level used in the F and t-tests, defined as the probability level at which the alternative hypothesis is accepted. At confidence.alternative = .9, less evidence is needed to accept the alternative hypothesis that the sample and the population differ than at confidence.alternative = .95, so .90 is the stricter setting.
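To make the direction of this setting concrete, the sketch below assumes the decision rule is "reject equality when the p-value is below 1 - confidence.alternative". That rule, the helper rejects_equality, and the p-value 0.07 are illustrative assumptions, not taken from the package source.

```r
# Assumed decision rule: reject equality when p < 1 - confidence.alternative
rejects_equality <- function(p, confidence.alternative) {
  p < (1 - confidence.alternative)
}

p_value <- 0.07  # hypothetical p-value from an F or t-test

rejects_equality(p_value, 0.90)  # TRUE:  0.07 < 0.10, the draw is discarded
rejects_equality(p_value, 0.95)  # FALSE: 0.07 >= 0.05, the draw is kept
```

Under this reading, the same draw passes at .95 but fails at .90, which is why the lower value is the stricter one.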

max.iter

The maximum number of draws to be taken. The program stops either when a suitable sample is found or when max.iter is reached.
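The stopping behaviour can be pictured as a bounded redraw loop. This is a hedged sketch rather than the AutoGLM source: draw_until_accepted, its accept argument, and the NULL return value are hypothetical (the documented function reports a message when no suitable sample is found).

```r
# Hypothetical sketch of a bounded redraw loop (not the AutoGLM source).
draw_until_accepted <- function(x, share = 0.25, max.iter = 50, accept) {
  n <- ceiling(share * length(x))
  for (i in seq_len(max.iter)) {
    candidate <- sample(x, size = n)
    if (accept(candidate, x)) {
      return(list(sample = candidate, iterations = i))  # suitable sample found
    }
  }
  NULL  # max.iter reached without a suitable sample
}

set.seed(42)
x <- rnorm(5000)
# Stand-in acceptance rule: the t-test does not reject equal means at the 10% level
same_mean <- function(s, pop) t.test(s, pop)$p.value >= 0.1
result <- draw_until_accepted(x, share = 0.1, max.iter = 20, accept = same_mean)
```

Raising max.iter gives more chances to find an acceptable draw at the cost of more test evaluations.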

tracelevel

Similar to a verbose setting: should information be printed during execution? Defaults to 1 for printing; set to 0 for no printing.

memorymanagement

TRUE/FALSE indicating whether garbage collection should be forced using tgc(). Defaults to TRUE. Recommended setting for large datasets.

Value

A sample of the population dataset that has significantly similar means and variances, or a message indicating that no suitable dat

Examples

getSamples(data = ITdata, share = 0.025, confidence.alternative = 0.90, max.iter = 100)

BPJandree/AutoGLM documentation built on May 5, 2019, 10:25 a.m.