mvcs: Multivariate Continuous Stratification

View source: R/mvcs.r

Description

Selects a subsample of a data frame in which the distribution of values of a set of variables matches that of the input data frame.

Usage

mvcs(data, number, variables, iter = 200)

Arguments

data

The data frame to be subsampled.

number

The number of rows of data to be returned, i.e. the sample size.

variables

A vector containing the names of variables to be used for the stratification.

iter

The number of iterations to be performed to find the optimal subset.

Details

This function uses a Cramér test to select rows of an input data frame such that the distribution of values across multiple variables in the selection closely matches that of the full input data. For example, you may have a large input data set of geographic points in which elevation is skewed towards low values and rainfall is skewed towards high values. This function returns a subset of the data, of the chosen size, in which rainfall and elevation have distributions matching those of the input data.

This function works only on continuous variables, for which a statistical distribution can be calculated. For stratification within factors, use mvs.
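
The selection idea described above can be illustrated with a minimal base R sketch. It is not the source of mvcs: the helpers energy_discrepancy and subsample_sketch are hypothetical names, the Cramér-type (energy distance) discrepancy is computed by hand rather than through a test function, and the random-search loop is an assumption about how iter might be used.

```r
# Hypothetical helper: Cramér-type (energy distance) discrepancy between
# two numeric samples; smaller values mean more similar distributions.
energy_discrepancy <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  m <- nrow(x)
  n <- nrow(y)
  d <- as.matrix(dist(rbind(x, y)))  # all pairwise Euclidean distances
  ix <- seq_len(m)
  iy <- m + seq_len(n)
  2 * mean(d[ix, iy]) - mean(d[ix, ix]) - mean(d[iy, iy])
}

# Hypothetical stand-in for the search (an assumption, not the mvcs code):
# draw `iter` random subsets and keep the one closest to the full data.
subsample_sketch <- function(data, number, variables, iter = 200) {
  full <- scale(data[, variables, drop = FALSE])  # common scale for all variables
  best_rows <- NULL
  best_score <- Inf
  for (i in seq_len(iter)) {
    rows <- sample(nrow(data), number)
    score <- energy_discrepancy(full[rows, , drop = FALSE], full)
    if (score < best_score) {
      best_score <- score
      best_rows <- rows
    }
  }
  data[best_rows, , drop = FALSE]
}

# Simulated geographic points: elevation skewed low, rainfall skewed high.
set.seed(1)
pts <- data.frame(
  elevation = rexp(300, rate = 1 / 200),
  rainfall  = 2000 - rexp(300, rate = 1 / 300)
)
vars <- c("elevation", "rainfall")
sub <- subsample_sketch(pts, number = 30, variables = vars, iter = 50)
summary(sub[, vars])
```

Scaling the variables before measuring distances is a design choice of this sketch, made so that no single variable dominates the discrepancy; whether mvcs does the same is not stated here.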

Value

A data frame.
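
A call might look like the following sketch. The simulated data and the object names pts, vars and sub are made up for illustration, and the example assumes that the ecbtools package is installed and exports mvcs; the call is skipped otherwise.

```r
# Simulated input: 300 points with two skewed continuous variables.
set.seed(42)
pts <- data.frame(
  elevation = rexp(300, rate = 1 / 200),
  rainfall  = 2000 - rexp(300, rate = 1 / 300)
)
vars <- c("elevation", "rainfall")

# Only run the call when the package is available.
if (requireNamespace("ecbtools", quietly = TRUE)) {
  # Request a 30-row subsample stratified on both variables.
  sub <- ecbtools::mvcs(pts, number = 30, variables = vars, iter = 200)
  print(summary(sub[, vars]))
} else {
  message("ecbtools is not installed; skipping the mvcs() call.")
}
```

Comparing summary(sub[, vars]) with summary(pts[, vars]) gives a quick visual check that the subsample follows the input distributions.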

Author(s)

Grant Williamson

See Also

mvs


ozjimbob/ecbtools documentation built on Jan. 18, 2021, 7:39 p.m.