Class.sample: Samples along the class labels

View source: R/class_sample.r

Class.sample R Documentation

Samples along the class labels

Description

Stratified sampling: sample separately within each class

Usage

Class.sample(lbls, nsam=NULL, prop=NULL, uniform=FALSE)

Arguments

lbls

Vector of labels convertible to a factor

nsam

Number of samples to take from each class

prop

Proportion of samples to take from each class

uniform

If TRUE, sample uniformly (every n-th member of the class) instead of randomly

Details

'Class.sample()' splits the labels into groups by class and samples each group separately. If 'prop' is specified, the number of samples for each class is calculated separately from this value. If both 'nsam' and 'prop' are specified, 'prop' takes precedence.

The uniform method samples every n-th member of the class to reach the desired sample size.

If the sample size is larger than the class size, the whole class is sampled.
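The rules above can be illustrated with a minimal base R sketch. This is not the package source: the helper name 'class_sample_sketch', the rounding of 'prop', and the way evenly spaced members are chosen are all assumptions made for illustration only.

```r
# Hypothetical helper (NOT the shipunov implementation): stratified sampling
# built on ave(), following the rules described in Details.
class_sample_sketch <- function(lbls, nsam = NULL, prop = NULL, uniform = FALSE) {
  stopifnot(!is.null(nsam) || !is.null(prop))
  lbls <- as.factor(lbls)
  pick <- function(idx) {
    n <- length(idx)
    # 'prop' takes precedence over 'nsam' when both are given
    size <- if (!is.null(prop)) round(n * prop) else nsam
    # a request larger than the class selects the whole class
    size <- min(size, n)
    chosen <- if (uniform) {
      # evenly spaced positions within the class (assumed spacing rule)
      unique(round(seq(1, n, length.out = size)))
    } else {
      sample.int(n, size)
    }
    seq_len(n) %in% chosen
  }
  # ave() applies 'pick' within each class and keeps the original order
  as.logical(ave(seq_along(lbls), lbls, FUN = pick))
}

set.seed(1)  # only to make the illustration reproducible
table(iris$Species, class_sample_sketch(iris$Species, nsam = 5))
table(iris$Species, class_sample_sketch(iris$Species, nsam = 100))
```

In the second call the requested size exceeds the class size, so every member of every class is selected.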

'Class.sample()' uses ave() internally and can be easily extended, for example to k-fold sampling:

ave(seq_along(lbls), lbls, FUN=function(.x) cut(sample(length(.x)), breaks=k, labels=FALSE))
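The one-liner above leaves 'lbls' and 'k' undefined. A self-contained version might look like the following sketch, where the choice of the iris data, k = 5, and the seed are assumptions made purely for illustration.

```r
set.seed(42)           # only to make the illustration reproducible
lbls <- iris$Species   # assumed example labels
k <- 5                 # assumed number of folds

# Within each class, shuffle the positions and cut them into k groups;
# labels=FALSE makes cut() return integer fold codes.
fold <- ave(seq_along(lbls), lbls,
            FUN = function(.x) cut(sample(length(.x)), breaks = k, labels = FALSE))

table(lbls, fold)      # class-by-fold counts

# Use fold 1 as the testing subset and the remaining folds for training
iris.tst <- iris[fold == 1, ]
iris.trn <- iris[fold != 1, ]
```

Each fold can serve in turn as the testing subset by replacing 1 with any value from 1 to k.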

Value

Logical vector of the same length as 'lbls'

Author(s)

Alexey Shipunov

Examples


(sam <- Class.sample(iris$Species, nsam=5))
iris.trn <- iris[sam, ]
iris.tst <- iris[!sam, ]

(sample1 <- Class.sample(iris$Species, nsam=10))
table(iris$Species, sample1)
(sample2 <- Class.sample(iris$Species, prop=0.2))
table(iris$Species, sample2)
(sample3 <- Class.sample(iris$Species, nsam=10, uniform=TRUE))
table(iris$Species, sample3)
(sample4 <- Class.sample(iris$Species, prop=0.2, uniform=TRUE))
table(iris$Species, sample4)

shipunov documentation built on Feb. 16, 2023, 9:05 p.m.