sampleBy: Returns a stratified sample without replacement


View source: R/stats.R

Description

Returns a stratified sample without replacement, using the sampling fraction given for each stratum.

Usage

sampleBy(x, col, fractions, seed)

Arguments

x

A spark_tbl

col

The column that defines the strata

fractions

A named list giving the sampling fraction for each stratum. A stratum that is not listed is treated as having a fraction of zero, so none of its rows are sampled.

seed

The random seed used for sampling

Value

A new spark_tbl that represents the stratified sample

Note

sampleBy since 1.6.0

See Also

Other stat functions: approxQuantile(), corr(), covariance(), crosstab(), freqItems()

Examples

## Not run: 
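df <- read.json("/path/to/file.json")
# Illustrative fractions; the stratum names "0" and "1" are assumed values of `key`
fractions <- list("0" = 0.1, "1" = 0.2)
sample <- sampleBy(df, "key", fractions, 36)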

## End(Not run)
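
The semantics of the fractions argument can be tried locally without a Spark session. The following is a minimal base-R sketch, not tidyspark's implementation: sample_by_local is a hypothetical helper, the toy data frame is made up, and the sketch assumes the sample is drawn row by row, with each row kept when a uniform draw falls below its stratum's fraction. Under that assumption the per-stratum sample sizes are approximate rather than exact.

```r
# Local base-R illustration of per-stratum sampling without replacement.
# `sample_by_local` is a hypothetical helper, not part of tidyspark.
sample_by_local <- function(df, col, fractions, seed) {
  set.seed(seed)
  strata <- as.character(df[[col]])
  # Look up each row's fraction; strata missing from `fractions` get 0.
  frac <- vapply(strata, function(s) {
    f <- fractions[[s]]
    if (is.null(f)) 0 else f
  }, numeric(1), USE.NAMES = FALSE)
  # Keep a row when its uniform draw falls below its stratum's fraction.
  df[runif(nrow(df)) < frac, , drop = FALSE]
}

# Toy data: three strata of 100 rows each; stratum "c" is left unspecified.
toy <- data.frame(key = rep(c("a", "b", "c"), each = 100),
                  value = seq_len(300))
out <- sample_by_local(toy, "key", list(a = 0.5, b = 1), seed = 36)
table(out$key)
```

Here every row of stratum "b" is kept, roughly half of stratum "a" is kept, and stratum "c" is dropped entirely because it has no entry in the list.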

danzafar/tidyspark documentation built on Sept. 30, 2020, 12:19 p.m.